fedora-meeting
20:00:42 <mmcgrath> #startmeeting Infrastructure
20:00:42 <zodbot> Meeting started Thu May 27 20:00:42 2010 UTC.  The chair is mmcgrath. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:42 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
20:00:45 <mmcgrath> #topic Who's here?
20:00:48 * ricky 
20:00:51 * nirik is lurking around.
20:00:52 <smooge> here
20:01:15 * abadger1999 here
20:01:33 * sgallagh lurking
20:02:32 <mmcgrath> Ok, let's get started
20:02:37 <mmcgrath> #topic Infrastructure Fedora 13 release
20:03:07 <mmcgrath> #link https://fedorahosted.org/fedora-infrastructure/report/9
20:03:13 <mmcgrath> let's go through and make sure these are all closed
20:03:17 <mmcgrath> .ticket 2137
20:03:18 <zodbot> mmcgrath: #2137 (New website) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2137
20:03:23 <mmcgrath> ricky: mind closing that one?  sijis has it currently
20:03:43 <ricky> Closed
20:03:56 <mmcgrath> .ticket 2138
20:03:57 <zodbot> mmcgrath: #2138 (Verify Mirror Space) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2138
20:04:00 <mmcgrath> smooge: mind closing that one?
20:04:02 <mmcgrath> .ticket 2139
20:04:03 <zodbot> mmcgrath: #2139 (Release day ticket.) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2139
20:04:07 <mmcgrath> this one's mine, closing
20:04:08 <smooge> done
20:04:24 <mmcgrath> .ticket 2141
20:04:25 <zodbot> mmcgrath: #2141 (Mirrormanager redirects) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2141
20:04:25 <CodeBlock> here
20:04:27 <mmcgrath> mdomsch: that one all done?
20:04:30 * mmcgrath assumes so
20:04:38 <mmcgrath> .ticket 2146
20:04:39 <zodbot> mmcgrath: #2146 (Enable wiki caching.) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2146
20:04:41 <mmcgrath> .ticket 2147
20:04:43 <zodbot> mmcgrath: #2147 (Disable wiki caching) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2147
20:04:46 <mmcgrath> smooge: those two done?
20:04:50 * mmcgrath assumes so :)
20:04:53 <mmcgrath> .ticket 2166
20:04:54 <zodbot> mmcgrath: #2166 (mirrors.fp.o/releases.txt) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2166
20:04:56 <smooge> yep. and should be closed
20:04:59 <mmcgrath> that one's done, closing now
20:05:17 <mmcgrath> And that leaves just one more thing
20:05:20 <mmcgrath> .ticket 2145
20:05:21 <zodbot> mmcgrath: #2145 (Lessons Learned) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2145
20:05:37 <jokajak> sorry i'm late
20:05:42 <jokajak> i'm split between this and $dayjob
20:05:44 <mmcgrath> so all in all I think this release went well, we did launch pretty close to the target time.
20:05:47 <mmcgrath> jokajak: no worries.
20:05:55 <mmcgrath> ricky: how'd things go from your view with the website?
20:06:18 <ricky> Good, had to stay up the night before fixing up some docs stuff
20:06:31 <mmcgrath> yeah, I think docs was the one wildcard that wasn't great.
20:06:32 <ricky> Went smoothly on release day though, didn't delay release with the build for once :-)
20:06:40 <mmcgrath> but it didn't seem to impact users that much so not a big deal.
20:06:51 <mmcgrath> One thing we did see this time around that we had not seen in previous releases is http code 103.
20:06:59 <mdomsch> mmcgrath, yes
20:07:03 <mmcgrath> what is that you ask?  It's not well defined but related to apache and caching.
20:07:13 <mmcgrath> It's not clear to me what users were seeing when they were issued a 103
20:07:22 <mmcgrath> I do know it generally happened with larger(ish) content like images.
20:07:30 <mmcgrath> I couldn't recreate it.
20:07:33 <ricky> Images on fp.o?
20:07:38 <mmcgrath> ricky:  yeah.
20:07:39 <ricky> Or do you mean ISO images?
20:07:42 <mmcgrath> pictures
20:07:44 <ricky> Yow, didn't know about that
20:07:55 <mmcgrath> it was reasonably rare compared to the number of served requests.
20:08:24 <mmcgrath> based on what I saw on the boxes, and from googling around, my theory was that these were images still in the process of loading when users either clicked the "stop" button, killing the transfer.
20:08:35 <mmcgrath> or more likely, found the link they wanted before the whole page loaded.
20:08:48 <ricky> Ahh
20:09:05 <mmcgrath> but not being able to completely recreate it, it's hard to say.
20:09:44 <mmcgrath> anyone have any other questions or comments on this, the release, or anything?
20:09:55 <gholms> When I investigated a bunch of 103s that happened for a while at work, that's what caused it.  Recreating it is more reliable when you have an extremely short round trip between a testing script and the server.
20:10:10 <mmcgrath> gholms: <nod> good to know.
20:10:23 <mmcgrath> I suspect it also helps if the server's under a lot of load and is slower than normal to load those things.
20:10:50 <gholms> I would try doing some testing from a machine in the datacenter if you can; it might be more reliably reproduced.
20:10:59 <mmcgrath> <nod>
20:11:19 <mmcgrath> Ok.
20:11:21 <mmcgrath> so next topic
20:11:23 * ricky will write a test script in a bit
20:11:26 <mmcgrath> #topic CDN
20:11:39 <mmcgrath> Thanks to nb I think our dnssec issues are re-fixed.
20:11:48 <mmcgrath> I think the next step is going to be geodns implementation.
20:11:52 <mmcgrath> which we've been testing on publictest8.
20:12:19 <mmcgrath> The CDN though is going to be a much more detailed project requiring quite a bit of our time to get in place and maintain.
20:12:22 <mmcgrath> but I think it will be worth it.
20:12:30 <mmcgrath> our end users should see much better performance than previously.
20:13:11 <mmcgrath> Anyone have any questions on this?
20:13:16 <mmcgrath> concerns?
20:13:18 <mmcgrath> want to help out?
20:13:30 <smooge> one sec
20:13:48 <smooge> CDN?
20:13:51 <ricky> Any idea what kind of technologies this will involve other than geodns?
20:13:54 <mmcgrath> content distribution network.
20:13:59 <mmcgrath> ricky: everything we have now
20:14:06 <mmcgrath> geodns was the last bit that'd make it worth while.
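The GeoDNS piece being tested on publictest8 boils down to answering DNS queries with whichever proxy is closest to the client. A toy sketch of that selection logic is below; the region codes and proxy hostnames are made up for illustration, and a real deployment would do the region lookup with a GeoIP database inside the DNS server rather than a hand-built dict.

```python
# Toy sketch of GeoDNS answer selection: map a client's region to the
# nearest proxy.  Region codes and hostnames here are illustrative,
# not the actual Fedora proxy layout.
PROXIES = {
    "NA": "proxy1.example.org",  # North American clients
    "EU": "proxy2.example.org",  # European clients
}
DEFAULT_PROXY = "proxy1.example.org"  # fallback for unmapped regions

def pick_proxy(client_region):
    """Return the proxy hostname to hand back for a client's region,
    falling back to the default when the region isn't mapped."""
    return PROXIES.get(client_region, DEFAULT_PROXY)
```

With an answer like this handed out per-query, the existing proxy/caching layer becomes a CDN: the DNS layer does the steering, and the proxies do the serving.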
20:14:16 <mmcgrath> the work though is going to be making sure our caching layer is functioning properly.
20:14:21 <smooge> so an opensource akamai?
20:14:22 <ricky> Ah, cool, I'd definitely be happy to help out
20:14:24 <mmcgrath> smooge: yeah
20:14:28 <mmcgrath> the big thing is metrics.
20:14:45 <mmcgrath> for example, we may want to look closely at serving static content from the proxy servers directly
20:14:56 <mmcgrath> like pkgdb's images and css.
20:15:15 <mmcgrath> we've had some issues in the past wrt caching our admin.fp.o content when someone is logged in.
20:15:35 <CodeBlock> I'd be happy to help out as well, just tell me what I'm doing
20:16:30 <mmcgrath> CodeBlock: k
20:16:37 <mmcgrath> does anyone have any experience actually setting these up?
20:17:05 <mdomsch> not for production use :-)
20:17:15 <mmcgrath> :)
20:17:20 <mmcgrath> well this will be an adventure for all of us.
20:17:30 <mmcgrath> k, moving on if no one has anything else.
20:17:42 <smooge> mmcgrath, no my experience has been more in breaking them
20:17:48 <abadger1999> I think we've fixed the caching/logged in problem
20:17:55 <mmcgrath> smooge: :-D  we'll need some of that too.
20:18:05 <abadger1999> (By setting no cookies on the particular content)
20:18:13 <mmcgrath> abadger1999: k, I may work with you on that to verify.  Because when that happens... that is some scary crap.
20:18:19 <mmcgrath> :)
20:18:20 <abadger1999> <nod>
20:18:26 <mmcgrath> we have a decent staging environment now too so that should help.
20:18:32 <mmcgrath> For those that don't know what I'm talking about....
20:18:59 <mmcgrath> when we first enabled caching on admin.fedoraproject.org, login cookies were getting cached.  So if toshio logged in before me and then I tried to log in, it's possible I'd find myself logged in as toshio.
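The invariant behind abadger1999's fix (no cookies on cacheable content) can be expressed as a simple check: a response must never both set a session cookie and be storable by a shared cache. A sketch of such a verification helper, with hypothetical header values, is below.

```python
# Sketch: detect the dangerous combination behind the cached-login-cookie
# incident -- a response that sets a cookie AND is cacheable by a shared
# cache.  Header names follow HTTP; the example values are hypothetical.
def unsafe_to_cache(headers):
    """Return True if `headers` (a dict of response headers) both set a
    cookie and lack any Cache-Control directive that would keep a
    shared cache from storing the response."""
    cache_control = headers.get("Cache-Control", "").lower()
    sets_cookie = "Set-Cookie" in headers
    shared_cache_may_store = not any(
        directive in cache_control
        for directive in ("private", "no-store", "no-cache")
    )
    return sets_cookie and shared_cache_may_store
```

A staging-environment test could walk the logged-in pages and assert this never returns True for any response the cache layer sees.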
20:19:14 <mmcgrath> Boy was that a fun day.
20:19:22 <ricky> Hehe
20:19:24 <mmcgrath> ok, anyone have any other questions or comments on that?
20:19:26 <skvidal> I wanna be toshio!
20:19:40 <mmcgrath> ok, that transitions into
20:19:45 <mmcgrath> #topic InterNetX -- new sponsor
20:19:59 <mmcgrath> I'm happy to say we have a new machine in the EU (hydh hooked us up)
20:20:05 <skvidal> so are they all ipv6 all the time?
20:20:07 <mmcgrath> it has ipv6, good connection.
20:20:15 <skvidal> if so then it's nice to see puppet (and func) both work sanely
20:20:25 <mmcgrath> they have both ipv6 and ipv4.
20:20:37 <hydh> native ipv6
20:20:37 <smooge> skvidal, I tried to be toshio but my feet grew 3 feet
20:20:44 <hydh> and 1gbit uplink
20:20:45 <mmcgrath> I'm thinking at a minimum we're going to need to move noc2 out there.
20:20:54 <ricky> Yeees!
20:20:54 <smooge> yes
20:20:55 <mmcgrath> because with 2 ipv6 connections, it's time to start actually monitoring them.
20:21:06 <mmcgrath> and I'm honestly not sure if nagios supports it.
20:21:10 <mmcgrath> though I'd imagine it does.
20:21:35 <smooge> nagios 3 does.. [isn't that the default line for anything that nagios currently doesn't?]
20:21:41 <mmcgrath> hahahahah
20:21:46 <mmcgrath> does nagios bake cookies?
20:21:48 <mmcgrath> nagios 3 does...
20:21:55 <smooge> mmmmmm coookies
20:22:13 * gholms has a nagios plugin alerting him to lunch time as it approaches
20:22:14 <skvidal> mmcgrath: if you combine nagios 3 with btrfs it cures cancer!
20:22:19 <mmcgrath> Ok, so we'll be getting that brought online soon.
20:22:19 <rsc> mmcgrath: InterNetX from Germany is sponsoring Fedora?
20:22:21 <mmcgrath> smooge: Ponies!
20:22:31 <mmcgrath> rsc: yup
20:22:43 <mmcgrath> rsc: so we can finally close some of your "X is slow to load" tickets :)
20:22:46 <mmcgrath> ricky: how's proxy2 going btw?
20:22:57 <ricky> So far so good, done with puppet, getting it on func
20:23:08 <rsc> mmcgrath: that would be great. Because InterNetX has powerful infrastructure and really good IPv6 connectivity :)
20:23:11 <mmcgrath> ricky: ever figure out what was causing that error?
20:23:21 <ricky> Nope, but I'm definitely not done looking :-)
20:23:27 <mmcgrath> rsc: it's a done deal.  It's already handed over to us, we're just still in the process of building it :)
20:23:32 <smooge> ponies and cookies. can my day get any better
20:23:34 <mmcgrath> ricky: did you just end up using my.. eh hem... hack?
20:23:41 <ricky> Yeah
20:23:51 <ricky> Shocking how fast it still was..
20:25:03 <mmcgrath> yeah
20:25:07 <mmcgrath> anyone have anything else on this?
20:25:22 <smooge> not me
20:25:32 <mmcgrath> ok.  next one
20:25:34 <mmcgrath> #topic koji backups
20:25:48 <mmcgrath> so I've been working to move koji backups from the tape drives to dedicated storage.
20:25:57 <mmcgrath> in this case 6U of storage including a 2U server and 2 2U disk trays.
20:25:57 <skvidal> ricky: getting it on func? doesn't puppet do that automagically now?
20:26:04 <skvidal> ricky: it's the same cert
20:26:28 <ricky> skvidal: Thank you - you just saved me the time I was going spend debugging nothing :-)
20:26:31 <mmcgrath> It's been backing up for about a week and a half now.
20:26:46 <skvidal> ricky: if you login to puppet1
20:26:48 <skvidal> you can run
20:26:57 <smooge> is it because of disk speed OR just churn of whats in koji?
20:27:01 <skvidal> sudo func 'proxy02*' call test ping
20:27:09 <skvidal> ricky: which will tell you if it works
20:27:14 <mmcgrath> smooge: well it doesn't seem to be the disks on the backup server.  Must just be the /mnt/koji speed.
20:27:18 <skvidal> damn it
20:27:24 <skvidal> I need to write up the docs on using func in FI
20:27:26 * skvidal makes a note
20:27:44 <smooge> skvidal, the sticky note from last week fell off?
20:27:47 <mmcgrath> anyone have any questions or comments on that?
20:28:21 <mmcgrath> k
20:28:23 <mmcgrath> next topic
20:28:26 <mmcgrath> #topic /mnt/koji
20:28:31 <mmcgrath> from backups, onto the real deal.
20:28:33 <skvidal> smooge: be nice
20:28:36 <skvidal> :)
20:28:38 <mmcgrath> dgilmore: when do you want to move /mnt/koji over?
20:28:38 <dgilmore> mmcgrath: it took me over a week to rsync /mnt/koji/packages onto the EqualLogic
20:28:47 <mmcgrath> I assume we're going to want to wait until the other EqualLogic is on site?
20:28:52 <mmcgrath> mdomsch: you know we own two of those now right?
20:28:55 <dgilmore> mmcgrath: need to do a full sync again
20:29:05 <dgilmore> mmcgrath: and i want to test my rm -rf
20:29:17 <mdomsch> mmcgrath, no, that's great!
20:29:17 <dgilmore> that will free up 1.2T or so
20:29:29 <mmcgrath> mdomsch: they're not both on site yet but they will be.
20:29:32 <dgilmore> mmcgrath: and I'd kinda like to wait till we get the second unit in place
20:29:34 <mmcgrath> dgilmore: I have some port concerns.
20:29:40 * CodeBlock needs to head out for a while, sorry. back later
20:29:42 <mmcgrath> dgilmore: makes me wonder if we can use crossover cables.
20:29:45 <mmcgrath> CodeBlock: no worries.
20:29:52 <dgilmore> mmcgrath: how we are using it we will only use a single port ever
20:30:11 <dgilmore> mmcgrath: it doesn't do port bonding
20:30:16 <mmcgrath> dgilmore: k, we may need to communicate that to the network team because last time I was there I swear we had like 6 ports plugged in.
20:30:25 <dgilmore> mmcgrath: with a single client we are really only using a single port
20:30:27 <mmcgrath> oh that's right, or it does do bonding, just not the type our network team can work with.
20:30:30 <mmcgrath> yeah.
20:30:40 <dgilmore> it doesn't do bonding, period
20:31:02 <mmcgrath> dgilmore: it'd be nice to have the two units talking to eachother over a dedicated link though
20:31:06 <dgilmore> it's designed to balance client load by sending different clients to different ports
20:31:12 <mmcgrath> or is that not how it's to be setup?
20:31:27 <dgilmore> mmcgrath: not really how its designed
20:31:43 <mmcgrath> k, well when the time comes I'll leave it to you.
20:32:00 <mmcgrath> I would like to have this all up and running asap.  only because it takes so long to get going, I'd hate for this to bump up into the alpha.
20:32:09 <dgilmore> our usage is really outside of their normal use case
20:32:10 <mmcgrath> dgilmore: do you know when the EqualLogic will ship?
20:32:18 <dgilmore> mmcgrath:  let me ask my boss
20:32:20 <smooge> when will we get it out there? and do you need me to be there physically?
20:33:08 <dgilmore> smooge: someone will need to rack it
20:33:14 <mmcgrath> smooge: I don't think we will, jonathan setup the last one.  I suspect he will this time too.
20:33:18 <dgilmore> and hook up serial port
20:33:32 <smooge> okit dokit
20:33:35 <dgilmore> but I hope Jonathan can do it
20:34:27 <mmcgrath> k, any other questions on that?
20:34:31 <dgilmore> with the second unit i want to do raid10 over the 32 1tb drives in the 2 units
20:34:32 <mmcgrath> if not we'll move on.
20:34:41 <dgilmore> but im done now
20:35:07 <mmcgrath> alrighty
20:35:08 <mmcgrath> with that
20:35:11 <mmcgrath> #topic open floor
20:35:15 <mmcgrath> anyone have anything they'd like to discuss?
20:35:34 <mdomsch> mmcgrath, did you ever #endmeeting the meetbot in -admin ?
20:35:43 * ricky checked this morning and it was gone
20:35:49 <mdomsch> ok
20:35:59 <mmcgrath> mdomsch: I did :)
20:36:13 <mmcgrath> towards the end of the day, after traffic was getting back down to normal, I ended it.
20:36:21 <mmcgrath> I should have sent the logs to the list, but they are available
20:36:26 <mdomsch> np
20:36:35 <mdomsch> it wasn't a very exciting day - just like I like it
20:36:52 <mdomsch> either our processes have gotten so good that release day is a non-event
20:36:56 <mdomsch> or our traffic was down, or both
20:37:10 <mmcgrath> no doubt.
20:37:17 * mmcgrath thinks it was a little of both.
20:37:18 <mdomsch> thanks to Oxf13 for getting the bits to the mirrors so early
20:37:24 <mmcgrath> but if things aren't broke, we're doing what we can :)
20:37:35 <mmcgrath> yeah, that helps too I bet quite a bit
20:37:37 <mdomsch> I didn't hear anyone complain about not getting the bits; I did hear about slow torrents
20:37:45 <mmcgrath> when people show up from the release and can't get to it, I suspect there's a lot more re-loading on our servers.
20:37:48 <mdomsch> but that's also because they aren't advertised so much anymore
20:37:59 <mmcgrath> mdomsch: I had good luck with the torrents but I wasn't paying attention to all of them.
20:38:29 * smooge is so used to having torrents blocked that I forgot to do anything with them
20:38:38 <mmcgrath> :)
20:38:47 <mmcgrath> ok, anyone have anything else to discuss?
20:38:50 <smooge> not me
20:38:55 <mdomsch> all quiet here
20:39:03 <smooge> going to just shoot the telemarketers who keep calling
20:39:04 <mmcgrath> alrighty, we'll close in 30
20:39:26 <dgilmore> smooge: ive had a bunch of them recently
20:39:48 <mmcgrath> :)
20:39:49 <mmcgrath> ok
20:39:50 <smooge> election people wanting me to vote for someone.. but I unregistered from parties last year so they are wasting their time
20:39:50 <mmcgrath> #endmeeting