20:00:24 <mmcgrath> #startmeeting infrastructure
20:00:24 <zodbot> Meeting started Thu Sep 16 20:00:24 2010 UTC. The chair is mmcgrath. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:24 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
20:00:29 <mmcgrath> #meetingname infrastructure
20:00:29 <zodbot> The meeting name has been set to 'infrastructure'
20:00:33 <mmcgrath> #topic who's here
20:00:35 * lmacken
20:00:35 <mmcgrath> who's here?
20:00:38 <abadger1999> hey
20:01:30 * mmcgrath waits a bit
20:02:10 <abadger1999> mmcgrath: Wanna have open floor first? :-)
20:02:26 <mmcgrath> abadger1999: actually that would be great, I'm needing to get tickets in place for the beta release.
20:02:33 <mmcgrath> We can end with it too just in case
20:02:36 * mdomsch
20:02:38 <mmcgrath> #topic Open Floor
20:02:41 <mmcgrath> abadger1999: you have something?
20:02:45 <abadger1999> yeah
20:02:46 <abadger1999> https://fedoraproject.org/wiki/LATAM_Infrastructure
20:03:01 <abadger1999> Talked to the latam infra people a few weeks ago and been forgetting to bring it up.
20:03:14 <mmcgrath> Yeah I had a conversation or two with them as well
20:03:14 <abadger1999> I had them list some of the things that they need.
20:03:25 <mmcgrath> basically got just far enough to tell them not to auth against us except for using ssh keys.
20:03:36 <abadger1999> I don't know how we can satisfy them but I figure knowing what the issues are is the first step.
20:03:38 <mmcgrath> which is not particularly helpful for what they're trying to do unfortunately :(
20:03:55 <smooge> here
20:04:06 <mmcgrath> abadger1999: did they want us to host their DNS?
20:04:26 <abadger1999> mmcgrath: They want to get it so that it's not just Rodrigo being the contact.
20:04:44 <abadger1999> mmcgrath: I think that they're for us hosting it... figured that would be pretty easy to do.
20:04:56 <mmcgrath> yeah it's a transfer, and something we've done several times before.
20:05:02 <mmcgrath> it is, however, time consuming for some reason.
20:05:05 <mmcgrath> it just takes a while.
20:05:14 <smooge> a looooong while
20:05:23 <smooge> 1 year if you are in Malaysia
20:05:27 <mmcgrath> abadger1999: can you give a roundup of what all you talked about and what they're wanting to do?
20:05:49 <abadger1999> Easy stuff: get away from single points of person failure
20:06:04 <abadger1999> Like transferring DNS to fedora project so that one person can't take away the domain.
20:06:25 <abadger1999> Social stuff - integrate better into fedora.
20:06:29 <mmcgrath> do they have a team of sysadmins?
20:06:39 <mmcgrath> or something similar at least?
20:06:52 <abadger1999> ie: right now latam infra and community is pretty isolated from the North Americans/Europeans.
20:06:56 <abadger1999> Yes.
20:07:08 <abadger1999> All volunteers so they don't have as much time as we do.
20:07:12 <abadger1999> Nor the hardware we do.
20:07:21 * dgilmore turns up
20:07:29 <abadger1999> But gomix nushio dbruno are all on the sysadmin team.
20:07:30 <mmcgrath> are they wanting to make websites for non-latam people?
20:07:51 <abadger1999> Not sure -- They want to make web apps for non-latam people.
20:07:54 <mmcgrath> or are they just focusing on it, but would like better access to the rest of the community for... idea sharing? I'm not sure what word I want to use there.
20:08:02 <mmcgrath> knowledge pool is probably better.
20:08:12 <abadger1999> timpus -- events platform for all of the ambassadors everywhere.
20:08:23 <abadger1999> for instance.
20:08:41 <abadger1999> So they're more than just websites/documents.
20:09:01 <mmcgrath> and they're looking to host that for the larger ambassador community?
20:09:07 <abadger1999> Right.
20:09:24 <mmcgrath> I'm generally for that, I know this is probably a tough pill for some to swallow and might look weird.
20:09:39 <mmcgrath> but if we can properly empower teams like that to host their stuff, it lowers the barriers for them to create those apps
20:09:49 <abadger1999> Yep. I agree.
20:09:50 <mmcgrath> while allowing us to keep the high quality architecture we currently have.
20:10:04 <abadger1999> i'm just not sure of how to make it all smooth.
20:10:22 <abadger1999> Like how to make the events platform auth against fas in a way that doesn't compromise security.
20:10:25 <mmcgrath> so we don't end up committing to a bunch of... side apps? I'm not sure how to say that without seeming negative because I'm really for teams being able to provide for themselves where they are able.
20:10:39 <mmcgrath> abadger1999: yeah, that's the big 'got'cha' right now
20:11:18 <abadger1999> gomix and nushio will be at fudcon tempe so it might be good to have some plans around figuring out what we can do there.
20:11:27 <mmcgrath> yeah
20:11:27 <abadger1999> But also figuring out options right now would be good.
20:12:02 <mmcgrath> abadger1999: one thing I wanted to think about is if there's any sort of auth mechanism where the password itself never leaves the browser.
20:12:05 <abadger1999> Like SSL auth for their sites Or something.
20:12:07 <mmcgrath> but the encrypted form would?
20:12:20 <mmcgrath> I'm not sure how sensitive encrypted passwords should be considered.
20:12:28 <ninjazjb> Hello everyone, this is Jason Brown
20:12:31 <mmcgrath> just something else I thought was worth investigating.
20:12:38 <mmcgrath> ninjazjb: hello Jason, glad you could make it
20:12:49 <abadger1999> mmcgrath: There is -- but you still have to be careful about replay attack or simply, MITM causing something different than you expect to happen.
20:12:59 <ninjazjb> Thanks
20:13:00 <dgilmore> mmcgrath: id feel more comfortable with using ssl auth
20:13:41 <mmcgrath> abadger1999: yeah, I guess a replay could cause other non-official sites to get jacked at that point
20:13:45 <mmcgrath> anyway, a conversation for another time.
20:13:49 <mmcgrath> abadger1999: what else you got?
20:14:11 <abadger1999> That's it from me for now -- just wanted to get us thinking about it before fudcon.
20:14:14 <dgilmore> not that i think they would do it but there would be the potential to harvest passwords which would take constant code audits to make sure it doesn't happen
20:14:18 <abadger1999> And point out the wiki page with the brainstorming
20:14:48 <mmcgrath> abadger1999: thanks
20:14:58 <mmcgrath> Ok, if no one has anything else on that, we'll get down to the F14beta business.
20:15:40 <mmcgrath> ok, lets do it
20:15:47 <mmcgrath> #topic Fedora 14 Beta.
20:16:11 <mmcgrath> https://fedorahosted.org/fedora-infrastructure/report/9
20:16:18 <mmcgrath> .ticket 2392
20:16:19 <zodbot> mmcgrath: #2392 (New website) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2392
20:16:30 * mmcgrath tries to summon sijis
20:16:57 <mmcgrath> we can skip that one for now
20:17:00 <mmcgrath> .ticket 2393
20:17:01 <zodbot> mmcgrath: #2393 (Verify Mirror Space) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2393
20:17:03 <mmcgrath> I'll get this one
20:17:26 <mmcgrath> .ticket 2394
20:17:30 <zodbot> mmcgrath: #2394 (Release day ticket) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2394
20:17:46 <mmcgrath> I'll nab that
20:17:51 <mmcgrath> actually
20:17:59 <mmcgrath> smooge: do you want to do the release day coordination this time?
20:18:47 <mmcgrath> we'll come back to that
20:18:53 <mmcgrath> .ticket 2392
20:18:54 <zodbot> mmcgrath: #2392 (New website) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2392
20:19:00 <mmcgrath> sijis: will we have a fancy new beta site?
20:19:16 <sijis> yep. i think we definitely will
20:19:17 <mmcgrath> or the old beta site? what's the plan there?
20:19:27 <sijis> well. for GA == new site
20:19:35 <sijis> for Beta = existing site
20:19:42 <smooge> mmcgrath, is that putting in new tickets or doing overall tickets
20:19:43 <mmcgrath> you guys sure you don't want to release that a bit earlier than the actual release day?
20:19:49 <mmcgrath> smooge: one sec
20:19:57 <smooge> np slow typing
20:20:46 <mmcgrath> sijis: ok, well I do look forward to the new site. Are you going to be point person for this release?
20:20:50 <sijis> mmcgrath: i don't think we'll have the site completely finished for beta
20:20:56 <sijis> yup
20:21:11 <mdomsch> I'd be happy to see the new website live a few days ahead of the release...
20:21:17 <mdomsch> even if it's not done by beta
20:21:30 <mdomsch> build momentum for the actual release day
20:21:31 <mmcgrath> sijis: can you accept that ticket?
20:21:32 <smooge> a week before release :)?
20:21:34 <mmcgrath> mdomsch: yeah
20:21:36 <mmcgrath> ok
20:21:39 <mdomsch> and not risk blowing things up on release day
20:21:45 <mmcgrath> mdomsch: that's my main concern.
20:21:52 <sijis> mmcgrath: will do
20:21:57 <mmcgrath> sijis: thanks
20:22:08 <sijis> you mean ticket 2392 or another one?
20:22:20 <mmcgrath> sijis: 2392
20:22:25 <mmcgrath> we'll move on to the next ticket :)
20:22:28 <mmcgrath> .ticket 2394
20:22:29 <zodbot> mmcgrath: #2394 (Release day ticket) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2394
20:22:36 <mmcgrath> smooge: would you like to do this?
20:22:53 <mmcgrath> It's basically just making sure everything gets done prior to us sending the announcement out.
20:22:53 <smooge> taking
20:23:06 <mmcgrath> for example, the website should be up and ready, all the links should work
20:23:09 <mmcgrath> that sort of thing.
20:23:18 <mmcgrath> smooge: sweet
20:23:20 <mmcgrath> ok
20:23:25 * sijis will make sure links work this time :)
20:23:52 <mmcgrath> smooge: the only downside for you is I think you'll have to get up early because release time is 8:00 am your time.
20:24:01 <smooge> I am of the opinion that for some of our audience a web page with a long list of href's is all we ever need :/
20:24:06 <mmcgrath> the website should generally get started around 7:30 your time because it takes a while to sync, that sort of thing.
20:24:20 <mmcgrath> ok, well it'll be good to have someone else go through that process for a change anyway
20:24:24 <smooge> ah ok so that day I need to be up at 0400
20:24:24 <mmcgrath> smooge: any questions?
20:24:33 <mmcgrath> :)
20:24:33 <smooge> to get coffee into system
20:24:43 <smooge> when is it currently planned?
20:24:53 <smooge> October?
20:25:02 <smooge> Or are we talking beta
20:25:06 <mmcgrath> September 28th
20:25:10 <smooge> crap
20:25:12 <mmcgrath> this one's the beta
20:25:13 <smooge> I can't do that
20:25:20 <smooge> I am in RDU that day for class.
20:25:22 <mmcgrath> that's ok, that's why we discuss these things :)
20:25:25 <mmcgrath> I'll grab that one
20:25:32 <smooge> I will be up though :)
20:25:51 <mmcgrath> ok, next ticket
20:25:57 <mmcgrath> .ticket 2395
20:25:58 <zodbot> mmcgrath: #2395 (Verify releng permissions) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2395
20:26:04 <mmcgrath> smooge: you want to get that one?
20:26:12 <smooge> taking
20:26:34 <mmcgrath> .ticket 2396
20:26:35 <zodbot> mmcgrath: #2396 (Add MirrorManager repository redirects) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2396
20:26:38 <mmcgrath> mdomsch: got it?
20:26:47 <mdomsch> yup
20:27:14 <mmcgrath> excellent
20:27:19 <mmcgrath> .ticket 2397
20:27:20 <zodbot> mmcgrath: #2397 (Infrastructure Change Freeze.) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2397
20:27:26 <mmcgrath> I'll accept this one, we're already in the freeze.
20:27:32 <mmcgrath> enjoy all the new infrastructure-list traffic :)
20:27:42 <mmcgrath> and last
20:27:44 <mmcgrath> .ticket 2398
20:27:45 <zodbot> mmcgrath: #2398 (Lessons Learned) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2398
20:27:48 <mmcgrath> that's for after the release.
20:28:21 <smooge> ok got it
20:28:36 <mmcgrath> and that's that
20:28:40 <mmcgrath> Oxf13: ping
20:28:47 <Oxf13> mmcgrath: hi
20:28:49 <smooge> is he on a plane
20:28:54 <smooge> no he is on the ground
20:29:03 <mmcgrath> Oxf13: time for your favorite question. You got any odds for chance of beta slip?
20:29:39 <Oxf13> I have no idea, I'm out of the loop
20:30:03 <mmcgrath> yeah I know you've been busy.
20:30:13 <mmcgrath> Oxf13: who might know better?
20:30:47 <Oxf13> jlaska/adamw of QA, dlehman of Anaconda, dgilmore/notting of releng
20:30:53 <mmcgrath> adamw: ping
20:30:54 <mmcgrath> jlaska: ping
20:30:57 <jsmith> mmcgrath: https://bugzilla.redhat.com/showdependencytree.cgi?id=611991&hide_resolved=1
20:31:05 <jsmith> mmcgrath: That might be a pretty good indicator :-/
20:31:13 <jlaska> mmcgrath: we have 2 bugs left
20:31:17 * jlaska just sent mail to devel list
20:31:27 <dgilmore> the last bugs are supposed to be fixed today
20:31:27 <mmcgrath> jlaska: you don't have to commit to it but you think probably not going to slip? like 10% chance?
20:31:30 <jlaska> we need someone from kernel to provide guidance on bug#629719
20:31:45 <dgilmore> once we compose that and start testing we will know better
20:31:50 <jlaska> mmcgrath: if we can't compose an RC on time, chances aren't good
20:32:09 <mmcgrath> k, I'll follow up again next week.
20:32:27 <jlaska> there are only 2 bugs remaining ... the installer issue dlehman has a handle on ... but we need kernel guidance for the remaining dmraid issue
20:32:28 <mmcgrath> sounds like there's still some unknowns.
20:32:34 <mmcgrath> <nod>
20:32:43 <jlaska> much better shape than yesterday, but still not 0
20:32:52 <mmcgrath> jlaska: thanks
20:33:01 <mmcgrath> ok, anyone have any questions, comments or concerns wrt the beta release?
20:33:17 <jlaska> hmmm ...
20:33:38 <mmcgrath> alrighty :)
20:33:47 <mmcgrath> #topic Strange pkgdb / bodhi outages on app5/6
20:34:18 <mmcgrath> so I was working with abadger1999 and lmacken just before the freeze to try to figure out what on earth was going on with apps on app5 and 6.
20:34:36 <mmcgrath> for those of you that don't know, basically app5 and 6 are considered backups. they don't get live traffic because they're offsite.
20:34:47 <mmcgrath> but, if for some reason all the production app servers go down, they pick up the slack.
20:35:08 <mmcgrath> well, even with no traffic, sometimes bodhi or pkgdb would hang, sometimes for hours.
20:35:14 <mmcgrath> and then they'd recover on their own.
20:35:17 <mmcgrath> it was incredibly strange.
20:35:25 <mmcgrath> the hosts were low load, db access was fine.
20:35:40 <mmcgrath> and both being tg apps it was extra strange that both of them getting in that state at the same time on the same server was low
20:36:02 <mmcgrath> I'm still not sure of a root cause, but I believe some of the wsgi processes were hanging, which was causing apache to block new requests from getting in.
20:36:21 <mmcgrath> So to bandaid that, we increased the number of processes available to each.
20:36:24 <mmcgrath> and so far. good luck.
20:36:29 <mmcgrath> I haven't seen any outage
20:36:39 <mmcgrath> at least not related to that
20:36:50 <mmcgrath> we have had some from the database filling up
20:36:57 <mmcgrath> anyway, any questions or comments on that?
20:37:40 <mmcgrath> alrighty
20:37:44 <mmcgrath> #topic pkgdb caching
20:37:59 <mmcgrath> abadger1999: any issues seen since we started caching image content?
20:38:17 <abadger1999> It's been smooth.
20:38:20 <mmcgrath> .headers https://admin.fedoraproject.org/pkgdb/appicon/show/Terminator
20:38:21 <zodbot> mmcgrath: apptime: D=215947, content-length: 3412, x-varnish: 2111766604, age: 0, expires: Tue, 21 Sep 2010 20:38:20 GMT, connection: close, server: Apache/2.2.3 (Red Hat), appserver: app03.phx2.fedoraproject.org, proxyserver: proxy01.phx2.fedoraproject.org, via: 1.1 varnish, cache-control: max-age=432000, date: Thu, 16 Sep 2010 20:38:20 GMT, content-type: image/png, proxytime: D=218092
20:38:29 <mmcgrath> .headers https://admin.fedoraproject.org/pkgdb/appicon/show/Terminator
20:38:29 <zodbot> mmcgrath: apptime: D=215947, content-length: 3412, x-varnish: 2111766630 2111766604, age: 9, expires: Tue, 21 Sep 2010 20:38:29 GMT, connection: close, server: Apache/2.2.3 (Red Hat), appserver: app03.phx2.fedoraproject.org, proxyserver: proxy01.phx2.fedoraproject.org, via: 1.1 varnish, cache-control: max-age=432000, date: Thu, 16 Sep 2010 20:38:29 GMT, content-type: image/png, proxytime: D=664
20:38:35 <abadger1999> mmcgrath: I don't know how much it helped -- need to ask mbacovsk or someone on the other end of a slow pipe from the servers.
20:38:36 <mmcgrath> hey hey, age. that's what I like to see.
20:38:59 <mmcgrath> for me I got the time generally cut in half.
20:39:05 <mmcgrath> but it's still several seconds for a large page list.
20:39:09 <abadger1999> <nod>
20:39:12 <mmcgrath> expires headers does seem to be working properly
20:39:50 <smooge> how goes varnish with this?
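The two .headers responses above are internally consistent, and that can be checked by hand: max-age=432000 seconds is 5 days, and adding it to the Date value gives the Expires value. A minimal Python sketch of that arithmetic follows. The function names are illustrative only and are not part of pkgdb, zodbot, or varnish, and the D= conversion assumes the values come from Apache's %D format, which reports microseconds.

```python
from datetime import timedelta
from email.utils import format_datetime, parsedate_to_datetime


def expected_expires(date_header: str, cache_control: str) -> str:
    """Return the Expires value implied by Date plus Cache-Control max-age."""
    max_age = None
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age":
            max_age = int(value)
    if max_age is None:
        raise ValueError("no max-age directive found")
    expires = parsedate_to_datetime(date_header) + timedelta(seconds=max_age)
    return format_datetime(expires, usegmt=True)


def d_to_ms(value: str) -> float:
    """Convert a 'D=<microseconds>' timing value to milliseconds (assumed unit)."""
    return int(value.partition("=")[2]) / 1000


# Values copied from the first .headers response in the log.
print(expected_expires("Thu, 16 Sep 2010 20:38:20 GMT", "max-age=432000"))
print(d_to_ms("D=218092"), "ms on the miss,", d_to_ms("D=664"), "ms on the hit")
```

Under that microsecond assumption, the proxy time drops from roughly 218 ms on the first request (age: 0) to under 1 ms on the second (age: 9), which is what a cache hit should look like.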
20:40:28 <mmcgrath> abadger1999: one thing I've noticed... expires doesn't seem to be working
20:40:31 <mmcgrath> and i'm not sure why
20:40:38 <mmcgrath> my browser has these images, it shouldn't be re-requesting them.
20:40:45 <mmcgrath> it could be related to the auth / cookie. I need to research it.
20:41:05 <mmcgrath> smooge: well, basically we have set aside a part of the pkgdb namespace (/pkgdb/appicon/show)
20:41:10 <mmcgrath> and we're doing two things with it
20:41:27 <mmcgrath> when a cookie gets sent, varnish unsets it to request the data, when it does get the data, it unsets the cookie and sends it back.
20:41:39 <mmcgrath> because cherrypy wants to set a cookie with every request.
20:42:08 <smooge> ah ok
20:42:15 <mmcgrath> anyone have any questions or comments?
20:42:25 <mmcgrath> or ideas as to why firefox is ignoring the expires header :)
20:42:37 <mmcgrath> abadger1999: etagging would be helpful here. FWIW.
20:42:55 <mmcgrath> ok, that's all I've got
20:42:59 <mmcgrath> #topic Open Floor
20:43:05 <mmcgrath> anyone have anything else they'd like to discuss?
20:43:07 <mmcgrath> anything at all?
20:43:12 <smooge> fas
20:43:25 <mmcgrath> smooge: hit it
20:43:28 <smooge> we are having issue with the fas servers at the moment
20:43:40 <mmcgrath> oh right right
20:43:44 <mmcgrath> that was on my list and I forgot :)
20:43:54 <smooge> we have an open bugzilla on it and I am trying to get the data to developers as soon as possible
20:44:14 <smooge> it looks like something with swap space just not working under certain loads
20:44:39 <smooge> and when swap space quits working.. OOM gets hungry
20:45:07 <smooge> so we have some interesting OOPS but not much else.
20:45:28 <smooge> It seems to occur on the servers rather regularly at 03:30-03:50
20:45:30 <mmcgrath> interesting
20:45:33 <smooge> but not sure why
20:45:35 <mmcgrath> I'm surprised we're using swap on there at all
20:45:37 <mmcgrath> https://admin.fedoraproject.org/collectd/bin/index.cgi?hostname=fas02&plugin=swap&timespan=604800&action=show_selection&ok_button=OK
20:45:41 <smooge> http grows
20:45:45 <mmcgrath> even still, its not a lot.
20:46:00 <smooge> no it isnt.. and when the problem occurs it is not like its heavy in swap
20:46:02 <mmcgrath> smooge: are they still all rebooting at least every 24 hours?
20:46:08 <smooge> just all of a sudden no more swap for you
20:46:23 <smooge> well the new kernel has slowed that down a bit
20:46:39 <smooge> but not sure why. I am expecting tonight to be a hit
20:46:47 <mmcgrath> k
20:46:53 <mmcgrath> smooge: thanks for following up and tracking that issue
20:46:57 <smooge> 2 nights ago we had all 3 reboot and looking at the db02 data
20:47:08 <smooge> we had a TON of fas connections beyond normal at that time
20:47:12 <smooge> not sure why yet
20:47:54 <mmcgrath> yeah
20:48:48 <smooge> EOF
20:48:53 <mmcgrath> alllrighty
20:48:55 <mmcgrath> thanks :)
20:49:01 <smooge> np
20:49:03 <mmcgrath> if no one has anything else, we'll close in 30
20:49:13 <smooge> allergies killing me softly with sneezes
20:49:20 <mmcgrath> bummer :(
20:49:24 <rbergeron> ....
20:49:26 <mmcgrath> that's never fun
20:49:30 <gholms> As usual, the Cloud SIG meeting starts at the top of the hour for those of you who are interested. ;)
20:49:39 <mmcgrath> and that's it!
20:49:40 <mmcgrath> #endmeeting
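The cookie handling described under the pkgdb caching topic (drop the cookie before fetching an icon from the backend, then drop the cookie the backend tries to set before answering the client) can be modelled roughly as below. This is a plain Python illustration of the idea, not the VCL that was actually deployed; the function names are hypothetical, and only the /pkgdb/appicon/show prefix comes from the meeting.

```python
# Namespace set aside for cacheable icons, as named in the meeting.
CACHEABLE_PREFIX = "/pkgdb/appicon/show"


def _without(headers: dict, name: str) -> dict:
    """Return a copy of headers with the named header removed (case-insensitive)."""
    return {k: v for k, v in headers.items() if k.lower() != name.lower()}


def prepare_backend_request(path: str, request_headers: dict) -> dict:
    """Strip Cookie on cacheable paths so every client shares one cache entry."""
    if path.startswith(CACHEABLE_PREFIX):
        return _without(request_headers, "Cookie")
    return dict(request_headers)


def prepare_client_response(path: str, response_headers: dict) -> dict:
    """Strip Set-Cookie on cacheable paths so the stored object carries no session."""
    if path.startswith(CACHEABLE_PREFIX):
        return _without(response_headers, "Set-Cookie")
    return dict(response_headers)


icon = "/pkgdb/appicon/show/Terminator"
print(prepare_backend_request(icon, {"Cookie": "session=abc", "Accept": "image/png"}))
print(prepare_client_response(icon, {"Set-Cookie": "session=xyz", "Content-Type": "image/png"}))
```

Paths outside the icon namespace pass through untouched in this sketch, so authenticated pkgdb pages would keep their session cookies.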