18:00:04 <nirik> #startmeeting Infrastructure (2014-08-21)
18:00:04 <zodbot> Meeting started Thu Aug 21 18:00:04 2014 UTC. The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:04 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
18:00:05 <nirik> #meetingname infrastructure
18:00:05 <nirik> #topic aloha
18:00:05 <nirik> #chair smooge relrod nirik abadger1999 lmacken dgilmore mdomsch threebean pingou puiterwijk
18:00:05 <zodbot> The meeting name has been set to 'infrastructure'
18:00:05 <zodbot> Current chairs: abadger1999 dgilmore lmacken mdomsch nirik pingou puiterwijk relrod smooge threebean
18:00:10 * threebean is here
18:00:14 * puiterwijk is arround
18:00:20 * relrod here
18:00:25 <mpduty> is here
18:00:29 * banas is here as well
18:00:44 * bochecha is here
18:01:04 * lmacken
18:01:19 * adimania is here
18:01:51 * lanica is here for the infra meeting.
18:02:02 <nirik> hello everyone. ;) lets go ahead and get started...
18:02:05 <nirik> #topic New folks introductions and Apprentice tasks
18:02:14 <nirik> any new folks? or apprentices with questions/comments/ideas?
18:03:08 <mpduty> can there be an IRC class on ansible?
18:03:15 <banas> I can't remember if this is the slot I'm supposed to update - but work on GG is going good, as per schedule :)
18:03:17 <sart> is here in an un-italicized fashion
18:03:43 <nirik> mpduty: we had one a while back, we could do another if there's interest, sure. ;)
18:04:49 <banas> we had to skip two gg meetings because I was at Flock, and now the whole team is sick, so we just decided to do it next week.
18:04:52 <mpduty> nirik, well I don't know but some newcomers would be able to learn things faster I feel
18:05:08 <mpduty> including myself
18:05:20 * mirek-hm is late, but here
18:05:26 <nirik> mpduty: sure, we could ask on the list if folks are interested. I'm happy to do some intro one. We also had a nice intro at flock that was recorded... ;)
18:05:29 <sart> <-- long-time newcomer
18:05:35 * adimania thinks that it might be helpful to everyone
18:06:04 <nirik> https://www.youtube.com/watch?v=sCXCgsmQuSY&feature=youtu.be
18:06:18 * lbazan here
18:06:25 <mpduty> thanks I shall go through that
18:06:33 <nirik> welcome sart. ;)
18:07:06 <sart> thanks and ty for the intro vid
18:08:11 <nirik> ok, any other new folks or questions?
18:08:31 * pjones waves hello
18:08:48 <nirik> hey pjones. :)
18:09:04 <nirik> #topic Applications status / discussion
18:09:18 <nirik> pingou wasn't able to make it but had a few things for me to pass on:
18:09:28 <nirik> #info new pkgdb2 release in production this week.
18:09:50 <nirik> #info critpath lists are now working again.
18:10:22 <nirik> #info infrastructure jenkins is updated to the latest plugins, etc.
18:10:29 <nirik> also, tflink wanted to share:
18:10:46 <nirik> #info taskotron in stg/dev is going along fine, probibly won't be in production until after alpha now.
18:11:06 <nirik> any other applications news?
18:11:32 <threebean> #info we're almost at 10 million fedmsg messages all time
18:11:36 <threebean> https://apps.fedoraproject.org/datagrepper/
18:11:38 <threebean> :P
18:11:56 <banas> #info GlitterGallery release process planned to be started within this week
18:12:09 <nirik> threebean: nice. :)
18:12:13 <threebean> also, we started a scratch pad for discussing promoting FMN from an opt-in service to an opt-out service (for packagers)
18:12:15 <threebean> http://piratepad.net/FMN-opt-out
18:12:20 <lmacken> threebean: woot!
18:12:47 <lmacken> threebean: 10mil & 1TiB+ of traffic? :)
18:12:50 <nirik> do we need bodhi2 in place?
18:13:07 <threebean> optimizations pushed out last friday seem to have FMN keeping it's workload under control.
18:13:11 <threebean> .tiny https://admin.fedoraproject.org/collectd/bin/graph.cgi?hostname=notifs-backend01.phx2.fedoraproject.org;plugin=fedmsg;plugin_instance=hub;type=queue_length;type_instance=FMNConsumer_backlog;begin=-706400
18:13:12 <zodbot> threebean: http://tinyurl.com/le9xywt
18:13:35 <pjones> threebean: and *very* few of those are me trying to game badges
18:13:44 <threebean> pjones: almost none
18:13:58 <threebean> ;)
18:14:35 <lmacken> threebean: didn't the optimizations DoS FAS though?
18:15:22 <threebean> they did :/ (although only once at startup)
18:15:28 <nirik> we have been seeing fas servers complain the last few days...
18:15:36 <nirik> but I haven't had a chance to look at what might be causing it.
18:15:57 <threebean> many threads all trying to cache fas at the same time - fixed in git http://da.gd/BW60H
18:16:21 <threebean> nirik: I think only some of those are related to fmn, which should only be doing this at startup.
18:16:39 <threebean> I'd think the fas problems would have gone down due to the fedmsg-fasclient stuff pingou pushed out.. :/
18:16:54 <nirik> yeah. theres a lot less runs of that for sure.
18:17:06 <nirik> but something is still causing them to start swapping.
18:17:14 <nirik> more investigation needed
18:18:39 <nirik> we have also seen recently openvpn hitting cpu limits.
18:18:45 <nirik> might be related, not sure.
18:19:02 <lmacken> nirik: were you thinking that was related to outbound email?
18:19:18 <nirik> thought so at first, but I was looking at the wrong thing. ;)
18:19:20 <lmacken> i've been seeing a lot more fedmsg error spam, along with the usual batch of fedora-packages 200k mails :P
18:19:23 <lmacken> oh, okay.
18:19:36 <nirik> If you sniff the traffic on the openvpn tun device you only see traffic going to the node you are on.
18:19:50 <nirik> if you sniff the eth device you see all the vpn streams, etc.
18:20:05 <nirik> it looked last night like proxy02 was pushing a lot more than the others, but that might have just been at that time
18:20:33 <nirik> I'll keep trying to isolate it.
18:21:16 <nirik> Any other application news? :)
18:21:21 <lmacken> when do we freeze?
18:21:31 * threebean forgot about freeze
18:22:14 <nirik> when we have a viable test compose. ;)
18:22:16 <pjones> some time after dgilmore winds up having working trees for all the stuff.
18:22:23 <pjones> which is... taking a while.
18:22:24 <nirik> yeah... hopefully soon.
18:22:32 <nirik> so many bugs.
18:22:56 <pjones> nirik: looks like createImage is still busting on cloud-atomic?
18:23:03 <pjones> http://koji.fedoraproject.org/koji/taskinfo?taskID=7435960 <-- the screenshot is horrifying
18:23:11 <nirik> pjones: yeah, boggling. ;(
18:23:20 <pjones> but now I'm just interrupting
18:23:46 <nirik> we all like a good horror sideshow. ;)
18:24:00 <nirik> #topic Sysadmin status / discussion
18:24:06 <nirik> so, on the sysadmin side of things...
18:24:22 <nirik> #info retrace servers have been handed off to retrace folks. They are setting things up on them now.
18:24:32 <nirik> thanks to smooge for getting those all installed.
18:25:14 <nirik> There's the openvpn and fas hiccups we have been seeing, but we already mentioned those.
18:25:25 <lmacken> pjones: if only that used chained exceptions in py3, that screenshot might be a bit more useful :\
18:25:34 <nirik> We still have qa09 and virthost-comm03 to setup as new machines.
18:25:50 <pjones> lmacken: Eh, I think it's just saying the image the python loaded from is corrupt. I doubt the exception is meaningful at all.
18:26:03 <lmacken> pjones: yeah, looks like an exception in the exception handler :\
18:27:37 <nirik> there's also a bit of discussion ongoing about netapp space (since we are getting very dangerously full on our koji storage)
18:27:47 <nirik> hopefully there will be some good news there sometime soon.
18:27:58 <threebean> hm, turns out that our staging and production environments weren't firewalled off from one another anymore after the switch to ansible. that's fixed up now.
18:28:23 <nirik> oh yeah, thanks much for fixing that.
18:28:39 <threebean> np. things might behave funny while it shakes out.
18:28:50 <nirik> I've so far not seen any fallout from that...
18:28:56 <nirik> but that doesn't mean we won't hit some
18:30:11 <nirik> #info memcached is now setup to restart on exit and also is monitored in nagios
18:30:35 <nirik> #info staging and prod are now once again blocked from talking to each other via firewall rules.
18:31:28 <nirik> #topic nagios/alerts recap
18:31:33 <nirik> this should be fun this week...
18:31:37 * nirik digs up link
18:32:11 <nirik> .tiny https://admin.fedoraproject.org/nagios/cgi-bin//summary.cgi?alerttypes=3&displaytype=3&eday=15&ehour=24&emin=0&emon=5&esec=0&eyear=2014&host=all&hostgroup=all&hoststates=3&limit=25
18:32:11 <zodbot> nirik: http://tinyurl.com/ms5u5qm
18:32:17 <nirik> I think thats right.
18:32:59 <nirik> no, not right.
18:33:30 <nirik> .tiny https://admin.fedoraproject.org/nagios/cgi-bin//summary.cgi?report=1&displaytype=3&timeperiod=last7days&smon=8&sday=1&syear=2014&shour=0&smin=0&ssec=0&emon=8&eday=21&eyear=2014&ehour=24&emin=0&esec=0&hostgroup=all&servicegroup=all&host=all&alerttypes=3&statetypes=3&hoststates=7&servicestates=120&limit=25
18:33:31 <zodbot> nirik: http://tinyurl.com/koaopnc
18:34:03 <nirik> almost all of those are due to the vpn issue.
18:34:15 <nirik> the server hits 100% cpu and starts dropping packets.
18:35:00 <nirik> #topic Upcoming Tasks/Items
18:35:00 <nirik> https://apps.fedoraproject.org/calendar/list/infrastructure/
18:35:12 <nirik> anything upcoming folks would like to schedule or note?
18:35:18 <nirik> the freeze is hopefully coming up soon.
18:36:19 <nirik> Also, we are in the early planning stages for another FAD in december...
18:36:45 <nirik> https://fedoraproject.org/wiki/FAD_MirrorManager2_FAS3_2014
18:36:54 <adimania> would remote participation be possible?
18:36:59 <nirik> absolutely.
18:37:41 <adimania> cool. Flying is very expensive in my part of the world :(
18:37:42 <nirik> it's very subject to change right now, just sounding out if it will be possible and useful.
18:37:56 <threebean> just to note: i'm on some on-again, off-again vacation for the next two weeks. (I'll be around on tuesdays and thursdays)
18:38:10 <lmacken> yeah, there's a good chance that I won't be in Denver during that time
18:38:18 <nirik> threebean: cool.
18:38:24 <sart> I'm definitely interested if I'm not at work (or possibly if so but work is completely dead, which it may be that time of year... )
18:38:29 <lmacken> so a backup location is definitely needed
18:38:29 <nirik> lmacken: fun. :) So, we might want to pick another place or time
18:39:10 <nirik> yeah, we will come up with something. ;)
18:39:44 <nirik> Oh, I should mention that I am likely going to be traveling the latter part of september...
18:40:23 <nirik> #topic Open Floor
18:40:38 <nirik> anyone have anything for open floor? comments, suggestions, ideas, favorate cookies?
18:41:47 <lanica> mmm...cookies.
18:42:10 * adimania eats
18:42:16 <threebean> encrypted browser cookies?
18:42:16 * adimania nom nom nom
18:42:28 <nirik> chocolate chip. ;) Or molassass sugar. nom.
18:42:39 <sart> oatmeal butterscotch
18:43:46 <nirik> alrighty. If nothing else will close out in a minute... (to go find cookies)
18:44:11 <lanica> Good meeting...later all!
18:44:39 <nirik> Thanks for coming everyone. Lets all continue in #fedora-admin, #fedora-apps and #fedora-noc.
18:44:41 <nirik> #endmeeting