infrastructure
LOGS
19:00:04 <nirik> #startmeeting Infrastructure (2014-02-27)
19:00:04 <zodbot> Meeting started Thu Feb 27 19:00:04 2014 UTC.  The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:04 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:00:04 <nirik> #meetingname infrastructure
19:00:04 <nirik> #topic Greetings starfighter!
19:00:04 <nirik> #chair smooge relrod nirik abadger1999 lmacken dgilmore mdomsch threebean pingou puiterwijk
19:00:04 <zodbot> The meeting name has been set to 'infrastructure'
19:00:04 <zodbot> Current chairs: abadger1999 dgilmore lmacken mdomsch nirik pingou puiterwijk relrod smooge threebean
19:00:10 <abadger1999> howdy
19:00:34 <janeznemanic> hi
19:00:35 * adimania is here
19:00:41 * pingou 
19:00:44 * lmacken 
19:01:03 * threebean is here
19:01:05 <relrod> here
19:01:08 * ausmarton is here
19:01:19 * docent is here
19:01:25 * willo is here
19:01:32 <danofsatx-work> here
19:01:33 <nirik> morning everyone. ;)
19:01:43 <danofsatx-work> but not for long, need to reload F20 :(
19:01:48 <nirik> #topic New folks introductions and Apprentice tasks
19:01:54 <nirik> danofsatx-work: fun times. ;)
19:01:58 * kushalk124 is here
19:02:09 <nirik> any new folks like to introduce themselves? or apprentices with questions or comments?
19:02:22 <danofsatx-work> new system - old load isn't optimized for it, so I get to redo it, yet again....
19:03:55 <nirik> ok, moving along then... as always feel free to chime in with questions or comments anytime.
19:04:02 <nirik> #topic Applications status / discussion
19:04:14 <nirik> any application side news this week?
19:04:26 * mirek is here
19:04:26 <pingou> new fedocal in prod
19:04:32 <pingou> new nuancier in prod
19:04:33 * fchiulli is here.  Sorry for being late.
19:04:46 <pingou> (and new(er) nuancier in stg w/ fedmsg integration -- need to test this)
19:04:58 <nirik> pingou: cool. ;)
19:05:08 <mirek> those problems with copr (caused by createrepo_c) are hopefully solved
19:05:09 <abadger1999> Cool.
19:05:13 <nirik> pingou: this is full nuancier now right? not lite?
19:05:23 <pingou> nirik: yup :)
19:05:24 <pingou> all features included
19:05:30 <pingou> https://apps.fedoraproject.org/nuancier
19:05:39 <pingou> and with a nicer frontpage :)
19:05:48 <nirik> mirek: cool. So it was createrepo_c sucking up all memory? or ?
19:05:50 <pingou> mirek: nice!
19:06:21 <mirek> nirik: yes, I even saw a process with 10GB RAM.
19:06:29 <pingou> and with threebean we pushed some commits to summershum (support .gem, more info in the logs and on fedmsg)
19:06:42 <nirik> thats a pile. ;(
19:06:43 <mirek> I found the cause and gave it to upstream with a reproducer
19:07:16 <lmacken> once createrepo_c is semi-stable, getting mash to use it could be a great easyfix task
19:07:16 <nirik> mirek: I should have arm socs for you before too long... need to get dhcpd to not mess up the cloud dhcp and setup a pxe server to install them, etc.
19:07:42 * mirek is happy
19:07:51 <nirik> lmacken: theres still stuff missing tho I think... no deltarpms?
19:08:11 <pingou> (no deltarpms was mentioned at devconf)
19:08:16 <lmacken> nirik: ah, yeah I haven't looked at it too closely, but that's a blocker for sure :)
19:09:02 <nirik> mirek: those ansible module issues you ran into are really weird. it's like something is modifying your pythonpath, but only sometimes?
19:10:24 <mirek> yes, it really puzzles me, I want to spend some time on it, but today it happened on prod so I was in a hurry to get it back online
19:10:34 <nirik> sure, understand
19:11:32 <nirik> oh, I had one thing to note...
19:11:56 <nirik> a while back puiterwijk got our reviewboard on fedorahosted back up and running
19:12:14 <nirik> I keep not having time to poke around on it more... but we should see if it's usable for us for any needs...
19:12:18 <nirik> https://fedorahosted.org/reviewboard/dashboard/
19:12:29 <nirik> it is much faster than before...
19:12:57 <pingou> not accessible w/o fas account?
19:13:03 <sgallagh> nirik: I can assist with administration if there are questions
19:13:09 <pingou> ah, it's the dashboard link
19:13:10 <nirik> pingou: openid
19:13:15 <threebean> nice
19:13:18 <sgallagh> pingou: https://fedorahosted.org/reviewboard/r/ is available directly
19:13:28 <sgallagh> So you can read but not edit.
19:13:45 <sgallagh> puiterwijk elected to make the login page automatically bounce to OpenID.
19:13:49 <nirik> oh reminds me I need to file a bug on the openid part...
19:14:30 <nirik> it makes a local FirstnameLastname user for reviewboard.
19:14:47 <nirik> but... it can't handle users with neat utf8 stuff in name. ;)
19:15:06 <pingou> ^^
19:15:23 <nirik> I'm sure we are shocked. ;)
19:15:35 * pingou looks at abadger1999
19:15:49 <sgallagh> I personally wish he'd just elected to mangle the openid for the username
19:15:56 <nirik> yeah, seems easier.
19:16:00 <sgallagh> sgallagh-id-fedoraproject-org would have worked better.
19:16:20 <abadger1999> <nod>
19:16:34 <sgallagh> And guaranteed not to be overloaded if we have two John Smiths
19:17:00 <nirik> anyhow, I know we have github for many application reviewing needs, but if it's nice enough we could look at it for ansible changes during freeze or the like.
19:17:03 <sgallagh> That's fixable. Please CC me on the bug report.
19:17:17 <nirik> sgallagh: where's the best place to file?
19:17:24 <sgallagh> nirik: FWIW, I'm working on Git hooks to be able to manage pull requests through Review Board.
19:17:42 <sgallagh> So you get the nice review UI of RB alongside the process management of github
19:17:52 <nirik> cool.
19:17:57 <sgallagh> nirik: Just use the Infra trac for instance-specific ones
19:18:02 <nirik> k
19:18:39 <nirik> ok, any other application news?
19:18:47 <threebean> kinda application-y:
19:19:04 <threebean> pushed out a nice error logging config for fedmsg this morning -> http://infrastructure.fedoraproject.org/cgit/ansible.git/commit/?id=1b875b543fa414c0971941b3bb2951e7523035aa
19:19:17 <pingou> it's *nice*!
19:19:21 <threebean> so, we'll get error emails from the badges awarder and the notifications daemon.  from summershum too.
19:19:39 <nirik> oh nice. these are when it can't send? or ?
19:19:52 <threebean> well, whenever log.error('blah blah') is called.
19:19:59 <threebean> so its up to each app to catch its own problems and log them.
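The behaviour threebean describes (an email whenever `log.error(...)` is called) can be sketched with the Python standard library. This is a minimal illustration and not the fedmsg configuration from the linked commit; the mail host, addresses, subject, and logger name are all hypothetical placeholders.

```python
import logging
import logging.config
import logging.handlers


def build_logging_config(mailhost, fromaddr, toaddrs, subject):
    """Return a dictConfig-style dict that mails ERROR-and-above records."""
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "plain": {"format": "%(asctime)s %(name)s %(levelname)s %(message)s"},
        },
        "handlers": {
            # Everything at INFO and above goes to the console.
            "console": {
                "class": "logging.StreamHandler",
                "formatter": "plain",
                "level": "INFO",
            },
            # Only ERROR and above is handed to the SMTP handler.
            "mailer": {
                "class": "logging.handlers.SMTPHandler",
                "formatter": "plain",
                "level": "ERROR",
                "mailhost": mailhost,
                "fromaddr": fromaddr,
                "toaddrs": toaddrs,
                "subject": subject,
            },
        },
        "root": {"level": "INFO", "handlers": ["console", "mailer"]},
    }


# Hypothetical values, for illustration only.
config = build_logging_config(
    mailhost="localhost",
    fromaddr="app@example.org",
    toaddrs=["app-logs@example.org"],
    subject="[example-app] error",
)
logging.config.dictConfig(config)
log = logging.getLogger("example_app")
# From here on, log.error("...") would be passed to the SMTP handler,
# while log.info("...") only reaches the console.
```

As noted in the discussion, this only reports what the application itself catches and logs; an uncaught problem that never reaches `log.error` produces no mail.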
19:20:16 <pingou> pkgdb2, fedocal and nuancier also send emails
19:20:31 <pingou> I was wondering if we should create an alias to receive these emails
19:20:40 <nirik> yeah, where do they go now?
19:20:51 <threebean> (the fedmsg ones go to sysadmin-datanommer-members@fp.o)
19:20:53 <pingou> pkgdb2, fedocal and nuancier to me (only)
19:20:56 <nirik> we do have sysadmin-logs, but thats more sysadminy than applicationy
19:21:14 <pingou> sysapp-logs? :D
19:21:40 * pingou doesn't dare to propose appy-logs
19:21:48 <threebean> we had exceptions from fedora-packages coming to lmacken and I for a while.. but there were just too many.
19:22:05 <nirik> a group is often nice because it's easy to manage who's in it, etc.
19:22:14 <nirik> no aliases to change, etc
19:22:18 <pingou> +1
19:23:01 <pingou> on the app side, I've been working a little on FAS3 today https://github.com/fedora-infra/fas/pull/56
19:23:03 <threebean> could we create an alias for each app so you don't have to choose the firehose or nothing?
19:23:28 <nirik> threebean: we could, but if they are aliases, that means updating them via puppet (or ansible) and more pain in freezes, etc.
19:23:35 <pingou> maybe use gitproject as for fedorahosted?
19:23:36 * threebean nods
19:24:35 <nirik> how about: fedmsglogs-applicationname? just tracking groups
19:24:57 <pingou> wfm too
19:25:18 <nirik> the git ones might be folks who dont want our specific error logs
19:25:32 <threebean> oo, wait.   I'm not sure how to distinguish fedmsglogs between applications. :/
19:26:10 <pingou> then just <app>-logs, fedmsg-logs being just one of them
19:26:14 * threebean nods
19:26:29 <nirik> sure.
19:27:06 <nirik> ok, any other apps news? ;)
19:27:30 <nirik> #topic Sysadmin status / discussion
19:27:43 <nirik> on the sysadmin side, smooge and I have been having fun with download servers.
19:27:57 <nirik> turns out the load on them has been at least part of the thing slowing our netapp storage down. ;(
19:27:59 <smooge> download download
19:28:29 <nirik> we tried cachefilesd the other day, but it made the machines unstable sadly.
19:28:42 <nirik> so, now we are limiting rsyncs per download server
19:28:57 <nirik> we also have an iptables hashlimit to limit ips that hit rsync too much
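The per-IP limiting mentioned here can be pictured as one token bucket per source address. The sketch below is a toy model of that idea in Python, not the kernel's hashlimit implementation and not the rule actually deployed; the class name, rate, and burst values are assumptions chosen for illustration.

```python
import time
from dataclasses import dataclass


@dataclass
class Bucket:
    tokens: float   # tokens currently available to this source
    updated: float  # time of the last refill, in seconds


class PerSourceLimiter:
    """Token-bucket limiter keyed by source IP (hypothetical helper)."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec
        self.burst = burst
        self.buckets = {}

    def allow(self, source_ip, now=None):
        now = time.monotonic() if now is None else now
        bucket = self.buckets.get(source_ip)
        if bucket is None:
            # A new source starts with a full burst allowance.
            bucket = Bucket(tokens=float(self.burst), updated=now)
            self.buckets[source_ip] = bucket
        # Refill according to elapsed time, capped at the burst size.
        elapsed = now - bucket.updated
        bucket.tokens = min(self.burst, bucket.tokens + elapsed * self.rate)
        bucket.updated = now
        if bucket.tokens >= 1.0:
            bucket.tokens -= 1.0
            return True
        return False


limiter = PerSourceLimiter(rate_per_sec=0.5, burst=3)
# One address connecting five times at the same instant.
first_burst = [limiter.allow("192.0.2.10", now=0.0) for _ in range(5)]
# A different address is tracked separately.
other_source = limiter.allow("192.0.2.20", now=0.0)
# Two seconds later the first address has earned one token back.
after_wait = limiter.allow("192.0.2.10", now=2.0)
```

A mirror that syncs once every several days never gets near the limit in this model; only sources that reconnect faster than the refill rate are refused.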
19:29:00 <mirek> how big is the traffic (or data transfers)
19:29:02 <mirek> ?
19:29:19 * danofsatx-work makes a note to alter his rsync scripts
19:29:37 <nirik> in bytes/packets? a lot. ;)
19:29:53 <nirik> we did have some IPs hitting 100s of times a day
19:30:20 <danofsatx-work> for the record, that wasn't me ;) I hit it once every 7-10 days
19:30:31 <nirik> we are likely going to be moving storage for them next week.
19:31:07 <willo> i'll be making progress on migration of those servers to ansible this weekend
19:31:10 <nirik> looks like around 10TB a day or so as a ballpark
19:31:43 <nirik> perhaps 15
19:31:47 <nirik> willo: great. ;)
19:32:03 <adimania> migration of paste module to ansible should be good to go.
19:32:21 <adimania> I'll pick up another one this weekend probably.
19:32:22 <nirik> adimania: thanks for working on it. ;)
19:32:40 <adimania> nirik, thanks for all the help :)
19:32:52 <pingou> I wonder if we should track a list of remaining modules to port to ansible?
19:33:27 <adimania> pingou, that would be really helpful.
19:33:33 <nirik> pingou: we could start doing that yeah... it's a bit of a mess tho due to puppet having old junk in it that we arent actually using anymore.
19:33:47 <nirik> like for example I think talk.fedoraproject.org/asterisk module is still there.
19:34:07 <nirik> but we could perhaps list machines in puppet only and extrapolate ?
19:34:28 <willo> track on a wiki page maybe?
19:34:54 <smooge> I would go with a trac wiki page :)
19:35:12 <smooge> sorry my humour is off. rebooting
19:35:16 <nirik> we could. would someone like to write up at least part of such a thing? I'd be happy to edit it and add info, etc.
19:35:22 <nirik> smooge: :)
19:35:55 <willo> i'll take a stab
19:36:18 <nirik> willo: cool!
19:36:33 <nirik> lets see...
19:36:34 <willo> i'll email list when outline is done for input
19:36:50 <nirik> #info download servers and netapp i/o has been a big issue this week. ongoing.
19:36:53 <nirik> willo: sounds great.
19:37:09 <nirik> #info more puppet -> ansible conversions are ready to go
19:37:31 <nirik> I have one of our arm chassis up in the cloud network, just need to get dhcpd working and pxe server to install them...
19:38:01 <nirik> Oh, I gave a talk to boulder devops monday night on ansible. My slides are at:
19:38:25 <nirik> http://fedorapeople.org/~kevin/ansible-20140224.odp
19:38:42 <nirik> for anyone who wants them. Not sure how much sense they make without me gibbering over them, but there they are. ;)
19:38:49 <smooge> cool
19:39:10 <willo> so no vid of the gibbering for posting to youtube   ;)
19:39:11 <nirik> we have some new machines arriving tomorrow (I think)
19:39:24 <nirik> willo: sadly no, they are looking for a a/v person, but didn't have one.
19:39:51 <mirek> I have one idea... write something like rbac-playbook but for cloud, so sysadmin-cloud would be able to run euca-* and nova commands. Can someone send me the current source of rbac-playbook so I can base it on that, please? I just found some old version on seth's site
19:40:40 <nirik> mirek: not a bad idea...
19:41:04 <nirik> we aren't setup to run nova commands from there... but the euca ones would work after you source a eucarc...
19:41:20 <nirik> wonder if that is possible to just do in sudo?
19:41:41 <nirik> ie "source this, then run command" ?
19:41:57 <mirek> source is not command
19:42:02 <mirek> it is bash internals
19:42:32 <nirik> yeah, but it looks like sudo you can pass a env_file for this.
19:42:33 <smooge> I normally write a shell script if I have to source stuff
19:42:33 <mirek> but it can be very similar to rbac-playbook, and easy, I just want to reuse some recent code
19:43:03 <mirek> http://skvidal.fedorapeople.org/misc/rbac-playbook I find just this
19:43:15 <nirik> right, I can send you the current one...
19:43:24 <nirik> it's pretty primitive tho.
19:43:29 <mirek> thanks
19:43:31 <nirik> for example, command line args aren't supported.
19:43:41 <nirik> which may break it for ec2 stuff.
19:43:47 <mirek> I will keep it primitive for sure :)
19:44:24 <nirik> ok, feel free to look, but I think env_file with the ec2rc and allowing euca* might be easier...
19:44:43 <nirik> thats just a change to sudoers
19:45:04 <nirik> or... hum.
19:45:38 <nirik> what if we add a acl to the ec2rc file to allow sysadmin-cloud to read it. Then you can just source it and run commands as you. They shouldn't need any privs
19:46:07 <nirik> it would mean all sysadmin-cloud folks would have the credentials
19:46:42 <mirek> ahh chacl(1) yes, that should work
19:47:01 <nirik> anyhow, will ponder on it and try and get something that works. ;)
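The "source this, then run command" idea from the exchange above can be sketched as a small wrapper that reads the variables from a credentials file and passes them to a child process. This is only an illustration under stated assumptions: it is not rbac-playbook, the file contents and variable names are hypothetical, and the parser handles literal `export KEY=value` lines only, so a real eucarc that relies on shell expansion would need more than this.

```python
import os
import shlex
import subprocess
import sys
import tempfile


def load_rc_file(path):
    """Parse simple `export KEY=value` or `KEY=value` lines into a dict.

    Shell expansions such as $(...) or ${VAR} are not evaluated.
    """
    env = {}
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#"):
                continue
            if line.startswith("export "):
                line = line[len("export "):]
            key, sep, value = line.partition("=")
            if not sep:
                continue
            # shlex strips surrounding quotes the way a shell would.
            parts = shlex.split(value)
            env[key.strip()] = parts[0] if parts else ""
    return env


def run_with_rc(rc_path, argv):
    """Run argv with the rc file's variables added to the environment."""
    env = dict(os.environ)
    env.update(load_rc_file(rc_path))
    return subprocess.run(argv, env=env, check=True,
                          capture_output=True, text=True)


with tempfile.TemporaryDirectory() as tmp:
    rc_path = os.path.join(tmp, "examplerc")
    with open(rc_path, "w", encoding="utf-8") as handle:
        handle.write("# hypothetical credentials file\n")
        handle.write("export EC2_URL='http://cloud.example.org:8773/services/Eucalyptus'\n")
        handle.write("export EC2_ACCESS_KEY=not-a-real-key\n")
    parsed = load_rc_file(rc_path)
    # Stand-in for a euca-* command: a child that prints one variable.
    child = [sys.executable, "-c",
             "import os; print(os.environ['EC2_ACCESS_KEY'])"]
    output = run_with_rc(rc_path, child).stdout.strip()
```

Whichever route is taken (a wrapper, sudo's env_file, or an ACL on the file), anyone allowed to run the commands can effectively read the credentials, which is the trade-off nirik points out.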
19:47:28 <nirik> anything else sysadmin related?
19:47:31 * lbazan here late..
19:48:40 <nirik> #topic Upcoming Tasks/Items
19:48:41 <nirik> https://apps.fedoraproject.org/calendar/list/infrastructure/
19:48:55 <nirik> anything upcoming folks would like to note or schedule?
19:49:14 <nirik> I still haven't done much on FAD organizing. Hopefully more news by next week
19:49:24 <threebean> nirik: same here
19:50:03 <nirik> #topic Open Floor
19:50:17 <pingou> I have been playing with the lookaside cache
19:50:19 <nirik> anyone have anything for open floor? questions, comments, ideas, favorite pies?
19:50:29 <pingou> I was wondering how often we have 1 tarball with multiple md5
19:50:36 <pingou> the results are interesting: http://paste.fedoraproject.org/80881/39350540/
19:50:57 <nirik> wow.
19:51:07 <pingou> but that on all the current tree, so there are some old versions in there
19:51:10 <nirik> I wonder if thats indicative of uploads that fail...
19:51:25 <nirik> or if it's upstreams that change stuff and re-release.
19:51:34 <pingou> I'm afraid it's the latter
19:51:36 <threebean> holy..
19:51:53 <pingou> I'm not sure yet what to do with this, mail on devel, blog post?
19:52:13 <pingou> maybe it might be worth asking people to watch out for this
19:52:18 <nirik> I wonder if we could find out more by looking at commits on those spec files?
19:52:21 <pingou> accidents happen but...
19:52:24 <smooge> how do they get 2 different md5s?
19:52:28 <mirek> What does that mean? E.tgz has 10 md5 sums -- does that mean that 10 packages have the same tar.gz?
19:52:35 <pingou> smooge: two different tarballs with the same name
19:52:44 <pingou> mirek: yup
19:52:47 <smooge> ah ok.
19:52:51 <nirik> it means you upload foo-1.0.tar.gz
19:53:00 <nirik> then upload it again, but with a different md5
19:53:05 <smooge> ahhhhh
19:53:20 <pingou> http://pkgs.fedoraproject.org/lookaside/pkgs/389-admin/389-admin-1.1.12.tar.bz2/ for example
19:53:23 <misc> either upstream did it, which is bad
19:53:30 <smooge> so I guess a timestamp,md5sum would be needed
19:53:42 <misc> or someone did regenerate the tarball from git, this kind of stuff
19:53:42 <nirik> misc: yeah, but does sadly happen
19:53:49 <pingou> smooge: we kinda have the timestamp on the apache page ;-)
19:53:57 <misc> or someone modified the tarball, cause patch is too mainstream :)
19:54:03 <pingou> misc: regenerate the tarball w/o renaming it
19:54:25 <nirik> I wonder, could we grab all those, then unpack and diff -Nur on them to see how they are different? I guess so, but might take a long time to figure out all of them.
19:54:38 <pingou> 5569 packages had multiple md5 for at least 1 of their versions
19:54:43 <pingou> might be a little much :)
19:54:58 <misc> nirik: skip texlive, this will reduce the time to see :)
19:55:01 <pingou> nirik: smooge but that's the output from my request from yesterday (install tree on pkgs01)
19:55:04 <pingou> :)
19:55:15 <smooge> so what is the problem? the build system grabs the wrong one? we are worried people are uploading different ones
19:55:36 <pingou> smooge: the build system will grab whatever is in the sources file, so we should be fine there
19:55:36 <nirik> pingou: I'd say devel list I guess. Ask people if they are hitting upload issues (which we could try and fix) or other?
19:55:49 <pingou> it's more about packager/upstream behavior
19:55:52 <nirik> well, if it's a upload issue, we should try and fix it.
19:56:01 <nirik> if it's a upstream issue, we should be very sad, but ok.
19:56:09 <nirik> if it's a packager issue, we should tell them not to do that. ;)
19:56:21 * pingou is on the list
19:56:43 <smooge> pingou, but the source file says 389-admin-1.1.12.tar.bz2 and the lookaside cache has 3 of them
19:56:58 <nirik> smooge: the sources file has md5 too
19:57:03 <pingou> smooge: 3 different md5, and the source file has the md5
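The survey pingou describes (one tarball name stored under several md5 sums) can be sketched as a directory walk. This is not pingou's script; it assumes a lookaside layout of `<root>/<package>/<tarball>/<md5>/<tarball>`, which matches the shape of the URL quoted above but is otherwise an assumption, and the demo tree and checksum names are made up.

```python
import os
import tempfile
from collections import defaultdict


def find_multi_checksum_tarballs(root):
    """Return {(package, tarball): [checksums]} for tarballs with more than one."""
    seen = defaultdict(set)
    for package in sorted(os.listdir(root)):
        package_dir = os.path.join(root, package)
        if not os.path.isdir(package_dir):
            continue
        for tarball in sorted(os.listdir(package_dir)):
            tarball_dir = os.path.join(package_dir, tarball)
            if not os.path.isdir(tarball_dir):
                continue
            for checksum in os.listdir(tarball_dir):
                # Count a checksum only if the tarball really sits under it.
                if os.path.isfile(os.path.join(tarball_dir, checksum, tarball)):
                    seen[(package, tarball)].add(checksum)
    return {key: sorted(sums) for key, sums in seen.items() if len(sums) > 1}


def _add_upload(root, package, tarball, checksum, payload):
    """Create one fake upload in the assumed layout (demo helper)."""
    target = os.path.join(root, package, tarball, checksum)
    os.makedirs(target, exist_ok=True)
    with open(os.path.join(target, tarball), "wb") as handle:
        handle.write(payload)


with tempfile.TemporaryDirectory() as root:
    # Placeholder checksum names; a real tree would use md5 hex digests.
    _add_upload(root, "foo", "foo-1.0.tar.gz", "aaaa", b"first upload")
    _add_upload(root, "foo", "foo-1.0.tar.gz", "bbbb", b"re-rolled upload")
    _add_upload(root, "bar", "bar-2.0.tar.gz", "cccc", b"only upload")
    duplicates = find_multi_checksum_tarballs(root)
```

Because the sources file records the md5 alongside the file name, the build still fetches exactly one of the copies; the walk only shows how often a name was reused.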
19:57:07 <smooge> duh
19:57:09 <smooge> thanks
19:57:11 <pingou> :)
19:57:19 <smooge> I deal with budgets and PPC for a week
19:57:31 <pingou> you're in pretty good shape then! :D
19:57:32 <smooge> well mostly budgets
19:57:34 <nirik> anyhow, perhaps devel list and hope we can get folks interested in investigating more so we don't have to?
19:57:46 <pingou> nirik: ok we'll do that :)
19:58:04 <nirik> crowdsource all the things! :)
19:58:26 <pingou> nirik: but I doubt texlive are bad upload, I'm pretty sure they are small sources :)
19:58:46 <nirik> oh... I wonder if that data would be nice too... size ?
19:58:53 <pingou> ?
19:59:02 <nirik> because if there are 4 of them and 3 of them are really small, it sounds like an upload problem?
19:59:19 <smooge> i was thinking spec file
19:59:22 <nirik> if all are close to the same size, it sounds more like upstream re-released or packager messed up
19:59:56 <nirik> pingou: so, for each of those on your list, a 'ls -l' of the same md5sum one?
20:00:09 <nirik> ls -lR
20:00:13 <smooge> actually even then it could be a bad upload. We had someone complaining a while back and it turned out to be a bad proxy in front
20:00:31 <nirik> sure, but it might give some more indications.
20:00:38 <pingou> nirik: good idea
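nirik's size heuristic (a copy much smaller than its siblings hints at a failed upload, similar sizes hint at a re-release or packager change) can be sketched as follows. The 0.5 ratio is an arbitrary assumption, the directory layout is the same assumed `<tarball>/<md5>/<tarball>` shape, and, as smooge notes, a similar size does not rule out a bad upload.

```python
import os
import tempfile


def sizes_for_tarball(tarball_dir, tarball):
    """Map each checksum directory to the size of the tarball stored in it."""
    sizes = {}
    for checksum in os.listdir(tarball_dir):
        candidate = os.path.join(tarball_dir, checksum, tarball)
        if os.path.isfile(candidate):
            sizes[checksum] = os.path.getsize(candidate)
    return sizes


def classify_sizes(sizes_by_checksum, small_ratio=0.5):
    """Flag copies smaller than small_ratio times the largest copy.

    Expects a non-empty mapping; the threshold is a guess, not a rule.
    """
    largest = max(sizes_by_checksum.values())
    suspicious = sorted(
        checksum
        for checksum, size in sizes_by_checksum.items()
        if size < small_ratio * largest
    )
    if suspicious:
        return "possible truncated upload", suspicious
    return "similar sizes", []


with tempfile.TemporaryDirectory() as tarball_dir:
    tarball = "foo-1.0.tar.gz"
    # Placeholder checksum names with payloads of 1000, 10 and 990 bytes.
    for checksum, length in (("aaaa", 1000), ("bbbb", 10), ("cccc", 990)):
        os.makedirs(os.path.join(tarball_dir, checksum))
        with open(os.path.join(tarball_dir, checksum, tarball), "wb") as handle:
            handle.write(b"x" * length)
    sizes = sizes_for_tarball(tarball_dir, tarball)
    verdict = classify_sizes(sizes)
    verdict_without_small = classify_sizes({"aaaa": 1000, "cccc": 990})
```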
20:01:38 <nirik> ok, if nothing else, will close out in a minute.
20:02:03 <willo> quick update from me
20:02:28 <nirik> willo: sure, whats up?
20:02:42 <willo> I'm about half way through collating a list of networks before I start on the diagrams
20:03:10 <danofsatx-work> willo: sorry I dropped off on helping you with this - school became harder than I anticipated for this semester.
20:03:22 <nirik> cool. please do ask me if you have questions.
20:03:29 <willo> i'll have something shortly to get input on assumptions
20:03:42 <willo> assumptions about purpose etc
20:03:46 <danofsatx-work> but things are stabilizing, so feel free to ping me for anything
20:03:47 <willo> nirik: no prob
20:04:00 <willo> danofsatx-work: no probs will do
20:04:18 <nirik> great. ;)
20:04:36 <willo> listing it out in spreadsheet and i'll stick it up on fedorapeople and ping mailing list
20:04:48 <nirik> ok, thanks for coming everyone! Lets get back to it in #fedora-admin, #fedora-apps, #fedora-noc.
20:04:55 <nirik> willo: sounds goodly.
20:05:11 <nirik> #endmeeting