infrastructure
LOGS
19:00:00 <nirik> #startmeeting Infrastructure (2013-02-14)
19:00:00 <zodbot> Meeting started Thu Feb 14 19:00:00 2013 UTC.  The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:00 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:00:01 <nirik> #meetingname infrastructure
19:00:01 <zodbot> The meeting name has been set to 'infrastructure'
19:00:01 <nirik> #topic greetings and felicitations
19:00:01 <nirik> #chair smooge skvidal CodeBlock ricky nirik abadger1999 lmacken dgilmore mdomsch threebean
19:00:01 <zodbot> Current chairs: CodeBlock abadger1999 dgilmore lmacken mdomsch nirik ricky skvidal smooge threebean
19:00:08 * skvidal is here ish
19:00:23 * puiterwijk 
19:00:29 * lmacken ish
19:02:06 <nirik> ok, lets go ahead and dive in then
19:02:08 <nirik> #topic New folks introductions and Apprentice tasks.
19:02:16 <nirik> any new folks around? or apprentices with questions?
19:02:21 <knesenko> Hi all
19:02:26 <knesenko> I am new here
19:02:30 * pingou 
19:02:48 <smooge> is here
19:02:52 * abadger1999 here
19:02:56 * Adran is here
19:03:04 <Adran> nirik: you answered my questions at least most of them this morning though. :)
19:03:11 <nirik> welcome knesenko. Care to introduce yourself? Are you more interested in sysadmin or application devel stuff?
19:03:16 <nirik> Adran: cool. ;)
19:03:21 <Ramesh_> hi, i am new here
19:03:52 <nirik> welcome Ramesh_
19:04:16 <knesenko> Hi all. My name is Kiril. I have good exp. in Linux systems, RPM packaging etc .. . I am interested to maintain koji build systems.
19:04:57 <Ramesh_> Thanks nirik, i was in the infrastructure group before but due to academic load i was not able to contribute much
19:05:06 <knesenko> I sent a "Hello" email to the infra email list last week.
19:05:26 <nirik> knesenko: welcome. not sure how much work the buildsys needs, but I'm sure we can try and find you something to work on...
19:05:44 * ianweller here
19:05:45 <nirik> Ramesh_: no worries. ;) are you more interested in sysadmin or application devel?
19:05:46 <knesenko> nirik: sysadmin is good as well :)
19:05:48 <pingou> copr?
19:06:04 * skvidal +1's that
19:06:11 <skvidal> we could use people working on copr
19:06:17 * nirik nods. :)
19:06:29 <skvidal> knesenko: it isn't koji - but it is a different buildsys we're working on
19:06:41 <skvidal> http://fedorahosted.org/copr
19:06:54 <Ramesh_> yeah i am interested in system admin stuff
19:06:55 <skvidal> knesenko: if you're interested  - come by #fedora-apps
19:07:07 <skvidal> cccccccbjhtnjvrvvflvcvngijtcvrefrucenfibfvkb
19:07:11 <skvidal> hey look - my yubikey fired
19:07:12 <Adran> skvidal: cat?
19:07:15 <nirik> Ramesh_: cool. See me after the meeting in #fedora-admin and we can get you started. ;)
19:07:19 <Adran> oh even better, skvidal
19:07:33 <knesenko> skvidal: everything that is related to build systems I am interested :)
19:07:40 <Ramesh_> nirik: cool :)
19:07:46 <skvidal> knesenko: great - then take a look at what copr is for
19:07:50 <pingou> knesenko: then you will like copr :)
19:07:52 <skvidal> knesenko: and come by and talk to us if you're interesting
19:07:53 <nirik> excellent. Any other new folks or general questions?
19:07:58 <skvidal> s/intetresting/interested/
19:08:12 <knesenko> skvidal: pingou np 10x
19:08:31 <nirik> #topic Applications status / discussion
19:08:41 <nirik> ok, any applications news this week or upcoming?
19:08:49 <pingou> I've started to work on a refresh of pkgdb
19:08:49 <puiterwijk> yeah, three openid related ones from me
19:09:01 <pingou> the db scheme will change and is simplified
19:09:40 <nirik> #info pingou working on a pkgdb update with db schema changes
19:09:49 <puiterwijk> as you might have read, we have authopenid for trac working now
19:10:05 <nirik> puiterwijk: we need to package that up still right?
19:10:08 <puiterwijk> (Seth sent an email out)
19:10:16 <puiterwijk> nirik: yeah, will do so later today or tomorrow
19:10:32 <nirik> #info authopenid testing with trac seems to show it works.
19:10:38 <abadger1999> I think we've finally gotten rid of all the bugs in the python-fedora+python-requests update.  But we're running one hotfix in infrastructure related to that.  Unless we find more serious bugs, I'll hold off on another upstream release until after fedora/epel6 get the update that's currently pending.
19:11:00 <pingou> abadger1999: nice
19:11:02 <puiterwijk> second OpenID related item: flask_fas_openid base is done, I only need to implement one last extension (CLA), and then we can start moving over Flask apps to FAS-OpenID
19:11:18 * abadger1999 thanks threebean for tracking down a solution for that last problem.
19:11:23 <nirik> abadger1999: is that one hotfix something others will hit when the current update goes stable? or it's likely only something that matters to us?
19:11:50 <abadger1999> nirik: Yes, but I'm hoping it won't be as bad for them.  It's a *severe* performance regression.
19:11:57 <nirik> #info flask-fas-openid base is almost ready for use. Flask apps can then use fas-openid
19:12:04 <tflink> as a heads up, the new API for bugzilla has been deployed to partner-bugzilla. the new python-bugzilla (git master HEAD) is needed to interface with the API changes and has changed a bit
19:12:20 * tflink sent an email to infrastructure@ but figured it would make sense to mention here
19:12:28 <nirik> tflink: yeah. ;( we need to test our stuff...
19:12:30 <abadger1999> If you download a large json dataset from one of our apps, then python-requests attempts to detect the character encoding which will take a very long time.
19:12:49 <puiterwijk> also, FAS-OpenID 0.5 will get into staging too when I have the CLA extension working, and when it is I will request testing on the infra mailing list
19:13:09 <tflink> it broke a bunch of stuff in the blocker tracking app but I needed to refactor that code anyways
19:13:15 <nirik> #info partner-bugzilla has been updated. git HEAD python-bugzilla needed. We need to test all our bugzilla using applications against the new versions.
19:13:18 <abadger1999> We're mostly the ones that are doing that since we have cron jobs/supybot that consume data about all of our packages/users/etc.
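
The slowdown abadger1999 describes comes from character-set detection on large response bodies. One common way to sidestep it, sketched below, is to decode the raw bytes with a known encoding instead of letting the library guess. This is only an illustration and not necessarily the hotfix mentioned in the meeting: the helper name is hypothetical, the response is assumed to be a requests-style object exposing the body as `content` (bytes), and UTF-8 is an assumption about what the apps emit.

```python
import json
from types import SimpleNamespace


def load_json_without_detection(response, encoding="utf-8"):
    """Decode a JSON body from raw bytes, skipping charset auto-detection.

    `response` is assumed to be a requests-style object that exposes the
    raw body as `response.content` (bytes). The encoding is an assumption.
    """
    return json.loads(response.content.decode(encoding))


# Stand-in for a real response so the sketch runs without network access.
fake_response = SimpleNamespace(content=b'{"packages": ["kernel", "yum"]}')
data = load_json_without_detection(fake_response)
print(data["packages"])
```

With a real requests response, setting `response.encoding` explicitly before reading `response.text` should have a similar effect, since detection only runs when no encoding is known.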
19:13:55 <nirik> tflink: thanks for the heads up
19:14:10 * relrod here, late.
19:14:21 <pingou> what about a pre-release python-bugzilla
19:14:28 <tflink> nirik: np, we still don't know when the changes are going to be pushed to production, though
19:14:35 * tflink is waiting to hear back on that
19:14:46 <nirik> pingou: we should make one or ask for one, yeah
19:14:56 <tflink> pingou: I asked for one, should be done today sometime
19:15:20 <pingou> tflink: nice!
19:15:24 <pingou> thanks
19:15:34 <nirik> excellent.
19:16:01 <threebean> askbot in stg should be good to go.
19:16:12 <nirik> threebean: cool. when does it message?
19:16:21 <puiterwijk> oh, and another request with respect to FAS-OpenID: if you tested it, please report at least failures, but also successes would be nice
19:16:24 <nirik> asking new questions, proposed answers?
19:16:35 <nirik> #info feedback wanted on FAS-OpenID
19:16:47 <nirik> #info askbot in stg is fedmsg aware now
19:16:51 <threebean> nirik: new question, proposed answer, and a few others (flagged messages as offensive)
19:17:04 <nirik> ok
19:17:44 <nirik> ok, any other applications news?
19:17:58 <pingou> I need to move on with fedocal
19:18:03 <nirik> abadger1999: when did you want to move that pkgdb update? :)
19:18:12 <pingou> I'll have the spec file ready by this week-end
19:18:32 <nirik> pingou: cool. Happy to help you with review, etc
19:18:35 <abadger1999> nirik: I'm debating waiting for pingou's schema changes to land now.
19:18:47 <nirik> ok
19:18:54 <pingou> abadger1999: no go ahead
19:19:02 <abadger1999> nirik: It's pretty close and it would be nice to bring the db size down.... Should make backups faster :-)
19:19:09 <pingou> abadger1999: that retires the apps part and we already clean up the db a bit
19:19:19 <nirik> ok.
19:19:23 <abadger1999> and I've got plenty of other releases/fedora spec file cleanups to work on :-)
19:19:30 <abadger1999> pingou: ah.  Okay.
19:19:48 <abadger1999> nirik: tentatively end of next week, then.
19:19:48 <pingou> abadger1999: I see the changes I'm doing atm more as being for a new(er) version of pkgdb
19:20:06 <abadger1999> k
19:20:09 <nirik> ok, we don't need an exact schedule right now... just enough in advance so we can schedule any outages.
19:20:17 <abadger1999> <nod>
19:20:30 <nirik> ok, shall we move on then?
19:20:36 <abadger1999> nirik: Shouldn't be any outage but there will be a change in what people can get from pkgdb afterwards
19:20:45 <abadger1999> appdb goes away, tags from pkgdb go away.
19:20:58 <nirik> #topic Sysadmin status / discussion
19:21:03 * abadger1999 makes a note to check if rel-eng processes are pulling tags from pkgdb or tagger now
19:21:11 <nirik> abadger1999: yeah, good to check on.
19:21:20 <nirik> so, lets see... on the sysadmin side...
19:21:34 <nirik> #info Rework of nagios underway. Phase 1 complete.
19:21:44 <nirik> I redid nagios dependencies
19:21:50 <nirik> so they should now be right.
19:21:58 <nirik> and we shouldn't get 20 pages when a site is down
19:22:35 <nirik> next phase(s) are to change the alerts so we only get urgent on impacting outages, and fix it so we can have other groups we monitor get notices for their stuff only.
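
The point of getting the dependencies right is that a single upstream failure pages once instead of once per affected host. A minimal sketch of that idea follows. It is not how Nagios implements it internally; the function name and the topology are hypothetical.

```python
def hosts_to_page(down, parents):
    """Return the down hosts whose parents are all still up.

    `down` is a set of host names that failed their checks; `parents`
    maps each host to the hosts it depends on (hypothetical data).
    """
    return sorted(
        host for host in down
        if not any(parent in down for parent in parents.get(host, ()))
    )


# Hypothetical topology: two app servers behind one proxy behind one router.
parents = {
    "proxy01": ["router"],
    "app01": ["proxy01"],
    "app02": ["proxy01"],
}
down = {"router", "proxy01", "app01", "app02"}
print(hosts_to_page(down, parents))
```

Only the root cause is reported here; the hosts behind it are suppressed because one of their parents is already known to be down.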
19:22:56 <nirik> side note: can we switch nagios to use openid too? it's using mod_auth_pg as well right now
19:23:10 <puiterwijk> nirik: can add to my todo-list?
19:23:15 <nirik> puiterwijk: that would be great.
19:23:46 <nirik> we had a nasty serverbeach outage earlier in the week as well as some hosted instability.
19:23:55 <puiterwijk> #action puiterwijk will look into switching nagios to openid from mod_auth_pg
19:24:22 <nirik> smooge is going to be doing a quick visit out to our phx2 datacenter next week.
19:24:37 <nirik> If there are things people can think of that can only be done on-site, please let him know. ;)
19:24:43 <puiterwijk> yeah, hosted was probably gluster not playing nice
19:24:59 <skvidal> puiterwijk: all we really need for nagios (and for epylog logs) is some way of doing normal apache auth to openid
19:25:05 <nirik> I also adjusted hosted's robots.txt
19:25:12 <smooge> I will be mostly working on getting some hardware rebuilt
19:25:19 <puiterwijk> skvidal: so apache mod_openid?
19:25:26 <nirik> #info smooge on site next week (mon/tue).
19:25:40 <skvidal> puiterwijk: is that actually being maintained?
19:25:41 <nirik> #info robots.txt on fedorahosted adjusted to prevent crawling load issues.
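
The log does not say which paths were restricted on fedorahosted. As a sketch of how a robots.txt change can be checked before deploying it, the standard library's `urllib.robotparser` can evaluate a rule set against sample URLs. The rules, bot name, and URLs below are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; the log does not say which paths were restricted.
ROBOTS_TXT = """\
User-agent: *
Disallow: /git/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask whether a well-behaved crawler would be allowed to fetch each URL.
git_ok = parser.can_fetch("ExampleBot", "https://example.org/git/project.git")
wiki_ok = parser.can_fetch("ExampleBot", "https://example.org/wiki/Home")
print(git_ok, wiki_ok)
```

This only models crawlers that honor robots.txt; it does nothing about ones that ignore it.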
19:25:50 <puiterwijk> skvidal: I have no idea yet :)
19:26:07 <skvidal> nirik: want to go over the cloud upgrade 'fun'
19:26:08 <skvidal> ?
19:26:12 <puiterwijk> skvidal, nirik: just one issue though...
19:26:13 <nirik> yeah, next up...
19:26:20 <nirik> puiterwijk: ?
19:26:30 <puiterwijk> if we would switch nagios to use openid, and openid would go down, we would also lose access to the web interface of nagios...
19:26:36 <skvidal> puiterwijk: agreed
19:26:41 <nirik> true.
19:26:53 <nirik> but if postgres goes down, we already die. ;(
19:26:54 <skvidal> it's one reason why I like the basic auth we're using for epylog
19:26:57 <skvidal> nirik: ^^^
19:27:03 <nirik> yeah.
19:27:04 <skvidal> we just use a .htpasswd file
19:27:10 <skvidal> that is... unlikely... to fail
19:27:18 <puiterwijk> yeah, quite
19:27:34 <puiterwijk> just wanted to note it before I go too deep into mod_auth_openid
19:27:42 <skvidal> puiterwijk: it's a good note
19:27:45 <nirik> well, just a thought. Let me ponder on it some more. Ideally I'd prefer nagios (and epylogs too) to be available to lots of people, but just not crawlers or the world...
19:28:01 <skvidal> nirik: maybe multiple-auth?
19:28:04 <skvidal> nirik: fallthrough?
19:28:11 <nirik> possibly...
19:28:11 <puiterwijk> nirik: maybe failback ?
19:28:13 <skvidal> it depends on how mod_openid fails
19:28:18 <nirik> yeah, not sure off hand.
19:28:29 <skvidal> and, again, if it is reliable or maintained
19:28:31 <puiterwijk> I will check into mod_auth_openid
19:28:34 <nirik> openid, then if fails a fallback .htpasswd with core people.
19:28:41 <skvidal> nirik: ya
19:28:48 <nirik> could work.
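
The fallback idea discussed above (OpenID first, a short local list of core people only if OpenID is unavailable) can be sketched as plain logic, independent of any Apache module. Everything here is hypothetical: the exception, the function names, and the stub backends, which ignore the password for brevity. Whether mod_auth_openid can be chained this way is exactly the open question in the discussion.

```python
class ProviderUnavailable(Exception):
    """Raised when the primary auth backend cannot be reached."""


def authenticate(user, password, primary, fallback):
    """Try the primary backend; consult the fallback only if it is down.

    A plain rejection from the primary backend is final, so the fallback
    never widens access while the primary is healthy.
    """
    try:
        return primary(user, password)
    except ProviderUnavailable:
        return fallback(user, password)


# Stub backends for illustration only.
def openid_down(user, password):
    raise ProviderUnavailable("openid provider unreachable")


def openid_up(user, password):
    return user == "alice"


def core_admins_only(user, password):
    # Stands in for a short .htpasswd file listing core people.
    return user in {"admin1"}


print(authenticate("admin1", "secret", openid_down, core_admins_only))
print(authenticate("admin1", "secret", openid_up, core_admins_only))
```

The design choice worth noting is that the fallback fires on an outage, not on a denial, which keeps the emergency list from becoming a second way in during normal operation.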
19:29:02 <nirik> ok, any other sysadmin stuff? or shall we move on to cloudy fun?
19:29:15 <nirik> #topic Private Cloud status update / discussion
19:29:25 <nirik> so, we updated both our cloudlets this week.
19:29:35 <nirik> The openstack one seems to have gone pretty well.
19:29:41 <smooge> have to head to vet
19:29:41 <nirik> the euca side... not so well
19:29:42 <smooge> bbs
19:29:49 <skvidal> the euca cloudlet.... had issues
19:29:49 <nirik> smooge: safe travels
19:30:04 <skvidal> I've been helped by the euca team today to determine what went wrong
19:30:30 <skvidal> one issue was caused by my running: service eucalyptus-cc stop instead of service eucalyptus-cc cleanstop
19:30:36 <skvidal> the other issue is unclear.
19:30:51 <skvidal> apparently 'cleanstop' is similar to --really-force
19:30:53 <nirik> stop means 'uncleanly shutdown' ? thats odd...
19:31:00 <skvidal> nirik: well stop means
19:31:05 <skvidal> 'stop but maintain all state'
19:31:11 <nirik> ah.
19:31:12 <skvidal> cleanstop means stop but nuke all state and start fresh
19:31:29 <nirik> which you want on upgrades...
19:31:33 <skvidal> umm
19:31:38 <skvidal> that's where it gets weird
19:31:57 <skvidal> I've asked whether there is ever a time on an upgrade where you would NOT want to cleanstop
19:32:00 <skvidal> and the answer is no
19:32:08 <skvidal> so, since the upgrade modifies the db
19:32:19 <skvidal> I am unclear on why it doesn't clean the state then
19:32:26 <nirik> yeah. ;(
19:32:41 <skvidal> anyway... I have some more tests to run but as of this moment things are more or less working in the euca cloudlet
19:32:45 <nirik> so where do we stand right now? instances are up and working, but the cloudlet is not stable/reliable?
19:32:53 <skvidal> instances are up and working
19:32:59 <skvidal> volume attachment is what I have to test
19:32:59 <pingou> fedocal seems to have some data corrupted
19:33:04 <skvidal> pingou: corrupted?
19:33:06 <skvidal> pingou: that's new
19:33:09 <skvidal> pingou: corrupted where?
19:33:10 <pingou> files of size 0
19:33:17 <pingou> skvidal: /srv/persist/fedocal
19:33:30 <skvidal> pingou: that's new.
19:33:36 <skvidal> pingou: what files?
19:33:43 <pingou> all I think
19:33:49 <skvidal> pingou: can I look?
19:33:53 <pingou> of course
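
A quick way to confirm a report like "files of size 0" is to walk the directory and list every empty regular file. The sketch below uses only the standard library; the helper name is hypothetical, and the demonstration runs against a temporary directory rather than the real /srv/persist/fedocal.

```python
import os
import tempfile


def find_empty_files(root):
    """Return the sorted paths of all zero-byte regular files under root."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)


# Demonstration on a throwaway directory with one empty and one non-empty file.
with tempfile.TemporaryDirectory() as root:
    open(os.path.join(root, "empty.db"), "w").close()
    with open(os.path.join(root, "ok.db"), "w") as fh:
        fh.write("data")
    found = [os.path.basename(p) for p in find_empty_files(root)]
print(found)
```

On the affected instance the call would be `find_empty_files("/srv/persist/fedocal")`, using the path pingou gives.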
19:33:54 * nirik plays the sad trombone. ;(
19:34:18 <pingou> skvidal: just don't mind the cat sitting on /srv/
19:34:27 <skvidal> pingou: yes
19:34:32 <skvidal> anyway
19:34:39 <skvidal> the euca cloudlet is not a happy place right now
19:34:54 <skvidal> at the same time I am trying to get the ssl'd ec2 api interface working for openstack
19:35:00 <skvidal> all the packets are getting through
19:35:05 <skvidal> but we are seeing this error
19:35:14 <skvidal> 2013-02-14 19:23:54  WARNING [keystone.common.wsgi] Authorization failed. EC2 signature not supplied. from 127.0.0.1
19:35:21 <nirik> #info cloudlet upgrades this week. openstack did ok, euca didn't upgrade nicely at all
19:35:53 <skvidal> it seems like some sort of signature is getting stripped out (or never supplied) with the ec2 auth
19:35:56 <nirik> so, if we get that working we can I hope spin up the volumes/instances in the OS side and reinstall the other cloudlet if we like.
19:36:03 <skvidal> if we do not pass through nginx this works
19:36:06 <skvidal> so it may be happening there
19:36:12 <skvidal> nirik: agreed
19:37:13 <nirik> #info work ongoing to get ec2 ssl working on openstack cloudlet
19:37:22 <nirik> ok, anything further cloudside?
19:37:52 <skvidal> not at the moment
19:37:56 <nirik> #topic Upcoming Tasks/Items
19:38:07 <nirik> lets see if this floods me off irc:
19:38:15 <nirik> #info 2013-02-18 to 2013-02-19 smooge on site at phx2.
19:38:15 <nirik> #info 2013-02-28 end of 4th quarter
19:38:15 <nirik> #info 2013-03-01 nag fi-apprentices
19:38:15 <nirik> #info 2013-03-07 remove inactive apprentices.
19:38:15 <nirik> #info 2013-03-19 to 2013-03-26 - koji update
19:38:17 <nirik> #info 2013-03-29 - spring holiday.
19:38:19 <nirik> #info 2013-04-02 to 2013-04-16 ALPHA infrastructure freeze
19:38:20 <nirik> #info 2013-04-16 F19 alpha release
19:38:22 <nirik> #info 2013-05-07 to 2013-05-21 BETA infrastructure freeze
19:38:24 <nirik> #info 2013-05-21 F19 beta release
19:38:26 <nirik> #info 2013-05-31 end of 1st quarter
19:38:28 <nirik> #info 2013-06-11 to 2013-06-25 FINAL infrastructure freeze.
19:38:30 <nirik> #info 2013-06-25 F19 FINAL release
19:38:32 <nirik> anything else anyone would like to schedule or add/note?
19:38:43 * nirik notes as soon as we have fedocal I can just stick this all in there. ;)
19:38:49 <skvidal> nirik: :)
19:38:55 <sontek> Not sure if this would fit in meeting related topics, but there is a PyCon booth we'd like to give to Fedora if anyone is attending PyCon
19:39:09 <nirik> sontek: awesome. ;)
19:39:11 <puiterwijk> nirik: maybe we should start thinking about planning to move FAS-OpenID to prod?
19:39:27 <nirik> abadger1999 / lmacken / threebean I think are going to pycon? possibly more from our team?
19:39:56 <nirik> puiterwijk: yeah, I'd like to have it in prod.
19:40:11 <puiterwijk> nirik: any idea for what kind of schedule would fit?
19:40:33 <puiterwijk> abadger1999: was the FAS login(username, password, yubikey) function written already?
19:40:44 <abadger1999> nirik: Yeah -- I'll be there.  I'm trying to hook sontek up with one of the ambassadors who are attending as they'll be quite happy to make use of it I think.
19:40:47 <nirik> well, not sure. I'm open to ideas.
19:40:54 <nirik> abadger1999: excellent.
19:41:11 <puiterwijk> abadger1999, nirik: or do we want to hold off on yubikey in the first version of FAS-OpenID for now?
19:41:55 <sontek> Jesse Noller just wants to know who will be doing the booth so he knows it wont be empty, I can definitely help out at the booth but wouldn't want to be the main person for it
19:42:07 <nirik> well, if it's not going to be much longer to just implement it would be good to get done before prod. If it will be a while, I'm fine waiting for another release later.
19:42:17 <abadger1999> puiterwijk: It wasn't written yet.  No reason to hold back on it assuming we can get it written.
19:42:49 <puiterwijk> abadger1999: what do you mean? to just release it without for the moment, and add it in 1.1?
19:43:04 <puiterwijk> (or 2.0, in the Firefox version-hell spirit :))
19:43:28 <nirik> yeah, if it will be a while I'm fine with a 1.0 now and future release with that.
19:43:39 <nirik> #topic Open Floor
19:43:45 <nirik> anyone have any items for open floor?
19:43:52 <nirik> questions? suggestions?
19:43:58 <pingou> I had, I just have to remember what it is
19:44:11 <nirik> :)
19:44:24 <abadger1999> puiterwijk: I would say, don't let lack of yubikey block us from releasing but If we can write the fas method you need in time, there's no reason to hold it back either
19:44:52 * nirik is in agreement with abadger1999. as usual. ;)
19:45:27 <puiterwijk> abadger1999: okay. I will check quickly how much time it'll take
19:45:34 <pingou> nm, it doesn't want to come back, if I remember it, I'll mention it
19:45:40 <nirik> no worries.
19:46:22 <nirik> ok, if nothing more will close out in a few..
19:46:32 <nirik> oh, forgot to mention arm boxes...
19:46:49 <jsmith> nirik: I was just going to ask about that :-)
19:47:12 <nirik> we have arm boxen in phx2. Racked. Powered. Serial console all setup. 1 of the 4 of them have ip's and I've installed the 24 socs for arm builders.
19:47:31 <nirik> need to reinstall them all, then do ansible config on them, then they should be ready to build.
19:47:39 <nirik> network is still pending for the other ones.
19:47:51 <jsmith> Any idea how long the network for the others might take?
19:48:08 <jsmith> (Don't get me wrong -- I'm happy to see the first 24 coming online)
19:48:16 <jsmith> Just curious, is all
19:48:41 <nirik> I've been harassing people for a week+. ;) Our primary network contact is out, so we are trying to get someone else up to speed on the setup and requirements.
19:48:56 <nirik> hopefully soon, but I just don't know for sure. ;)
19:49:01 <jsmith> Fair enough
19:49:53 <nirik> ok, thanks for coming everyone. As always, continue in #fedora-admin, #fedora-noc and #fedora-apps
19:49:57 <nirik> #endmeeting