infrastructure
LOGS
19:00:01 <nirik> #startmeeting Infrastructure (2012-02-02)
19:00:01 <zodbot> Meeting started Thu Feb  2 19:00:01 2012 UTC.  The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:01 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:00:01 <nirik> #meetingname infrastructure
19:00:01 <zodbot> The meeting name has been set to 'infrastructure'
19:00:01 <nirik> #topic Robot Roll Call
19:00:01 <nirik> #chair smooge skvidal Codeblock ricky nirik abadger1999 lmacken dgilmore mdomsch
19:00:01 <zodbot> Current chairs: Codeblock abadger1999 dgilmore lmacken mdomsch nirik ricky skvidal smooge
19:00:16 * skvidal is here
19:01:05 <nirik> I'll wait a few and start after folks drift in...
19:03:16 <nirik> ok... I guess let's dive in.
19:03:18 <nirik> #topic New folks introductions and apprentice tasks/feedback
19:03:26 <nirik> any new folks around who would like to say hi?
19:03:42 <nirik> or apprentices have questions/concerns/notes?
19:03:47 <Southern_Gentlem> hello
19:03:49 * abadger1999 here but not new ;-)
19:03:56 * jsmith lurking, but not new either
19:03:57 * nirik just sent out the monthly apprentice ping email.
19:04:02 <abadger1999> Southern_Gentlem: Greetings :-)
19:04:40 * jds2001 says hi :)
19:04:42 <nirik> welcome Southern_Gentlem
19:05:44 * wsterling here
19:05:50 <nirik> hey wsterling
19:06:14 <nirik> ok, moving along then...
19:06:25 <nirik> #topic two factor auth status
19:06:33 <nirik> I don't see herlo around...
19:06:56 <nirik> I'm going to try next week and see what I can do to move this forward. I'd like to see us get to the first planned thing soon.
19:07:04 <dgilmore> hey yall
19:07:11 <nirik> (which is sudo for sysadmin-main using 2 factor)
19:07:15 <nirik> hey dgilmore
19:07:34 <nirik> #topic Staging re-work status
19:07:42 <nirik> so averi has been working on this.
19:08:11 <nirik> we have been testing each of our stg machines and seeing what needs to be fixed when we point them at the master branch in puppet instead of the staging branch.
19:08:39 <nirik> however, an implementation issue/question has come up with respect to the apps... how to organise the stg config in the master branch.
19:09:00 <nirik> this might be something for the list, but thought I would bring it up here.
19:09:38 <nirik> The options seem to be:
19:10:09 <nirik> a) any files that are different between stg and prod get conditionals and their own copy of the file in .stg.
19:10:36 <nirik> b) we copy everything to another tree for the application (ie, modules/bodhi/ vs modules/bodhi.stg/)
19:10:41 <nirik> b is a lot of changes tho.
19:11:08 <skvidal> well b is true for any app we're deploying twice in 2 different configurations
19:11:15 <skvidal> for example: rsync
19:11:24 <skvidal> we can have an 'rsyncd' module
19:11:32 <skvidal> but if our configurations are wildly different enough
19:11:40 <skvidal> it will be a pain to keep both of those in one module
19:11:55 <nirik> the rsync stuff is more like plan a) above I think.
19:11:56 <skvidal> when, in reality, we end up needing rsyncd-download and rsyncd-projects or some-such thing for modules
19:12:02 <skvidal> the rsync stuff, currently
19:12:04 <skvidal> but another example
19:12:05 <skvidal> httpd
19:12:12 <skvidal> httpd really is more like b
19:12:29 <skvidal> we have a daemon which is going to provide divergent services with divergent configs
19:12:40 <jds2001> why should httpd be radically different?
19:12:41 <skvidal> putting them all in one module is just a recipe for confusion and frustration
19:12:55 <skvidal> jds2001: have you looked at our websites/httpd module layout?
19:13:02 <skvidal> ever tried to follow it back? it's a nightmare
19:13:22 <skvidal> anyway
19:13:28 <skvidal> my point is more like this
19:13:30 <jds2001> yeah, but it winds up  being very modular, which i think is a good thing
19:13:41 <skvidal> jds2001: modular in much the same way that atoms are modular
19:13:43 <jds2001> but yeah, following it all back is living hell
19:13:44 <skvidal> yes, you can build anything
19:13:50 <skvidal> but you have to know WAY too much about it
19:14:09 * jds2001 wont argue that :)
19:14:09 <smooge> ugh sorry..
19:14:14 <skvidal> anyway - if bodhi in staging is going to be a major diverging point
19:14:26 <nirik> perhaps what would be good is to post the changes we have on say app01.stg and have people propose their puppet solution. ;)
19:14:26 <skvidal> then I say just make a separate module dir
19:14:33 <skvidal> bodhi.stg or bodhi.tng
19:14:37 <nirik> but it's not very different.
19:14:45 <skvidal> if it is only going to be config changes
19:14:50 <skvidal> but they are across a lot of files
19:14:52 <skvidal> then i'd say think about
19:15:05 <skvidal> bodhi/files/staging/somedir <-- and using recurse
19:15:14 <skvidal> and bodhi/files/production/somedir <-- and using recurse
19:15:36 <skvidal> ie: if it is all of the same piece - w/ different configs - then all in one module
19:15:55 <skvidal> if it is not really the same service/implementation anymore then separate modules
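The one-module layout skvidal describes (bodhi/files/staging/somedir and bodhi/files/production/somedir, pulled in with recurse) might look roughly like the sketch below. The target path /etc/bodhi and the variable $bodhi_env are assumptions for illustration and do not come from the meeting.

```puppet
# Sketch only: $bodhi_env is a hypothetical per-node variable set to
# 'staging' or 'production'; /etc/bodhi is an assumed target directory.
file { '/etc/bodhi':
  ensure  => directory,
  recurse => true,
  # Resolves to modules/bodhi/files/<env>/somedir on the puppet master.
  source  => "puppet:///modules/bodhi/${bodhi_env}/somedir",
}
```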
19:15:57 <nirik> it's more just bodhi.cfg
19:16:13 <skvidal> then why is that difficult?
19:16:15 <skvidal> why not just
19:16:19 <skvidal> bodhi.cfg.$hostname
19:16:24 <skvidal> or bodhi.cfg.stuff
19:16:32 <skvidal> have bodhi.cfg be 'production'
19:16:37 <skvidal> and everything else is a modification around that
19:16:50 <nirik> yeah, that was my thought, but not sure if it works for application developers. ;)
19:17:02 <skvidal> which part?
19:17:07 <skvidal> and more to the point
19:17:13 <nirik> we did this with the postfix configs for stg hosts... added a conditional and stg hosts get foo.stg config.
19:17:17 <nirik> abadger1999: ?
19:17:32 <skvidal> I thought our app devels were going to start treating the apps more like apps
19:17:35 <skvidal> and less like configs?
19:17:53 <nirik> hum/
19:17:53 <jds2001> those apps need to be configured, no? :)
19:17:55 <nirik> ?
19:18:14 <skvidal> jds2001: that's not the same thing as hotfixes and patching
19:18:21 <jds2001> right.
19:18:37 <jds2001> to me, a hotfix should be a new rpm
19:18:46 <skvidal> nirik: I thought the discussion from fudcon resolved out to:
19:18:51 <skvidal> 1. everything is production
19:19:02 <nirik> if we change just cfg to have foo.cfg and foo.cfg.stg that works fine until you need to change something in a manifest, in which case you need to add it with a conditional for stg... which I don't know is that bad really...
19:19:04 <skvidal> 2. we have some boxes called 'staging' but really all they are is production boxes for developers
19:19:12 <nirik> right.
19:19:30 <skvidal> have we actually encountered the manifest conditional issue, yet?
19:19:38 <nirik> so it's the same apps but with different config or versions or changes...
19:19:56 <nirik> well, we are using it in the postfix stuff now I think...
19:20:07 <skvidal> nirik: explain?
19:20:09 <abadger1999> if we have separate files for bodhi.cfg vs bodhi.cfg.stg; we'll need to have conditional manifests.
19:20:18 <skvidal> abadger1999: ???
19:20:29 <skvidal> we need to have source=
19:20:31 <skvidal> have a list
19:20:38 <skvidal> but it's not CONDITIONAL
19:20:41 <abadger1999> since we'll want different files depending on whether the host is a stg box.  right?
19:20:46 <skvidal> again
19:20:51 <skvidal> a source= fall through list
19:20:52 <nirik> yeah. so, look at:
19:21:06 <nirik> puppet/modules/postfix/manifests/init.pp
19:21:08 <skvidal> postfix, rsync, resolv.conf
19:21:16 <nirik> yeah.
19:21:30 <nirik> for staging hosts we set: postfix_group = stg
19:21:36 <nirik> so they get a staging config.
19:22:33 <nirik> but this only works for files, right?
19:22:47 <skvidal> umm this is unix - everything is a file :)
19:22:54 <abadger1999> Hmm... I'm leery of mixing this in there as well but I'm not sure we'll hit what I'm thinking of in practice.
19:23:00 <nirik> if stg needs a change from prod in say bodhi to work with a new version, that would need to be conditional.
19:23:01 <skvidal> abadger1999: huh?
19:23:02 <abadger1999> Say we have bodhi.cfg, bodhi.cfg.masher
19:23:13 <abadger1999> and then we need stg versions of both of those.
19:23:35 <abadger1999> What's the fallthru look like?
19:23:37 <nirik> yeah.
19:23:47 <skvidal> unless we're talking about 100 boxes
19:23:51 <skvidal> the fall through is
19:23:55 <skvidal> $hostname
19:24:03 <skvidal> $bodhi_group
19:24:10 <skvidal> bodhi.cfg
19:24:14 <skvidal> or
19:24:21 <skvidal> $bodhi-group.$hostname
19:24:23 <skvidal> $hostname
19:24:27 <skvidal> $bodhi-group
19:24:30 <skvidal> bodhi.cfg
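The source= fall-through list described above can be sketched as a single file resource with an array of sources, where Puppet uses the first source that exists on the master. The target path and the file naming pattern (bodhi.cfg.<suffix>) are assumptions for illustration; $bodhi_group stands in for whatever group variable the node sets.

```puppet
# Sketch only: paths and the $bodhi_group variable are assumed names.
file { '/etc/bodhi/bodhi.cfg':
  ensure => file,
  source => [
    # Most specific first: a per-host override, if one exists.
    "puppet:///modules/bodhi/bodhi.cfg.${::hostname}",
    # Then a per-group file, e.g. bodhi.cfg.stg or bodhi.cfg.masher.
    "puppet:///modules/bodhi/bodhi.cfg.${bodhi_group}",
    # Finally the plain production config as the default.
    'puppet:///modules/bodhi/bodhi.cfg',
  ],
}
```

No conditional appears in the manifest: adding a staging variant is a matter of dropping a correspondingly named file into the module.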
19:25:17 <skvidal> am I misunderstanding something?
19:25:34 <abadger1999> So bodhi group would be ('', 'masher', 'stg', 'masher.stg') ?
19:25:46 <skvidal> bodhigroup can be whatever you want
19:25:59 <skvidal> bodhigroup=toshiolovesgingerale
19:26:20 <skvidal> you just make a file corresponding to it
19:26:45 <nirik> I think the files part works fine with this, but the issue is that other changes in manifests, etc will need to be in a staging or host conditional... which means if you are not careful you leak that change to production... but perhaps that's not such a big deal in practice.
19:27:04 <skvidal> nirik: I'm still confused by that
19:27:21 <skvidal> is there a change we normally make to enable/disable something that is not actually done in the config file
19:27:26 <skvidal> and then the service is reloaded?
19:27:31 <jds2001> why not make the conditional on bodhi-group
19:27:31 <skvidal> sorry 'notified'
19:27:37 <nirik> skvidal: so, say we have a new raffle app.
19:27:37 <skvidal> jds2001: indeed
19:27:41 <skvidal> nirik: okay
19:27:51 <nirik> we want to test out the new version. It needs 2 new packages installed.
19:28:06 <nirik> so, we add that to the raffle module, but we don't want the production one getting that yet.
19:28:17 <skvidal> okay
19:28:20 <nirik> so, we have to add it with 'if != staging' or whatever.
19:28:34 <skvidal> and that's onerous?
19:28:37 <nirik> then we test it out and get it working after some more changes.
19:28:44 <skvidal> and we clip the 'if'
19:28:50 <nirik> then we need to recall what things should be changed to make it alive in prod.
19:29:00 <skvidal> which are all inside the 'if'
19:29:11 <nirik> unless there's other things in there that should stay in stg.
19:29:37 <nirik> or someone forgets the if and adds it and it messes up production.
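The guarded manifest change in the raffle example might look like the sketch below, assuming the intent is that only staging hosts receive the two new packages for now. The variable $raffle_group and both package names are hypothetical.

```puppet
# Sketch only: $raffle_group and the package names are made up.
if $raffle_group == 'staging' {
  # Packages needed by the new raffle version under test; production
  # nodes never evaluate this block, so they do not get them yet.
  package { ['raffle-newdep-one', 'raffle-newdep-two']:
    ensure => installed,
  }
}
```

Promoting the change to production then means removing the guard, which is the "clip the 'if'" step mentioned above.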
19:29:50 <skvidal> okay
19:29:59 <skvidal> I am failing to see how this is a problem that's unique to this situation
19:30:10 <skvidal> when we would migrate from master/staging with branches
19:30:12 <nirik> perhaps it's not.
19:30:13 <skvidal> we'd have all of this and more
19:30:19 <nirik> true.
19:30:30 <abadger1999> Well.. it's what the solution looks like.
19:30:36 <skvidal> abadger1999: ??
19:31:46 <abadger1999> So if we had a separate module-level directory for bodhi, we could capture the persistent differences between production and stg in conditionals
19:31:55 * nirik notes we make poor use of comments in puppet files. Perhaps using them more would help this kind of thing. ;)
19:32:09 <abadger1999> And the testing-of-new-version changes would just be in the file itself.
19:32:34 <skvidal> abadger1999: okay...
19:32:36 <skvidal> then do that
19:32:45 * skvidal doesn't understand all the handwringing here..
19:32:46 <abadger1999> then when we merge from stg to production, or vice-versa, it would be a straight cp bodhi.stg/file bodhi/file
19:32:46 <nirik> right. It's a lot more change tho.
19:33:16 <abadger1999> So... it's just what it looks like/how we use it afterwards.
19:33:18 <jds2001> nirik: how so?
19:33:48 <nirik> because puppet hates duplicates... so all classes will need to also be renamed, right? and then conditional on those in the node files?
19:34:05 <nirik> bodhi::app::epelmasher vs bodhi-stg::app::epelmasher
19:34:44 <skvidal> nirik: I thought that was only true if you tried to include them both
19:34:50 * skvidal tests
19:34:52 <abadger1999> yeah, I did too.
19:34:54 <nirik> not sure.
19:35:07 <skvidal> easy to find out
19:35:12 * skvidal picks on his favorite fall-host
19:35:13 <nirik> but then how do you include one of the stg ones?
19:35:23 <skvidal> in the host manifest
19:35:28 <skvidal> bodhigrp=staging
19:35:36 <skvidal> if $bodhigrp=='staging':
19:35:40 <skvidal> include bodhi.stg
19:35:44 <skvidal> else
19:35:48 <skvidal> include bodhi
19:35:49 <skvidal> fi
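Written in Puppet syntax rather than shell-style pseudocode, the selection above would be roughly as follows. The class name bodhi_stg is an assumption standing in for the bodhi.stg directory, on the assumption that a dot is not usable in a class name.

```puppet
# Sketch only: set in the host (node) manifest.
$bodhigrp = 'staging'

if $bodhigrp == 'staging' {
  # Hypothetical class from the separate staging module directory.
  include bodhi_stg
} else {
  include bodhi
}
```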
19:36:15 <nirik> so releng01.stg node file has:
19:36:17 <nirik> include bodhi::app::masher
19:36:21 <nirik> how does that translate?
19:36:27 <jds2001> yeah, puppet has no way of telling what's in your puppet code - only what ends up in a compiled catalog does it care about.
19:36:42 <skvidal> nirik: something I'm not sure of
19:36:43 <skvidal> if you can do
19:36:48 <skvidal> include $var::app::masher
19:36:50 <abadger1999> If we could do include bodhi$env that would be even easier.
19:36:55 <skvidal> abadger1999: testing
19:36:56 <abadger1999> yeah.
19:36:59 <jds2001> and if someone does something stupid like trying to combine staging and production, they deserve the fail that awaits.
19:37:16 <jds2001> (and we want it to fail)
19:37:59 <nirik> anyhow, how about we investigate this more and go on with the meeting?
19:38:08 <skvidal> works for me
19:38:33 <abadger1999> <nod>
19:38:36 <nirik> if we can work out a way to have separate dirs without massive churn it's fine with me.
19:38:44 <abadger1999> We don't have to make it perfect, just better than what we have now ;-)
19:38:47 <wsterling> I wanted to throw out something with the staging re-work but not how we are going to organize files in Puppet.
19:38:59 <nirik> wsterling: go ahead...
19:39:34 <wsterling> As to the building out of infrastructure on-demand in staging there is a module that will allow Puppet to provision libvirt guests, https://github.com/carlasouza/puppet-virt#readme
19:39:51 <nirik> huh... interesting.
19:39:58 * skvidal vomits
19:40:04 <wsterling> Chef is more mature in that area but it would require an entire shift which would be hard...
19:40:04 <skvidal> 1. we don't have 'images'
19:40:17 <skvidal> 2. we do not have templates
19:40:33 <skvidal> 3. I do not think we want to dig further into the hole which is puppet
19:40:37 <skvidal> this is just my opinion, though.
19:41:21 <nirik> we have been looking at various cloud/virt setups to see if they will help us out too.
19:41:30 <nirik> wsterling: thanks for the link.
19:41:46 <nirik> #topic Upcoming outages
19:41:55 <nirik> we have two outages tonight...
19:42:03 <nirik> #info download-i2 outage tonight.
19:42:12 <nirik> #info pkgs and koji outage tonight.
19:42:17 <nirik> Hopefully they will go smoothly.
19:42:32 <nirik> #topic Applications status / discussion
19:42:45 <nirik> abadger1999 / lmacken / threebean: any application news of note?
19:43:02 <abadger1999> New FAS will be coming out today.
19:43:06 <lmacken> new bodhi went out yesterday
19:43:16 <nirik> I was going to see if I could summarize the url thread and come up with something we can all agree on and move forward with.
19:43:30 <nirik> unless someone else would like to (that would be just fine with me too :)
19:43:36 <abadger1999> Cool.
19:44:07 <nirik> I played around with glusterfs some... if everyone could look at my mail and see if there are any other places where it might help us that would be great.
19:44:29 <abadger1999> If anyone is interested in working on FAS, I have a few i18n issues that should be interesting to work on.
19:45:01 <abadger1999> It could serve them well for working on i18n in other web projects as well.
19:45:15 <nirik> #info will try and finalize url scheme next week.
19:45:28 <nirik> #info please note uses for glusterfs on the list.
19:45:42 <nirik> #info some easyfix fas tickets available in the i18n space.
19:46:05 <abadger1999> s/easyfix/interesting/ :-)
19:46:13 <nirik> also next week I can look at spinning up stuff for packages production... would be good to move forward to prod
19:46:22 <nirik> sorry.
19:46:22 <nirik> #undo
19:46:22 <zodbot> Removing item from minutes: <MeetBot.items.Info object at 0x27162dd0>
19:46:32 <nirik> #info some interesting fas tickets available in the i18n space.
19:47:14 <nirik> ok, any other apps news? OH... I have one more.
19:47:27 <nirik> We talked about search engines again the other day.
19:47:49 <nirik> I noted that sphinx had a mw plugin and would be easy to setup for that... but harder for everything else.
19:48:04 <nirik> then we thought about xapian... which is being used by tagger.
19:48:17 <nirik> Thoughts on trying to use xapian for all our searching?
19:49:42 <nirik> I'll likely start a list thread on that idea.
19:49:46 <abadger1999> I think, see how easy it is to administrate xapian+omega for searching mediawiki.
19:50:00 <abadger1999> Compare to ease of administrating sphinx-mw plugin.
19:50:05 <nirik> yeah.
19:50:07 <abadger1999> Pick the winner.
19:50:21 <nirik> we could probably set up a test xapian box and crawl the wiki and see.
19:50:46 <nirik> I think sphinx will be easier for the wiki, but less easy for everything else.
19:51:11 <abadger1999> It's not impossible to change search technology in mw later.
19:51:15 <skvidal> ping: back on the include $var discussion - when we get a chance
19:51:43 <abadger1999> So if sphinx is a snap to setup, and maintain... I don't see a problem doing that until/unless we have the itch to deploy something more complex for all of fp.o
19:52:06 <abadger1999> Otherwise it'll never get done :-)
19:52:06 <nirik> well, the plugin still needs packaging, but yeah... we can see.
19:52:09 <nirik> yep.
19:52:17 <nirik> #topic Staging redux
19:52:26 <nirik> skvidal: ?
19:52:35 <skvidal> $var=foo
19:52:43 <skvidal> include "${var}::app"
19:52:44 <skvidal> works
19:53:08 <skvidal> you can see an example of this in torrent02.fedoraproject.org.pp in manifests/nodes
19:53:23 <skvidal> you have to have the braces and the double quotes
19:53:31 <skvidal> w/o the double quotes it won't expand it
19:53:39 <skvidal> and puppet will quietly not include it
19:53:45 <skvidal> (and it won't tell you anything about that either)
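Put together, the variable include skvidal tested looks like the sketch below. The value 'bodhi' is only an example; a staging node would set a different module name.

```puppet
# Sketch only: the module name is chosen per node.
$var = 'bodhi'

# Braces and double quotes are both needed so the variable is
# interpolated; per the discussion above, without the double quotes
# the name is not expanded and the class is silently skipped.
include "${var}::app"
```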
19:53:57 <nirik> neat
19:54:05 <skvidal> so, abadger1999 I think that's what you want
19:54:17 <abadger1999> <nod>
19:54:55 <abadger1999> wfm if it works for everyone else.
19:55:01 <nirik> so, with this setup we want bodhi.stg and bodhi to be exactly the same... except staging machines use bodhi.stg/
19:55:34 <nirik> so, when you make changes in stg it only affects stg and when you are done you can cp/rsync them to production with only your changes, right?
19:55:59 <skvidal> and I think if we decide any app configuration is sufficiently weird/different that the above helps - we can do that
19:55:59 <nirik> and we want that setup for any of our apps we usually test in stg? or ?
19:56:11 <skvidal> but in the cases where it is just a config file (like postfix) we don't need to
19:56:15 <skvidal> we can just use fall-through files
19:56:49 <nirik> sounds good.
19:57:34 <nirik> #topic Open Floor
19:57:43 <nirik> anyone have anything for open floor?
19:58:05 <skvidal> euca and rhev
19:58:08 <skvidal> and $other things
19:58:17 <skvidal> last week I got to play with both eucalyptus and rhev
19:58:26 <skvidal> and they are definitely very different :)
19:58:44 <nirik> yeah.
19:58:58 <skvidal> eucalyptus might get us a place where we can treat more systems like what wsterling was recommending
19:59:16 <skvidal> but instead of using puppet, we'd be using the euca2ools
19:59:19 <nirik> what does it use as storage, btw? filesystem? lvm?
19:59:32 <skvidal> local + nfs + otherthings
19:59:45 <skvidal> you can have it talk to iscsi w/o any problem.
20:00:11 <nirik> ok.
20:00:30 <skvidal> I'm going to work on installing euca3 tomorrow
20:00:40 <skvidal> which is currently 'devel' but it is a slushie devel
20:01:06 <nirik> sounds good.
20:01:08 <skvidal> I talked to one of the euca devs for a while last friday and his recommendation is euca3 b/c the interface is better and a variety of features act more like they should
20:01:39 <skvidal> he also worked on a couple of ways to simplify deployment of new systems w/o having to rely on someone else's images
20:01:52 <skvidal> I talked about the ridiculousness of image creation on my blog last week
20:02:03 <skvidal> and it was followed up by a series of other comments/blogs
20:02:24 <skvidal> the reality is that creating images to use on instances in euca is stupidly hard
20:02:40 <nirik> yeah, sounds like there might be progress on fixing it ?
20:02:43 <skvidal> and there is no legitimate reason for this complexity - it appears to be entirely amazon's fault
20:02:46 <skvidal> yeah
20:02:57 <skvidal> by amazon I mean - b/c of a desire to copy ec2
20:03:12 <skvidal> no one has been focusing on the local deployment/image creation mechanism
20:03:22 <skvidal> but in each and every case the admin maintaining the systems has either
20:03:31 <skvidal> 1. been using other people's images (which I cannot personally imagine)
20:03:47 <skvidal> 2. or hacking it up by themselves and figuring out the same 40 steps everyone, ultimately, takes
20:04:11 <skvidal> in the case of some red hatters they spent a lot of time writing boxgrinder to work around ec2/euca not allowing you to simply run the frelling installer
20:04:21 <skvidal> it is, on its best day, the dumbest thing I've ever seen
20:04:29 <skvidal> not their work
20:04:31 <skvidal> it's not dumb
20:04:43 <skvidal> that they had to do it - rather than the  cloud systems just fixing it is dumb
20:05:03 <nirik> well, hopefully it can get fixed in the right place... and you can just install whatever you want to install. ;)
20:05:06 <skvidal> yes
20:05:08 <skvidal> exactly
20:05:10 <skvidal> there is some progress
20:05:14 <skvidal> we'll see where that goes
20:05:30 <skvidal> I'm still using junk03 and 05 right now
20:05:48 <skvidal> and 02 is a rhev manager install - but it can be reinstalled - I have all my notes on it
20:05:50 <nirik> we should have some more machines there soon for testing.
20:05:56 <skvidal> junk04?
20:06:06 <nirik> and 3-4 new ones.
20:06:09 <skvidal> nirik: great
20:06:17 <skvidal> it would be great to do some torture tests on gluster
20:06:27 <nirik> I'm going to do some more gluster testing... yeah.
20:06:34 <nirik> in particular on slow links.
20:06:51 <skvidal> nod and I'd like to see how well it performs with brutal writes
20:06:55 <skvidal> even over fast links
20:07:01 <skvidal> like lots and lots of small files
20:07:21 <skvidal> the case I'm thinking of is all the cache and other data that packages is going to be making
20:07:27 <skvidal> especially the git checkouts
20:07:28 <nirik> yeah, my test the other day is some virtuals here at home... but I can setup more on new junk boxes easily.
20:07:50 <nirik> for serving lots of small files, the docs suggest using the nfs frontend...
20:07:58 <nirik> but that's less than ideal for failover.
20:08:22 <nirik> anyhow, we are over time, anything else? or shall we call it a meeting?
20:08:57 <skvidal> I don't have anything else.
20:09:53 <nirik> thanks for coming everyone!
20:09:53 <nirik> #endmeeting