infrastructure
LOGS
20:00:54 <CodeBlock> #startmeeting infrastructure
20:00:54 <zodbot> Meeting started Thu Feb 17 20:00:54 2011 UTC.  The chair is CodeBlock. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:54 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
20:01:02 <CodeBlock> Guess I'm doing this, this week. :(
20:01:11 <CodeBlock> #topic roll call
20:01:15 * averi is around
20:01:18 * ricky 
20:01:21 * CodeBlock 
20:01:22 * sijis is around
20:01:25 * ianweller waves
20:01:27 <CodeBlock> #chair ricky
20:01:27 <zodbot> Current chairs: CodeBlock ricky
20:01:28 <smooge> is here
20:01:31 <CodeBlock> yay
20:01:33 <CodeBlock> #chair smooge
20:01:33 <zodbot> Current chairs: CodeBlock ricky smooge
20:01:40 <smooge> #meetingname infrastructure
20:01:40 <zodbot> The meeting name has been set to 'infrastructure'
20:01:52 <smooge> #topic Agenda
20:02:14 <smooge> ok those with one please be open about it. Mine seems to be to create conspiracy theories
20:02:34 <averi> lol
20:02:42 <CodeBlock> what - topics to discuss?
20:03:08 <averi> I am here to talk about blogs, please enlighten me when we are at it
20:03:28 <asrob> :)
20:03:31 <CodeBlock> I have two - 1) Do we want to look into yubikey auth on more of our infra?, and 2) Shall we look at what else needs done to kill nagios2 with fire and move to nagios 3?
20:03:47 <averi> and mailing lists cleanup as well on hosted
20:03:59 <smooge> Rebuild of publictest boxes
20:04:04 <smooge> and Removal of old sysadmins
20:04:13 <smooge> and Training program for sysadmins
20:04:17 <CodeBlock> ok
20:04:32 <smooge> #topic Do we want to look into yubikey auth on more of our infra?
20:04:39 <smooge> CodeBlock, <-
20:05:04 <CodeBlock> Alright - well... now that mmcgrath sent me a yubikey, I'm kind of growing attached to it :)
20:05:12 <CodeBlock> I know we support it in FAS and on people01 for ssh
20:05:28 <CodeBlock> Any thoughts on expanding it and letting more of our infra accept yubikey auth?
20:06:03 <smooge> well the question is how is it to be used.
20:06:31 <ricky> I'd prefer to have actual two factor auth over just using it as is.  (Not sure if I prefer just password over just yubikey.)
20:06:47 <smooge> make it two factor (eg a pass+yubiOTP) or one-factor (pass or yubiOTP). use it for sudo?
20:06:56 <CodeBlock> two factor I agree with
20:07:39 <CodeBlock> sudo...possibly. It can't be used for sudo on people01 right now. I have tried.
20:08:17 * dgilmore shows up
20:08:20 <smooge> yeah that has been my only attempt also :)
20:08:29 <ricky> For sudo, it'd need to be implemented in a way that doesn't require it, which is probably easily doable with PAM.
20:09:01 <smooge> well to use a favorite Seth line: If its not required why use it?
20:09:36 <CodeBlock> But yeah. I was just curious if there was any interest in adding/allowing it throughout more of the infra. phuzion and I were talking about it (he has one too now), and we were just curious as to what the plan was
20:09:37 <ricky> Because all sysadmin-main people have one (with maybe one exception?)
20:10:10 <smooge> well the plan is "we need a plan"
20:10:23 <smooge> what systems should it be needed on. which systems should it not. why
20:10:41 <dgilmore> we did talk in the past of requiring sysadmin-main people to have to use it
20:10:46 <sijis> i would think db* servers would probably need it
20:10:50 <dgilmore> especially on pt boxes
20:11:04 <smooge> what is it meant to help protect. what is it not used for
20:12:20 <dgilmore> smooge: mmcgrath was scared that a keylogger or a carelessly typed passwd on a pt box would give a non sysadmin-main person a sysadmin-main user's passwd
20:12:27 <ricky> One option is required for sysadmin-main everywhere, another is just pt and other widely accessible machines.
20:12:34 <smooge> dgilmore, I agree on it
20:13:01 <dgilmore> smooge: so the idea was a way to require auth but make it more secure
20:13:26 * dgilmore uses his yubikey with bodhi
20:13:34 <CodeBlock> I'm almost inclined to like ricky's first option. Because it solves the issue of local keyloggers and such too
20:14:23 <dgilmore> the only issue i see is that if say im traveling and only have my phone i cant fix things
20:14:36 <dgilmore> because i dont have a working  way to use yubikey
20:14:41 <sijis> i'm not quite following - why would you want to use yubi on pt boxes?
20:14:44 <dgilmore> though the chances of that are slim
20:14:51 <CodeBlock> dgilmore: wifi tether to a laptop ;)
20:15:19 <ricky> sijis: Because we give sudo access on those to everybody
20:15:20 <dgilmore> sijis: if i sudo on a pt box, and you have a keylogger you have my passwd
20:15:54 <dgilmore> we should treat pt boxes as untrusted
20:16:05 <dgilmore> perhaps even hostile
20:16:27 <CodeBlock> If we require it everywhere (for -main at least) as I said it solves the problem of local keylogging too. It's a one time password - use it once and it will never work again.
20:16:32 <dgilmore> sijis: its easy to get pt box access. that includes sudo
20:16:49 <sijis> gotcha.
20:16:53 <smooge> we should treat every system as untrusted, and we should treat certain systems like pt/people/hosted as hostile
20:17:37 <smooge> but I come from a shoot once, ask questions later background
20:17:47 <CodeBlock> How hard is it to deploy yubikey auth (no matter how we do it)?
20:18:16 <ricky> Pretty easy as is now
20:18:28 <ricky> Just some global pam configs and that'd be all.
20:18:42 <CodeBlock> That's what I figured
20:18:43 <CodeBlock> alright
20:19:41 <CodeBlock> Well - it's just something to think about, we don't have to (and won't) come to a conclusion this meeting, but it's just something I was wondering the state of
20:19:58 <smooge> oh sorry jumped the gun
20:20:46 <CodeBlock> I can send something to the list (still talking about yubikeys) and see what people think or something
20:21:31 <CodeBlock> as far as nagios 3 ... We have test nagios and zodbot on noc01.stg .. But nothing else that is on noc01....not sure what else needs tested. But ..
20:22:04 <CodeBlock> but nagios and zodbot work on el6, with no issues
20:22:08 <CodeBlock> nagios 3*
20:22:12 <CodeBlock> and our nagios configs
20:22:44 <marchant> i volunteered to do testing once a plan was put together
20:22:48 <marchant> for nagios
20:23:04 <marchant> smooge upgraded stg a while back I believe
20:23:17 <CodeBlock> Well - our configs all seem to work - I'm not entirely sure how much more testing can/needs done
20:23:23 <smooge> yes noc01.stg should be set up.
20:23:26 <marchant> ok
20:23:31 <smooge> marchant, you were going to make a testing plan
20:23:34 * dgilmore thinks we pull the trigger post freeze
20:23:35 <smooge> actually there is something
20:23:39 <CodeBlock> dgilmore: agreed
20:23:39 <dgilmore> rebuild as rhel6 and move
20:24:04 <smooge> marchant, I do need a test plan as I want to use it for other upgrades
20:24:45 <marchant> smooge: I did not realize you wanted me to create the plan
20:24:46 <smooge> eg yes we see that the configs didn't barf on reload. thats a check mark. but usability, webshots, what we tested we need
20:24:59 <smooge> marchant, oh well it can be pretty simple
20:25:06 <smooge> baby-steps
20:25:24 <smooge> we currently have nothing beyond "configs didn't barf on reload".
20:25:54 <marchant> so a test that involved taking down things in stg to verify proper alerting would be sufficient?
20:25:55 <smooge> so we need to go over what 8 things we want to test and just have that done
20:26:04 <smooge> should be
20:26:21 <smooge> CodeBlock, does that make sense?
20:26:27 <CodeBlock> Yeah that's fine
20:26:34 <skvidal> hi folks
20:26:39 <CodeBlock> hey skvidal :)
20:26:42 * skvidal just got networking back
20:26:44 <skvidal> sorry for being out
20:27:07 <marchant> OK, I will work on a basic plan and perhaps email smooge with other questions?
20:27:15 <skvidal> marchant: ??
20:27:27 <CodeBlock> skvidal: didn't miss too much - talked about yubikey stuff, and now talking about nagios 3 upgrade
20:27:29 <smooge> sounds good
20:27:31 <skvidal> ah ha
20:27:32 <skvidal> okay
20:28:03 <smooge> marchant, then work with CodeBlock to see that those tests can be done and that we get notified.
20:28:15 <marchant> understood
20:28:18 <CodeBlock> skvidal: can confirm that the xmpp alerts work
20:28:29 <skvidal> yes
20:28:29 <smooge> then we just need to build a new noc01 on virthost02 and we be happy
20:28:30 <skvidal> yes I can
20:28:32 * skvidal glares :)
20:28:36 <CodeBlock> by the day that he woke up and thought that the entire world was ... yeah
20:28:50 * abadger1999 here now and reads back
20:29:32 <CodeBlock> skvidal: Oh come on - you love those days - waking up grabbing a cup of coffee, and crapping yourself when you see .. what ~2000 alerts? ;D
20:29:42 <skvidal> CodeBlock: that aren't TRUE!
20:29:53 <CodeBlock> :P
20:30:25 <CodeBlock> Alright - next topic?
20:30:43 <CodeBlock> #topic Old sysadmin member removal
20:31:21 <CodeBlock> So smooge sent out a list of proposed people to remove, who haven't touched their access within...a certain time limit, that I forget (60 days?)
20:31:36 <smooge> 60 days.
20:31:45 <smooge> I need to remove the people from sysadmin-cvs
20:32:10 <smooge> but we should be ready to go
20:32:27 <CodeBlock> When are we looking at doing that - post freeze I'm assuming.
20:32:46 <smooge> post freeze
20:33:08 <ricky> Did you get my note that tibbs|h and some other cvsadmins should stay?
20:33:10 <smooge> I will update after freeze to make sure I didn't miss something, redo and figure out how to do a mass mailing
20:33:10 <CodeBlock> (is there a gap between pre release freeze and real release freeze?)
20:33:38 * CodeBlock finds the SOP about freezes
20:33:46 <smooge> ricky, yes. I tried to say above I am going to not remove them
20:33:51 <smooge> but failed
20:34:00 <smooge> there will be several freezes
20:34:09 <ricky> Cool, thanks
20:34:35 <smooge> 1) freeze alpha (slushy) , 2) freeze beta (sort of solid), 3) freeze release (-40C)
20:34:51 <CodeBlock> smooge: how much time in between each?
20:35:33 * nirik arrives fashionably late.
20:35:54 <smooge> usually about 2-3 weeks. So March 22 we will start beta freeze. April 26th? we will start final freeze
20:36:22 <CodeBlock> ok
20:36:33 <CodeBlock> #topic sysadmin training?
20:36:52 <smooge> Ok I failed at this
20:36:52 <abadger1999> CodeBlock: The freezes start two weeks before each release (alpha, beta, final)
20:37:06 <CodeBlock> abadger1999: ok
20:37:07 <abadger1999> CodeBlock: The spacing in between just depends on when the releases are.
20:37:11 * nirik filed the tickets for the alpha release, BTW.
20:37:12 <dgilmore> 2 weeks before scheduled release
20:37:19 <smooge> nirik thanks
20:37:24 <abadger1999> nirik: Awesome!
20:37:25 <dgilmore> so if a release gets delayed the freeze is extended
20:37:35 <abadger1999> <nod>
20:38:12 * dgilmore is pushing on time very hard
20:38:13 <smooge> our goal this year. release on the same day as Ubuntu
20:38:24 <CodeBlock> haha
20:38:38 <smooge> with Mageia, us and Ubuntu we will bring down the InterTubes
20:39:46 <smooge> anyway. I am hoping to get the writeup of what abadger1999 and skvidal have talked about for moving people from fi-newbie->fi-apprentice->fi-craftsman->fi-master->fi-pastmaster->fi-hiddenllamamaster
20:39:52 <smooge> or some such thing
20:39:56 <skvidal> ah
20:39:56 <skvidal> okay
20:40:01 <skvidal> so here's my whole nefarious plan
20:40:11 <skvidal> I have an fi-apprentice group made
20:40:25 <skvidal> I have not added it to the $fas_group yet b/c of the freeze
20:40:40 <skvidal> and then add the acl to the puppet repo
20:40:51 <skvidal> so those folks can clone
20:40:54 <skvidal> but not commit to the repo
20:41:44 <skvidal> that's it
20:41:45 <abadger1999> <nod>
20:41:50 <skvidal> I was just waiting for the freeze
20:41:55 <CodeBlock> ok
20:41:59 <skvidal> but I'm happy to do it now
20:42:02 <skvidal> if y'all are cool w/it
20:42:21 <CodeBlock> skvidal: is that all they get is puppet01, /git/puppet clone access?
20:42:22 <abadger1999> I don't know how the acls are setup -- they won't get access to the private repo, correct?
20:42:38 <goozbach> ok so very late to the party am I
20:42:55 <skvidal> CodeBlock: ssh access to hosts
20:42:57 <skvidal> abadger1999: no
20:43:01 <skvidal> abadger1999: sysadmin-main only
20:43:05 <abadger1999> Excellent
20:43:16 <CodeBlock> skvidal: which hosts?
20:43:21 <averi> skvidal, ssh access to *all* hosts?
20:43:29 <skvidal> not _quite_ all
20:43:41 <skvidal> I think ricky had a reasonable point
20:43:41 <CodeBlock> basically -noc + ro puppet?
20:43:53 <skvidal> and maybe not giving access to the xen/virthost boxes
20:43:58 <skvidal> but the rest is okay
20:44:13 <skvidal> anyone think that sounds bad?
20:44:15 <CodeBlock> skvidal: so basically.. -noc + ro puppet. :P
20:44:26 <skvidal> CodeBlock: fine, be that way! :)
20:44:39 <CodeBlock> ;)
20:44:50 <averi> skvidal, I would not give sudo right away to -noc and puppet01
20:45:01 <dgilmore> skvidal: id say no virt hosts, no builders but otherwise ok
20:45:02 <averi> skvidal, if you meant that with 'access'
20:45:24 <smooge> actually that sounds pretty much sysadmin-noc
20:45:30 <skvidal> dgilmore: a fair point - I agree about keeping people out of releng
20:45:33 <CodeBlock> dgilmore, skvidal: fasXX, dbXX?
20:45:37 <CodeBlock> signXX?
20:45:41 <CodeBlock> other high security boxes?
20:45:42 <skvidal> CodeBlock: sign == releng
20:45:44 <skvidal> cannot happen
20:45:54 <skvidal> smooge: here's my problem with sysadmin-noc
20:46:02 <smooge> no I am slow typing
20:46:05 <smooge> I see the diff
20:46:07 <skvidal> 1. that gives them access to modify the noc systems and that's just an issue
20:46:22 <skvidal> 2. I hate the idea that the first way in is to work on nagios - that just seems odd to me
20:47:01 <skvidal> that's really all
20:47:06 <smooge> we are looking at a subset of systems we feel are ok for starters: publictest, people, hosted?, collab?, smtp-mm?
20:47:17 <smooge> bastion and puppet
20:47:19 <abadger1999> CodeBlock: db02 (whichever db server has fas atm), If we're not keeping people off the builders/kojihub then I don't think db03 is a problem, db01 is probably not an issue if we give people access to the app servers.
20:47:26 <skvidal> sounds right to me
20:47:56 <smooge> we will be keeping people off of the builders/releng/fas/db/sign
20:48:16 <abadger1999> okay, then db03 can be included... I think db01 would still be okay.
20:48:21 <skvidal> okay
20:48:28 <abadger1999> but equally, no need for me to bikeshed :-)
20:48:36 <CodeBlock> smooge: which of those listed will they have sudo on? pt? what else
20:48:52 <smooge> pt would be it
20:48:54 <CodeBlock> ok
20:49:07 <skvidal> sudo on?
20:49:09 <smooge> if that.
20:49:09 <abadger1999> s/included/included in the keeping people off list/
20:49:11 <skvidal> why would they have sudo on them?
20:49:12 <skvidal> woah
20:49:17 <skvidal> the whole point of this is READ ONLY
20:49:21 <skvidal> sudo != READ ONLY
20:49:25 <abadger1999> skvidal: +1
20:49:32 <CodeBlock> ok, ok
20:50:27 <smooge> a couple of things for people to know. we have had a pretty small set of people on puppet. this has meant people have had poor permissions on various directories
20:50:34 * dgilmore is with skvidal, no sudo for you
20:51:03 * phuzion is here for the last couple of minutes
20:51:11 <skvidal> smooge: 'poor permissions'? - what does that mean?
20:51:18 <skvidal> smooge: we have some unprotected things?
20:51:21 * ricky has noticed over the months world-readable private clones every once in a while
20:51:22 <smooge> o+r on private
20:51:27 <ricky> I check and chmod/notify people every once in a while
20:51:41 <ricky> I also made sure that precautions for private repo are in all SOPs that tell you to clone it.
20:51:52 <skvidal> can we make the o+r a git hook?
20:52:06 <ricky> It wouldn't really work as a hook - it needs to be on clone
20:52:21 <ricky> I looked at whether git did any umask stuff a while back and found nothing, but it could have changed now.
20:52:23 <smooge> and that is usually via the umask a person has
20:53:37 <skvidal> okay
20:53:45 <skvidal> then a couple of options
20:54:07 <skvidal> 1. no access to puppet - just cloneable access to the public git tree
20:54:12 <smooge> We have been really good about sharing with go+r but that may not be a good idea with a larger tree
20:54:17 <CodeBlock_> apparently my weechat session felt like dying.
20:54:20 <skvidal> 2. we cron job the private repo with a mallet
20:54:54 <gholms> Apparently cron job is now a verb.
20:54:55 <smooge> well its not the /srv/git/private as much as ~smooge/private
20:55:06 <skvidal> smooge: like I said - with a mallet
20:55:20 <smooge> now my private hurts
20:55:30 <CodeBlock_> smooge: that sounds like a personal problem. :(
20:55:43 <smooge> ok anyway...
20:56:00 <skvidal> our time is coming to a close
20:56:05 <smooge> we also need to go through puppet and make sure its clean
20:56:07 <CodeBlock_> thanks for dying, weechat. You're awesome. I woke up this morning thinking "You know. I hope weechat dies today."
20:56:21 <averi> CodeBlock_, can we end up with my two items please?
20:56:23 <smooge> we have not published it before because it hasn't been 'checked'
20:56:36 <smooge> we are down to 4 minutes
20:57:00 <CodeBlock_> averi: have smooge #topic seeing as I'm currently fighting with weechat and can't anymore.
20:57:02 <smooge> so ricky I would like you and some others to go over /puppet/ and see what we can do
20:57:19 <smooge> averi, what are your topics and can they be dealt with in 2 minutes?
20:57:35 * skvidal is tired of this
20:57:37 <skvidal> before we go on
20:57:45 <skvidal> I'd like to suggest we stop having our meetings here
20:57:45 <averi> smooge, mostly what we decided to do with blogs and inactive lists on hosted
20:57:48 <skvidal> this is stupid
20:57:53 <skvidal> and I'm tired of being kicked out of this channel
20:58:00 <skvidal> and having to mess up our whole conversation/logging flow
20:58:06 <smooge> I agree
20:58:08 <CodeBlock> skvidal: +1
20:58:12 <abadger1999> <nod>
20:58:16 <gholms> I agree and I'm in the following meeting!
20:58:16 <skvidal> I'm going to look for another room
20:58:18 <skvidal> that's available
20:58:26 <abadger1999> We need to allocate 1.5-2 hours.
20:58:29 <skvidal> yes
20:58:29 <skvidal> we do
20:58:38 <CodeBlock> yeah
20:58:52 <gholms> I could ask about moving the cloud meeting to an earlier time if you want.
20:58:58 <abadger1999> we're just not as efficient as the late mmcgrath :-)
20:59:04 <ricky> 2 hour meetings - fun.
20:59:06 <smooge> skvidal, #fedora-csi is open
20:59:09 <ricky> I'm fine with #fedora-admin if you all are
20:59:21 <ricky> Not like much goes on in there during meetings anyway.
20:59:21 * sijis is ok with that
20:59:26 <wakko666> ricky: +1
20:59:31 <jds2001> ricky: +1
20:59:34 <skvidal> except when we get someone with random drivebys
20:59:40 <skvidal> and then we can't stay on task
20:59:40 <gholms> Will there still be logs?
20:59:43 <skvidal> or log the channel effectively
20:59:53 <abadger1999> gholms: There will be zodbots there too.
21:00:00 <smooge> I am moving to #fedora-csi
21:00:09 <smooge> #endmeeting