infrastructure
LOGS
17:59:23 <nirik> #startmeeting Infrastructure (2016-12-15)
17:59:23 <zodbot> Meeting started Thu Dec 15 17:59:23 2016 UTC.  The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:59:23 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
17:59:23 <zodbot> The meeting name has been set to 'infrastructure_(2016-12-15)'
17:59:23 <nirik> #meetingname infrastructure
17:59:23 <zodbot> The meeting name has been set to 'infrastructure'
17:59:23 <nirik> #topic aloha
17:59:23 <nirik> #chair smooge relrod nirik abadger1999 lmacken dgilmore threebean pingou puiterwijk pbrobinson
17:59:23 <zodbot> Current chairs: abadger1999 dgilmore lmacken nirik pbrobinson pingou puiterwijk relrod smooge threebean
17:59:23 <nirik> #topic New folks introductions
17:59:49 <trishnag> .hello trishnag
17:59:50 <zodbot> trishnag: trishnag 'Trishna Guha' <trishnaguha17@gmail.com>
17:59:57 <jcline> .hello jcline
17:59:58 <zodbot> jcline: jcline 'Jeremy Cline' <jeremy@jcline.org>
18:00:20 <puiterwijk> .hello puiterwijk
18:00:21 <zodbot> puiterwijk: puiterwijk 'Patrick "マルタインアンドレアス" Uiterwijk' <puiterwijk@redhat.com>
18:00:24 <smooge> .hello smooge
18:00:25 <zodbot> smooge: smooge 'Stephen J Smoogen' <smooge@gmail.com>
18:00:26 <marc84> hi everyone
18:00:28 <nirik> ggrr
18:00:30 <nirik> hexchat went all wonkey on me
18:00:32 <nirik> sorry for the early meeting fire. ;)
18:00:52 * pingou here
18:01:52 <triguy> Hello everyone
18:01:54 <nirik> so, any new folks like to give a short one line introduction?
18:03:14 <triguy> New here. Long time Linux user 15+ years. Been in System Administration/Engineering for a looong time. Looking to help out where I can and give a little back.
18:03:18 <nirik> I saw 3 new intros on the list. :)
18:03:39 <nirik> welcome triguy. Are you interested more in sysadmin stuff then? or application development? or both?
18:04:37 <triguy> Admin stuff would be easiest for me to break ground on. But I've done some work in php and perl so whatever you have for me I'm game
18:05:22 <nirik> cool. Most of our application development stuff is in python, but we do have an assortment. ;)
18:05:38 <nirik> anyhow, see me after the meeting in #fedora-admin and I can add you to our apprentice group and get you started.
18:06:29 <triguy> Sounds great
18:06:31 <nirik> any other new folks? Harrison Brock or Murali Krish around? I'll try and go back and answer everyone's intro emails when I get a chance.
18:07:31 <nirik> alright, let's move on to info / status then
18:07:45 <nirik> #topic announcements and information
18:07:46 <nirik> #info PHX2 visit completed - patrick / kevin
18:07:46 <nirik> #info qa09 rebuilt - kevin
18:07:46 <nirik> #info ipv6 added to osuosl systems - smooge/kevin
18:07:46 <nirik> #info lots of discussion about cert pinning plans - kevin
18:07:47 <nirik> #info releng flag day completed - everyone
18:07:48 <nirik> #info Feedback welcome on the future of the-new-hotness: https://github.com/fedora-infra/the-new-hotness/issues/145 - jeremy
18:07:57 <nirik> anything anyone wants to expand on / discuss there? or add to?
18:08:57 <nirik> ok, moving on then...
18:08:59 <nirik> #topic Recap of work done in PHX2
18:09:14 <nirik> Last week puiterwijk and I were out at our main datacenter (phx2) near phoenix.
18:10:01 <nirik> we found a number of things alerting we didn't know about, and got almost all of them fixed before we left.
18:10:29 <nirik> we got a new rack emptied out and ready for us for next year.
18:10:33 <pingou> things we could have detected before?
18:11:03 <nirik> pingou: yeah. We were thinking we might get all our mgmt stuff logging to our log01 syslog...
18:11:10 <nirik> then it should be more visible.
18:11:22 <nirik> it was things like failed power supplies, fans, etc.
18:11:27 <pingou> hm :/
18:11:42 <pingou> this might actually fit into my thoughts on hardware.py
18:11:57 <nirik> well, I am not sure this info is available from the host...
18:12:02 <smooge> it isn't
18:12:08 <pingou> ah :(
18:12:13 <smooge> unless you install no-source binaries
18:12:37 <smooge> or it used to be that was what you needed
18:12:46 <nirik> we might also be able to do something like SNMP, but that's worse than syslog IMHO
18:13:13 <smooge> SNMPv2 zone rootme
18:14:10 <nirik> we also: moved the s390 virthost and koji and db over to the s390 vlan, tested all the PDU (power units) and put in a ticket to reset and re-ip some of them, upgraded the firmware in our opengear serial devices so we could actually connect to it with a modern browser
18:14:48 <nirik> anyhow, those were the highlights...
18:15:01 <nirik> any questions on that?
18:15:26 * pingou has none
18:15:59 <nirik> #topic flag day retrospective
18:16:16 <nirik> so, right after we were out at the datacenter all week... we had a releng "flag day"
18:16:29 <nirik> https://fedoraproject.org/wiki/ReleaseEngineering/FlagDay2016
18:16:54 <nirik> I think things are all settling down from it now, but I had a few suggestions if we ever do this again...
18:17:00 <smooge> next year I would like us to have a flag day every month.. so we can have FlagDay201701
18:17:21 <nirik> We really really really should try and get needed packages all out and stable. People got really confused where some packages were, etc...
18:17:49 <nirik> rhel6/rhel7 is still not fully sorted out. We are missing some updates entirely for rhel6 and rhel7 still.
18:18:16 <pingou> I remember Patrick pushing packages earlier
18:18:22 <pingou> then I heard they were pulled back
18:18:33 <nirik> also we should try and be better about scheduling. We picked this day a while back, but then had the datacenter work and puiterwijk was flying and such
18:18:39 <pingou> I didn't quite understand what happened or why, so +1 to pkgs in stable
18:19:10 <smooge> I also feel we should be able to move these days if needed
18:19:14 <pingou> also the move got started before you folks were actually awake, so in the morning (EU time) there were already some questions
18:19:21 <nirik> in the f25 case I think koji and fedorapackager were in testing, then dgilmore unpushed them to push them to stable, but that made them disappear from testing for a while
18:19:38 <puiterwijk> pingou: well, the move started at 5PM my time, so I was awake. But by morning CET I was indeed asleep
18:19:47 <nirik> pingou: yeah, we made the changes at 00:00UTC (sunday night in the usa)
18:19:49 <pingou> I was expecting this more in line of the release
18:20:15 <nirik> #info should make sure as many packages as possible are already stable
18:20:26 <nirik> #info should schedule date better
18:20:38 <nirik> #info should say exactly what time we plan to make changes to make that clear
18:20:49 <pingou> Patrick++ for the quick SOP on debugging
18:21:02 <pingou> I've used it, not sure if others did, but at least that worked :)
18:21:22 <nirik> yeah. I also wrote some wiki pages, but it's not clear many people read them. ;(
18:21:28 <puiterwijk> I also wrote some tools to ease debugging, but I think I didn't announce that enough :)
18:22:13 <nirik> ok. That's all I had on this... anyone have anything more?
18:22:56 <nirik> ok, let's move on then...
18:22:58 <puiterwijk> I'll be glad when I finally get to focus on other things again :)
18:23:08 <nirik> #topic Holidays
18:23:08 <nirik> #info Red Hat has shutdown days of 2016-12-23 -> 2017-01-02
18:23:08 <nirik> #info Red Hat employees are on break and may only be online in short periods
18:23:08 <nirik> #info Ebeneezer Smooge says Bah Humbug to you all and all a good night.
18:23:59 <nirik> so, FYI, I am on time off next week, the week after and the monday after (back jan 3rd). That said, I will almost surely be around... just don't expect me. :)
18:24:28 <nirik> if anyone is planning on working on things, do remember to suppress nagios alerts so they don't call in someone who would rather not be around.
18:24:43 <smooge> I will be around except for Dec 25->27. I will be Krampus
18:25:00 <puiterwijk> I will most likely be around some times, but at least have my pager near me.
18:25:35 <nirik> so, short rule: if you want someone, feel free to try on IRC, but don't be surprised if they aren't around and move back to email. ;)
18:26:04 <pingou> I'm only taking off: 27 -> 2
18:26:06 <nirik> and I hope everyone has a lovely holiday season. :)
18:26:10 <puiterwijk> And if critical, you can page people (or ask others to page them for you if you don't have the address)  (nagios does miss stuff sometimes)
18:26:13 <pingou> the very same to you :)
18:26:49 <nirik> oh, and everyone should remember to put in stuff on the fedocal vacation calendar... https://apps.fedoraproject.org/calendar/vacation/
18:27:10 <nirik> ok, anything else on this?
18:27:55 <nirik> #topic Apprentice Open office hours
18:28:09 <nirik> any apprentices have questions, comments? or looking for things to work on ?
18:29:19 <nirik> ok then. :)
18:29:19 <triguy> None. For me. Figure I'll just poke around once I have access and learn the environment.
18:29:44 <nirik> triguy: yep. Also always feel free to ask questions in #fedora-admin, there's usually someone around who can help out.
18:30:10 <triguy> will do. Thanks
18:30:15 <nirik> I was thinking about asking if we should discuss/change meeting format again... but I think I might do that on the list instead.
18:30:29 <nirik> puiterwijk: did you want to do a short thing on kerberos? or wait?
18:30:34 <puiterwijk> nirik: sure.
18:30:40 <puiterwijk> Just added it to gobby :-)
18:30:44 <nirik> also, I will not be around next week, so someone else will need to run the meeting if we have one. ;)
18:30:52 <nirik> #topic Learn about: Kerberos at Fedora - puiterwijk
18:30:55 <nirik> take it away puiterwijk
18:31:10 <puiterwijk> Okay, so I figured I'd give another short overview of how we have set up kerberos.
18:31:19 <puiterwijk> I have already explained this a couple of times, but this time on the books
18:31:44 <puiterwijk> So, as everyone is probably aware, we are using the Fedora Account System (FAS) at Fedora for managing our accounts
18:32:36 <puiterwijk> For Kerberos however, we have set up a new IPA instance (freeipa.org). As soon as a user logs in to anything that takes their FAS password for the first time after that was enabled, their account gets created in the IPA instance.
18:33:12 <puiterwijk> The reason it waits until the first login is that we need to provide the user's password to IPA so it can rehash it to its own format, and we don't store passwords in a reversible way.
18:33:50 <pingou> how do we pass it on to IPA?
18:33:52 <nirik> puiterwijk: the link between fas-> ipa is what? https post? or fas calls a script?
18:33:52 <smooge> except for the text file: yahoo.txt
18:33:53 <puiterwijk> Admins can actually see this happening: in FAS, there is a new field in the user's profile, "IPA Sync status", that will indicate whether or not the account was ever logged in and, if so, whether their IPA account creation was successful
18:34:18 <nirik> smooge: better xz it now, yahoo just announced another billion or so. :(
18:34:24 <puiterwijk> nirik, pingou: fas uses the IPA API (funny word!) over HTTPS to create the account
18:34:42 <pingou> ok
18:34:46 <nirik> great
18:35:18 <puiterwijk> This does use a pinned certificate that's only used at the IPA instances, so the chance of this getting MITM'd is quite small unless the attacker is already in the IPA instance.
18:35:57 <puiterwijk> Also, when a user updates their password, the change gets synced over the same API. If this change fails, FAS will revert the password change so that they will always remain in sync
18:36:12 <pingou> cool
18:36:38 <puiterwijk> After this is all synced, the user can immediately "kinit $username@FEDORAPROJECT.ORG" with the same password. Tickets are valid for 24 hours, and can be refreshed for up to 7 days
18:36:52 <puiterwijk> (where "refreshed" means to renew without typing the password again)
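The lifetimes quoted above (tickets valid for 24 hours, renewable without a password for up to 7 days) can be worked through as plain date arithmetic. This is only a minimal sketch: the two durations come from the log, while the helper names `issue` and `renew` are hypothetical and say nothing about how the Fedora KDC is actually configured. It assumes the usual Kerberos behaviour that a ticket must be renewed before it expires and that a renewal cannot extend past the renewable limit.

```python
from datetime import datetime, timedelta, timezone

TICKET_LIFETIME = timedelta(hours=24)    # from the log: valid for 24 hours
RENEWABLE_LIFETIME = timedelta(days=7)   # from the log: renewable for up to 7 days


def issue(now):
    """Return (expires_at, renew_until) for a ticket obtained with a password at `now`."""
    return now + TICKET_LIFETIME, now + RENEWABLE_LIFETIME


def renew(expires_at, renew_until, now):
    """Return the new expiry after a password-less renewal.

    Returns None when the ticket has already expired or the renewable
    window has closed, meaning a fresh password-based kinit is needed.
    """
    if now >= expires_at or now >= renew_until:
        return None
    # A renewal grants a new lifetime, capped at the renewable limit.
    return min(now + TICKET_LIFETIME, renew_until)


if __name__ == "__main__":
    start = datetime(2016, 12, 15, 18, 36, tzinfo=timezone.utc)
    expires_at, renew_until = issue(start)
    print("expires:", expires_at, "renewable until:", renew_until)
    print("renewed at +20h, new expiry:",
          renew(expires_at, renew_until, start + timedelta(hours=20)))
```

Renewing shortly before each expiry keeps a session alive for the full week; once the renewable limit passes, `renew` returns None and the password has to be typed again.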
18:37:28 <pingou> does that mean FAS doesn't send the email about password change until IPA has acked the change?
18:37:28 <puiterwijk> Then we have certain services that will accept these kerberos tickets. These are currently: Koji, lookaside, and... Ipsilon
18:37:49 <puiterwijk> pingou: correct. The entire password change is in a transaction, and if the IPA part fails it'll get reverted and an exception thrown
18:38:02 <pingou> ok, cool
18:38:10 <puiterwijk> So the user will see a "We failed to update your password". But checking the logs, other than my few tests, this has not happened yet
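The behaviour described above (a password change that is reverted when the IPA sync fails, so both sides stay in sync) can be sketched roughly as follows. This is not the FAS code: `AccountStore`, `IpaSyncError`, `change_password` and the `push_to_ipa` callback are hypothetical names, and the in-memory dict merely stands in for a real database transaction.

```python
class IpaSyncError(Exception):
    """Raised by the (hypothetical) IPA client when a sync call fails."""


class AccountStore:
    """In-memory stand-in for the account database (assumption, not FAS)."""

    def __init__(self):
        self.password_hashes = {}

    def change_password(self, username, new_hash, new_password, push_to_ipa):
        """Store the new hash locally, then sync the password to IPA.

        `push_to_ipa(username, new_password)` receives the plaintext
        because, per the log, IPA rehashes it into its own format.
        If the sync raises IpaSyncError, the local change is rolled
        back and the error is re-raised for the caller to report.
        """
        previous = self.password_hashes.get(username)
        self.password_hashes[username] = new_hash
        try:
            push_to_ipa(username, new_password)
        except IpaSyncError:
            # Roll back so the local store and IPA never diverge.
            if previous is None:
                del self.password_hashes[username]
            else:
                self.password_hashes[username] = previous
            raise


if __name__ == "__main__":
    def ipa_down(username, password):
        raise IpaSyncError("IPA unreachable")

    store = AccountStore()
    store.password_hashes["alice"] = "old-hash"
    try:
        store.change_password("alice", "new-hash", "s3cret", ipa_down)
    except IpaSyncError:
        print("We failed to update your password")
    print(store.password_hashes["alice"])
```

Re-raising the exception after the rollback mirrors what the log describes: the user sees a failure message, and no notification should go out for a change that did not take effect.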
18:38:38 <nirik> sometimes the user will fail the initial sync... do we know the root cause of that?
18:38:58 <puiterwijk> nirik: that's down to some validation issues in the IPA API. I have not yet gotten to the root cause of that
18:39:07 <nirik> ok.
18:39:18 <puiterwijk> Basically it thinks that for some reason the username does not consist of only letters, even though it does because FAS enforces that
18:40:15 <puiterwijk> So, in production our IPA setup is a set of two nodes that replicate amongst each other. They are both masters, and active at the same time
18:40:55 <puiterwijk> Then we have two different haproxy entries: one for plain kerberos, and one for the HTTPS-based proxy, so users will hit either of the IPA nodes, and even if one of them is temporarily offline the other one will continue.
18:42:12 <puiterwijk> We currently have a fully autodiscoverable setup, but at some point I may tear that down so that everyone will need to have a manual configuration. The reason for this is that I really want everyone that can (RHEL7+ and Fedora24+) to use the HTTPS proxy, since that's slightly more secure and privacy-friendly
18:42:58 <puiterwijk> So, I think that that's some of the basics on our setup... Any questions so far?
18:43:48 <nirik> note: there is configuration in fedora-packager
18:44:01 <nirik> the fedora-packager package that is.
18:44:06 <puiterwijk> Ah, right.
18:44:16 <puiterwijk> That configuration will use the https:// proxy, and configure both prod and staging
18:45:53 <puiterwijk> Also note that at some point in the near future I intend to move all 2fa for Infra over to IPA, but I'll email the list when it comes to that
18:46:40 <pingou> any idea how much work it will be to port all this to FAS3?
18:47:32 <puiterwijk> That depends on how long it takes until we get FAS3. I am planning to start integrating group sync soon, so if FAS3 is nowhere near integrating IPA I might just do it in FAS2..
18:47:45 <mk779> Introduction: Hello, I'm Murali Krish. I'm interested in DevOps and I want to contribute to Fedora Infrastructure. I have knowledge of Java/J2EE, SQL, and the basics of python. Please help me learn the skills and tools that would help me become a successful contributor. Thanks
18:48:12 <pingou> puiterwijk: well, we do want to move forward w/ FAS3, so I'd rather we push forward with FAS3 than keep on adding features to FAS2
18:48:28 <nirik> hey mk779. Welcome. :) See me after the meeting in #fedora-admin and I can get you added to our apprentice group to look around...
18:48:56 <puiterwijk> pingou: right, let's discuss this out of meeting, but as said it depends on the current estimates
18:49:14 <pingou> puiterwijk: which were sent to the list :)
18:49:41 <pingou> and depends on how many pings we can send to Smoother1rOgZ :)
18:49:51 <nirik> pretty aggressive plans as I recall, but good to get it done
18:50:02 <puiterwijk> pingou: right. But I was told that Smoother1rOgZ  would be doing the forward-porting of this code, so that'll be fine :-)
18:50:08 <mk779> Hey @nirik thanks for the response. That would be great.
18:51:27 <puiterwijk> Anyway, any other questions?
18:51:46 <nirik> puiterwijk: so when we move 2fa, everyone will need to re-enroll? or ? and that will mean kinit with 2fa? or would still be sudo only?
18:52:57 <puiterwijk> nirik: I am on the fence for re-enrolling. I can migrate the TOTP-based 2fa tokens, but I will be unable to migrate yubikey. Also, we could have a step where IPA will accept the current system, but given the very limited set of people that use 2fa, I think it's easier to just cut over and have them re-enroll.
18:53:34 <nirik> I think re-enroll is ok as long as we have a clear time for that and communicate it...
18:54:03 <puiterwijk> Currently, after you set up 2fa, you can kinit with or without 2fa. And we can require a 2fa'd kinit for login to certain services (e.g. ssh for bastion/batcave)
18:54:50 <puiterwijk> nirik: right. So with re-enrolling I would enable enrolling for current system AND IPA at the same time for a week, and then cut the prod systems over at a later time that will be communicated
18:55:07 <nirik> so at this same time we would drop fas_client and go to kerberos for ssh and sudo and everything?
18:55:26 <puiterwijk> No, that still needs FAS to sync its groups to IPA.
18:55:41 <puiterwijk> So either I add that to FAS2, or we wait for FAS3.
18:56:55 <nirik> ok, well, we can discuss out of meeting. Might be nice to make a kerberos roadmap wiki page or something...
18:57:07 <puiterwijk> Sure
18:57:15 <nirik> not having to run fas_client will be a big win for us...
18:57:41 <nirik> anyhow, thanks puiterwijk!
18:57:45 <nirik> #topic Open Floor
18:57:50 <nirik> anyone have anything for open floor?
18:58:01 <smooge> i am ready to call it a year
18:58:15 <pingou> smooge: only 2 weeks to go :)
18:58:17 <nirik> yeah, 2016 wasn't too great... hope 2017 is better
18:59:11 <smooge> or we could use ldap for groups
18:59:18 <nirik> ok, if nothing more, I will close out in a random amount of time.
18:59:25 <puiterwijk> smooge: that is on the far roadmap for me
18:59:36 <smooge> .fire smooge
18:59:36 <zodbot> adamw fires smooge
19:00:00 <nirik> .chair smooge
19:00:00 <zodbot> smooge is seated in a chair with a nice view of a placid lake, unsuspecting that another chair is about to be slammed into them.
19:00:00 <smooge> say good night gracie
19:00:56 <nirik> Thanks everyone
19:00:59 <nirik> #endmeeting