infrastructure
18:00:24 <smooge> #startmeeting Infrastructure (2017-07-27)
18:00:24 <zodbot> Meeting started Thu Jul 27 18:00:24 2017 UTC.  The chair is smooge. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:24 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
18:00:24 <zodbot> The meeting name has been set to 'infrastructure_(2017-07-27)'
18:00:24 <smooge> #meetingname infrastructure
18:00:24 <zodbot> The meeting name has been set to 'infrastructure'
18:00:24 <smooge> #topic aloha
18:00:24 <smooge> #chair smooge relrod nirik abadger1999 dgilmore threebean pingou puiterwijk pbrobinson
18:00:24 <zodbot> Current chairs: abadger1999 dgilmore nirik pbrobinson pingou puiterwijk relrod smooge threebean
18:00:27 <clime> hey
18:00:27 <smooge> Hello all
18:00:35 <marc84> hi everyone
18:01:06 <relrod> here
18:01:12 * doteast here
18:01:20 * pingou around but not really here
18:02:05 <nirik> morning
18:02:39 <smooge> #topic New folks introductions
18:02:47 <smooge> Hi do we have any new folks here?
18:02:51 * cverna waves
18:04:04 <smooge> ok looks like not
18:04:15 <smooge> #topic announcements and information
18:04:15 <smooge> #info PHX2 Colo Trip coming up, Aug 14-18
18:04:15 <smooge> #info Major outage planned for Aug14->18
18:04:15 <smooge> #info FLOCK at Cape Code  Aug29->Sep01
18:04:16 <smooge> #info Fedora F27 Rebuild
18:04:28 <smooge> Any other announcements from people?
18:04:51 <clime> it's Cape Cod.
18:04:51 * nirik has nothing off hand.
18:05:15 <smooge> clime, its in gobby you can fix it :)
18:05:27 <clime> ok
18:05:28 <nirik> ah, but while we are there it will cape code. ;) so much coding to do
18:05:49 <nirik> (yes, this is a joke, thats a typo)
18:06:20 <clime> fixed
18:06:33 <clime> :)
18:06:45 <smooge> thanks :). I was leaving it that way until someone did so because I found my mistype funny
18:07:04 <bowlofeggs> .hello bowlofeggs
18:07:04 <smooge> nirik, how is the rebuild going?
18:07:04 <zodbot> bowlofeggs: bowlofeggs 'Randy Barlow' <randy@electronsweatshop.com>
18:07:11 <bowlofeggs> .hello jcline
18:07:12 <zodbot> bowlofeggs: jcline 'Jeremy Cline' <jeremy@jcline.org>
18:07:18 <smooge> I haven't seen much on it this round
18:07:57 <nirik> it's in the 'r's committing
18:08:29 <nirik> probably finish that later today... then just needs to finish up all the builds.
18:08:35 <smooge> I saw a couple of weird errors on lists like arm32 running out of memory for no reason and some ppc items.. do we need much of a re-rebuild?
18:08:37 <nirik> probably take another day or so
18:08:50 <smooge> ok cool
18:09:01 <nirik> the ppc one we might rebuild failed things again for
18:09:06 <nirik> (once it's fixed)
18:09:13 <smooge> there was a pagure pkgs thing also this week. But I don't know the details on it
18:09:32 <pingou> it's waiting on mass-rebuild to be done
18:10:01 <pingou> but I just ran into what I think is a bug in pkgdb, which will need to be fixed for the migration to work properly
18:10:07 * pingou will investigate
18:10:28 <nirik> https://bugzilla.redhat.com/show_bug.cgi?id=1475636 is the ppc binutils bug
18:11:38 <smooge> ok thanks
18:11:54 <smooge> ok next up
18:11:57 <smooge> #topic (2017-07-27) Service Level Expectations (SLE)
18:11:57 <smooge> #info What are SLE's?
18:11:57 <smooge> #info Why do we need them?
18:11:57 <smooge> #info Who sets them?
18:11:57 <smooge> #info How are they followed?
18:11:58 <smooge> #info Where do they affect things?
18:12:00 <smooge> #info When do we put them in place?
18:12:02 <smooge> #info https://pagure.io/fedora-infrastructure/issue/6140
18:12:04 <smooge> #info https://fedoraproject.org/wiki/Infrastructure/ServiceLevelExpectations
18:12:06 <smooge> #info https://confluence.cornell.edu/display/itsmp/Service+Level+Expectations
18:12:18 <nirik> yeah, not too much discussion on list on this...
18:12:22 <smooge> Hi nirik I put in questions just to frame a possible discussion
18:12:25 <nirik> I guess I will try and work on it some more.
18:12:36 <nirik> I can answer some of those questions. :)
18:13:03 <nirik> SLE is a service level expectation... it's letting users/consumers of a service know what kind of service to expect.
18:13:55 <nirik> ie, if the service was down at 3am, would people wake to fix it? would it be fixed the next morning? if on a weekend or holiday would it be fixed the next business day?
18:14:24 <nirik> we can also use these with other projects we work with...
18:14:53 <nirik> ie, say we use centos ci and something is broken on our side of that, what expectations do they have for us to fix things when, etc.
18:15:14 <nirik> I set out some broad outline on the wiki page using domains...
18:15:25 <nirik> but there's more to fill in, which I can work on this coming week
18:15:41 <nirik> we will need/want to redo our status app to reflect this
18:16:03 <nirik> and possibly outage pages from apps, etc.
18:16:27 <nirik> any further thoughts on this?
18:17:50 <smooge> so we could key items to nagios alerts
18:18:03 <nirik> that also.
18:18:28 <nirik> have all continue to go to irc... but pages only for some things at some time periods
18:18:29 <smooge> with 24x7x52 vs 8x5x50 vs meh you got the bits didnt ya
18:19:21 <smooge> yeah agreed
18:19:41 <nirik> so there would definitely be some work to adjust to this, but in the end I hope it would be nicer for our users to know and us to not have to treat everything as urgent all the time
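Keying SLE tiers to nagios, as smooge suggests above, might look roughly like the following — an abridged sketch with made-up service names, notification periods, and contact groups, not the actual Fedora nagios configuration:

```text
# 24x7 tier: alert goes to IRC and pages the on-call group at any hour
define service {
    service_description   mirrorlist-http
    notification_period   24x7
    contact_groups        sysadmin-pager, sysadmin-irc
}

# best-effort tier: IRC notifications only, during business hours
define service {
    service_description   experimental-app-http
    notification_period   workhours
    contact_groups        sysadmin-irc
}
```

Real definitions would also carry `host_name`, `check_command`, and the rest of the required service directives; the point here is just that `notification_period` and `contact_groups` are the knobs that separate "everything to IRC" from "pages only for some things at some time periods".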
18:21:35 <nirik> we might also at the same time work on the consistent app naming setup...
18:21:36 <smooge> agreed there too
18:21:48 <smooge> now lets not go overboard
18:22:04 <nirik> https://pagure.io/fedora-infrastructure/issue/5644 is that ticket
18:22:42 <doteast> sorry, SLE no SLA right?
18:22:49 <smooge> correct
18:23:08 <smooge> an agreement requires two entities
18:23:15 <nirik> and money and signing things
18:23:21 <doteast> I see ;)
18:23:34 <nirik> There's no such relationship between us and our community...
18:23:45 * doteast kind like the E more than the A
18:24:14 <doteast> even outside of a community context
18:24:27 <nirik> if we fail to meet expectations, then that's something we would revisit when it happened... and depending on why, adjust the expectation or add more resources or something
18:24:44 <doteast> sweet
18:24:54 <smooge> Knowing some in our community.. we will always fail their expectations.
18:24:57 <smooge> and that is ok
18:25:19 <doteast> ah, you can't win them all, even in business settings I guess
18:25:33 <nirik> sure... but these are our expectations.
18:25:58 <nirik> if we say app X will be addressed in 4 hours, but we don't... we need to adjust that or make sure we notify people harder or whatever.
18:26:20 <smooge> ah that goes to #info Who sets them?
18:26:21 * doteast nods
18:26:24 <nirik> probably when apps are in their RFE process we can ask what expectations should be...
18:27:00 <mizdebsk> you mean RFR?
18:27:04 <nirik> "this is a fedoracommunity app we are just playing around with" -> no monitoring, whoever runs it responsible, no SLE at all.
18:27:11 <nirik> yeah, RFR, sorry. ;)
18:27:20 <doteast> request for resources
18:27:49 <nirik> "this is mirror lists for end users" -> monitored, will be acked in 15min, will be fixed asap, all hands on deck until it's working again.
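Nirik's two examples — a playground fedoracommunity app with no SLE at all, versus mirror lists acked within 15 minutes around the clock — suggest a small tier table behind the paging decision. A minimal sketch, where the tier names, service inventory, and covered hours are all hypothetical illustrations rather than the real Fedora setup:

```python
from datetime import datetime

# Hypothetical SLE tiers: which weekdays (0=Mon..6=Sun) and hours are covered.
# "24x7" pages at any time; "8x5" pages only Mon-Fri business hours;
# "none" never pages (alerts would still go to IRC).
SLE_TIERS = {
    "24x7": {"days": range(0, 7), "hours": range(0, 24)},
    "8x5": {"days": range(0, 5), "hours": range(9, 17)},
    "none": {"days": (), "hours": ()},
}

# Illustrative service inventory, not the actual one.
SERVICES = {
    "mirrorlist": "24x7",
    "copr-frontend": "8x5",
    "experimental-app": "none",
}

def should_page(service: str, when: datetime) -> bool:
    """Return True if an outage of `service` at `when` should page someone.

    Every alert still goes to IRC; this only gates pager escalation.
    """
    tier = SLE_TIERS[SERVICES[service]]
    return when.weekday() in tier["days"] and when.hour in tier["hours"]
```

For example, a mirrorlist outage at 3am on a Sunday would still page, while the same outage of the 8x5 service would wait for the next business day — which is exactly the distinction the SLE is meant to publish to users in advance.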
18:28:47 <clime> .thisdoesntwork would be a good domain for apps in RFR ;)
18:28:59 <bowlofeggs> can we get a 1 minute SLE on puiterwijk? ☺
18:29:04 <smooge> no
18:29:07 <bowlofeggs> haha
18:29:29 <smooge> he requires same reverse SLE
18:29:32 <bowlofeggs> jcline pointed out that first we would need monitoring on puiterwijk
18:29:46 <smooge> and then monitoring on us
18:29:54 <nirik> since he's not here we can assign him that. ;)
18:30:04 <nirik> who monitors the monitors?
18:30:05 <bowlofeggs> but who monitors the monitoring?
18:30:07 <bowlofeggs> hahaha
18:30:07 <smooge> so much more on this?
18:30:25 <smooge> or time to move onto clime's topic?
18:30:28 <bowlofeggs> i'm a +1 to defining clear SLEs
18:30:45 * nirik has nothing else on this unless theres questions.
18:30:57 <bowlofeggs> cool far fetched idea: the page that shows when an app is down could also state that app's SLE
18:31:01 <nirik> I'll try and add to it and get another round of review
18:31:09 <bowlofeggs> (i think the proxy does that?)
18:31:14 <nirik> bowlofeggs: yeah, we would want to adjust our status page for that
18:31:25 <nirik> and yeah, there's an outage html page for down apps
18:31:27 <bowlofeggs> i think that might be slightly different
18:31:34 <smooge> #topic possible future Fedora Infrastructure support for COPR - clime
18:31:37 <bowlofeggs> yeah i mean the page you see when you go to bodhi
18:31:37 <nirik> yeah, sorry, but both should be adjusted.
18:32:15 <clime> right, so we will have COPR solely from Fedora packages soon
18:32:21 <clime> like this week
18:32:25 <clime> I hope
18:32:43 <bowlofeggs> clime: what does it mean for them to be solely from fedora packages?
18:32:56 <bowlofeggs> like, the install of copr itself?
18:33:01 <bowlofeggs> or it hosts only fedora packages?
18:33:02 <clime> bowlofeggs: currently we deploy COPR from @copr/copr
18:33:05 <bowlofeggs> ah i see
18:33:07 <bowlofeggs> cool
18:33:15 <clime> which is good for hotfixes
18:33:26 <nirik> cool.
18:33:37 <bowlofeggs> we have an infra repo that we can use for hotfixes like that too
18:33:43 <clime> but one of the conditions for getting support of FI was to install everything from Fedora
18:33:52 <clime> oh that would be cool
18:34:05 <clime> I would really like to have an option to hotfix things quickly :)
18:34:22 <clime> but anyway I am happy that we are ready
18:34:44 <nirik> the reason we have the koji tag is that the koji builds are immutable once done, etc.
18:35:17 <clime> right, so I would like to ask what next progress should be
18:35:31 <clime> but I guess I will need to discuss it with puiterwijk
18:35:39 <clime> once he is here
18:35:43 <nirik> so, looking back at the RFR ticket... did we ever finish discussing/deciding what parts we want of copr to be in cloud and what parts outside?
18:35:58 <clime> actually I would like to move to OpenShift
18:36:08 <nirik> hopefully our cloud will be improved a lot soon...
18:36:19 <clime> not sure if it is related or not but It would be cool
18:36:33 <clime> so dev instance first
18:36:34 <nirik> clime: so, builds are done in build containers (like images are done in openshift)?
18:36:49 <clime> yup
18:36:57 <clime> nirik: that would be cool
18:37:17 <nirik> I think that could be interesting... but not sure how much work it might be. it changes a lot of what copr does...
18:37:22 <clime> well I guess there will be some work involved
18:38:23 <nirik> but that is just the builder part right? the frontend/backend/keyserver/distgit would all be pretty much the same?
18:38:38 <clime> nirik: yes, exactly
18:38:49 <clime> nirik: unless they would also need to run in pods
18:39:16 <nirik> well, we currently have no good story for persistent storage in openshift... and some of those need a lot of that
18:39:17 <clime> but that wouldn't normally affect things...
18:39:35 <clime> I see.
18:39:55 <clime> maybe I could cooperate on investigating this
18:40:04 <doteast> wasn't there an idea to go for glusterfs for that....
18:40:09 <nirik> all the options are kinda poor for us...
18:40:20 * doteast is hazy on those matters
18:40:22 <clime> but I no pretty much none so far
18:40:26 <clime> *know
18:40:28 <nirik> yeah, we could use glusterfs...
18:40:42 <nirik> but we have had problems with performance in the past....
18:41:11 <nirik> and folks who run openshift in production don't use it, so hard to say how effective it will be
18:41:49 <bowlofeggs> i'm excited about openshift coming to an infra near me
18:41:50 <nirik> we could use nfs, but it requires us to manually make volumes to be used
18:42:05 <nirik> anyhow...
18:42:09 <bowlofeggs> nfs can also have problems with some apps, depending on what they do
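The manually-made NFS volumes nirik mentions would each be declared to OpenShift as a PersistentVolume object, roughly like this — server, path, name, and size are all invented for illustration:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: copr-frontend-data        # hypothetical volume name
spec:
  capacity:
    storage: 50Gi                 # made-up size
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs01.example.org     # made-up NFS server
    path: /exports/copr           # made-up export path
```

Each such object has to be created by an admin ahead of time, which is the manual-provisioning burden being discussed.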
18:42:43 <nirik> so, in the next month or so we should get new cloud hardware and install RHOSP 10+ with HA.
18:43:05 <nirik> we can move copr as it is now to that and get it much better supported hopefully.
18:43:21 <clime> that would be cool
18:43:23 <nirik> I think a way to use openshift for builders at least (or the entire thing) would be pretty cool tho down the road
18:43:49 <clime> Can I talk about this to somebody?
18:44:00 <clime> relrod maybe
18:44:01 <nirik> which part? :)
18:44:09 <clime> builders in openshift
18:44:36 <clime> and the first one as well
18:44:37 <nirik> I think we are all feeling our way on openshift right now, perhaps the list would be good or just admin irc channel?
18:44:52 <smooge> I would say the list would be good to capture it
18:45:05 <clime> alright
18:45:07 <relrod> Yeah I'm not the best person to talk to about it just yet. Still learning the ropes.
18:46:04 <clime> That's everything I have got.
18:46:10 <mizdebsk> what about nearer future? once copr is installable from fedora rpms, can at least the frontend be made "supported" service (so that other apps can depend on it)?
18:46:17 <nirik> https://docs.openshift.com/container-platform/3.4/creating_images/custom.html#creating-images-custom may be of note
18:46:29 <nirik> mizdebsk: once it moves to the new openstack yes.
18:46:40 <nirik> the current one is not HA, has issues
18:47:07 <clime> it is working pretty well in the end
18:47:11 <mizdebsk> i meant having frontend on kvm virthosts, with backend/builders still in cloud
18:47:12 <clime> but yeah
18:47:55 <clime> could be better for sure
18:48:02 <nirik> we could do that. (see above where I mention we haven't decided that yet)
18:48:31 <nirik> I think frontend would be easy to move out if we wanted.
18:48:51 <nirik> backend makes more sense to stay in cloud IMHO.
18:49:02 <smooge> OK we are coming up to the top of the hour and I would like to do the apprentice questions before we end
18:49:09 <nirik> (it has storage there, it also doesn't need public ips to talk to builders)
18:49:22 <nirik> so, anything else please bring up on list. ;)
18:49:29 <clime> okay
18:49:33 <clime> thank you.
18:49:58 <smooge> I think a plan to finish making this 'supported' will be part of the list thread
18:50:04 <smooge> ok next up
18:50:15 <smooge> #topic Apprentice Open office hours
18:50:24 <smooge> Hi any apprentices or new people with questions?
18:51:04 <bgray> no questions, just a note - please let me/us know if there is more to help out with.
18:51:12 <ole88> I'm new to both the project and the OS recently
18:51:19 <smooge> hello ole88
18:53:58 <smooge> as a new person, are you interested in sysadmin or coding?
18:54:15 <smooge> bgray, yes.. we do need to put some more or clear up the easyfix
18:54:48 <bgray> smooge: thanks!
18:55:00 <smooge> bgray, have you looked at our docs?
18:55:21 <ole88> I am a sysadmin at work and we are moving toward RHEL, but run Windows
18:55:31 <smooge> those are probably the place we need the most work on these days, even if it is a "hey this doc is from 1984.. does this still work?"
18:55:44 <smooge> cool ole88 that can be a big leap
18:55:47 <ole88> I am also learning python in my spare time
18:55:49 <bgray> smooge: i have. they are very good!
18:56:07 <ole88> I have prior experience (years ago) with c, c++ and cobol
18:56:11 <nirik> There's an issue about updating our mirrormanager docs... that could be a nice one for someone new.
18:56:16 <smooge> bgray, well I am going to put in an easyfix ticket for them to get updated/reviewed
18:56:16 <bgray> smooge: ah, so outdated in places?
18:56:36 <smooge> bgray, if the data at the top is over a year old they are probably outdated
18:56:39 <ole88> I also use PowerShell and dotnetcore on my Fedora laptop
18:56:45 <bgray> smooge: :)
18:57:04 <smooge> ole88, that is a level of work I have not been able to get working yet.
18:57:48 <ole88> I can help out there
18:57:51 <smooge> cool
18:57:58 <smooge> well we are at the top of the hour
18:58:07 <smooge> #topic Open Floor
18:58:19 <smooge> I would like to thank everyone for coming to the meeting today
18:58:26 <marc84> thanks smooge
18:58:36 <nirik> thanks everyone
18:58:36 <relrod> thanks for running it, smooge!
18:58:39 <clime> thank you, smooge
18:58:48 <smooge> we will have another meeting in a week and we will also have a mailing list for people to talk on
18:58:51 <smooge> see you later
18:58:54 <smooge> #endmeeting