infrastructure
LOGS
15:00:18 <siddharthvipul> #startmeeting Infrastructure (2020-05-28)
15:00:18 <zodbot> Meeting started Thu May 28 15:00:18 2020 UTC.
15:00:18 <zodbot> This meeting is logged and archived in a public location.
15:00:18 <zodbot> The chair is siddharthvipul. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:18 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:00:18 <zodbot> The meeting name has been set to 'infrastructure_(2020-05-28)'
15:00:19 <siddharthvipul> #meetingname infrastructure
15:00:19 <zodbot> The meeting name has been set to 'infrastructure'
15:00:19 <siddharthvipul> #chair nirik pingou smooge cverna mizdebsk mkonecny abompard siddharthvipul
15:00:19 <siddharthvipul> #info Agenda is at: https://board.net/p/fedora-infra
15:00:19 <siddharthvipul> #info About our team: https://docs.fedoraproject.org/en-US/cpe/
15:00:19 <zodbot> Current chairs: abompard cverna mizdebsk mkonecny nirik pingou siddharthvipul smooge
15:00:19 <siddharthvipul> #topic aloha
15:00:35 <nirik> morning.
15:00:47 <siddharthvipul> nirik: Good morning :D
15:01:05 <siddharthvipul> let's see who is around today
15:01:17 <mobrien[m]> .hello mobrien
15:01:18 <zodbot> mobrien[m]: mobrien 'Mark O'Brien' <markobri@redhat.com>
15:01:26 <siddharthvipul> .hello siddharthvipul1
15:01:31 <zodbot> siddharthvipul: siddharthvipul1 'Vipul Siddharth' <siddharthvipul1@gmail.com>
15:01:56 <austinpowered> .hello austinpowered
15:01:57 <zodbot> austinpowered: austinpowered 'T.C. Williams' <fedoraproject@wootenwilliams.com>
15:02:01 <mkonecny> .hello zlopez
15:02:02 <zodbot> mkonecny: zlopez 'Michal Konečný' <michal.konecny@packetseekers.eu>
15:02:09 <siddharthvipul> my IRC time and system time are out of sync.. hmm. Will fix it post meeting
15:03:43 <siddharthvipul> #topic New folks introductions
15:03:54 <siddharthvipul> alright! anyone new today :)
15:04:11 <siddharthvipul> #info This is a place where people who are interested in Fedora Infrastructure can introduce themselves
15:04:11 <siddharthvipul> #info Getting Started Guide: https://fedoraproject.org/wiki/Infrastructure/GettingStarted
15:04:23 <pingou> .hello2
15:04:24 <zodbot> pingou: pingou 'Pierre-Yves Chibon' <pingou@pingoured.fr>
15:05:21 <siddharthvipul> looks like we just have experienced people today (and me) :)
15:05:42 <siddharthvipul> #topic Next chair
15:05:42 <siddharthvipul> #info magic eight ball says:
15:05:42 <siddharthvipul> #info 2020-05-28 - siddharthvipul
15:05:42 <siddharthvipul> #info 2020-06-04 - mkonecny
15:05:42 <siddharthvipul> #info 2020-06-11 - siddharthvipul
15:05:53 <siddharthvipul> Any Volunteers for 2020-06-18?
15:06:00 <smooge> not me
15:07:04 <siddharthvipul> smooge: haha, I was volunteering you but you saved yourself right on time :P
15:07:50 * nirik suspects he will still be fixing problems from the dc move then...
15:08:30 <siddharthvipul> so if no one volunteers today, we can decide it in next meeting (since we are already running 2 weeks ahead and I can always volunteer in next one).. someone else might want to chair
15:08:39 <siddharthvipul> is it alright?
15:08:44 <cverna> +1
15:08:52 <pingou> +1
15:08:55 <nirik> sure
15:09:06 <siddharthvipul> awesome, moving ahead then
15:09:19 <siddharthvipul> #topic announcements and information
15:09:19 <siddharthvipul> #info CPE Sustaining EU-hours team has standups on Tuesday and Thursday at 1400 UTC in #fedora-admin - please join
15:09:19 <siddharthvipul> #info CPE Sustaining NA-hours team has a Monday through Friday 30 minute meeting going through tickets at 1800 UTC in #fedora-admin
15:09:19 <siddharthvipul> #info Fedora Infrastructure will be moving in 2020-06 from its Phoenix Az datacenter to one near Herndon Va. A lot of planning will be involved on this. Please watch out for announcements on changes.
15:09:21 <siddharthvipul> #info Fedora Communishift move has started but will take longer than expected. Current estimate for bringing back into production is TBD
15:09:31 <siddharthvipul> anyone has anything else to announce?
15:10:07 <siddharthvipul> we will wait for a couple of minutes for folks to fetch links (if they want to share something)
15:10:13 <smooge> #info Cverna will be moving to CoreOS
15:10:27 <mkonecny> #info the-new-hotness 0.13.1 is out, deployed to staging
15:10:58 <nirik> #info please help test things in iad2, see doc in infrastructure list
15:12:04 <siddharthvipul> congratulations cverna :)
15:12:25 <siddharthvipul> moving ahead now
15:12:28 <siddharthvipul> #topic Oncall
15:12:28 <siddharthvipul> #info https://fedoraproject.org/wiki/Infrastructure/Oncall
15:12:43 <siddharthvipul> #info mboddu was oncall 2020-05-21 -> 2020-05-28
15:12:49 <siddharthvipul> any volunteer for oncall 2020-05-28 -> 2020-06-04? and for 2020-06-04 -> 2020-06-11?
15:12:53 <smooge> NOT IT
15:13:08 <nirik> I think I should take 06-04 -> 06-11
15:13:18 <nirik> or wait...
15:13:20 <siddharthvipul> noted
15:13:28 <siddharthvipul> nirik: waiting :)
15:13:28 <nirik> misread. I thought that was after the move.
15:13:33 <nirik> I'd take the one after that.
15:13:37 <cverna> I can take this week
15:13:41 <siddharthvipul> nirik: sure
15:13:44 <siddharthvipul> cverna: thank you
15:13:44 <nirik> For the move we might want to do something special
15:13:45 <smooge> i figure I will be sort of on-call for the move weeks but not really
15:13:53 <siddharthvipul> anyone after cverna?
15:14:04 <nirik> like have oncall say we are moving, please file a ticket and we will get to it when we can.
15:14:35 <cverna> .oncalltakeeu
15:14:35 <zodbot> cverna: Kneel before zod!
15:15:02 <nirik> but I guess it could be nice to have someone tell people that and reassure them and not irq people trying to fix things.
15:15:04 <siddharthvipul> #info cverna is oncall for 2020-05-28 -> 2020-06-04
15:15:44 <siddharthvipul> since I don't see anyone here, I will take the week after cverna
15:15:56 <cverna> mobrien do you want to shadow me during this week oncall duties ?
15:16:08 <siddharthvipul> let's wait then
15:16:15 <siddharthvipul> what if mobrien[m] is interested :)
15:16:20 <cverna> I'll ping you when I get pinged and we can look at stuff together
15:16:27 <nirik> we could always sign up mboddu, since he's not here. ;)
15:16:33 <Saffronique> hehe
15:16:34 <mobrien[m]> cverna: Ya, that sounds good. Thanks
15:16:37 <nirik> cverna: +1, good idea
15:16:49 <siddharthvipul> hahaha, fair.. he did skip one week when he couldn't do takeoncall :P
15:17:13 <siddharthvipul> okay, so just to make it official, anyone taking the week after?
15:17:37 <mboddu> I am here
15:17:49 <mboddu> nirik: What are you throwing me at?
15:17:55 <siddharthvipul> mboddu: will you like to take oncall duty for next week?
15:18:00 <mboddu> Sure
15:18:02 <siddharthvipul> s/will/would
15:18:04 <nirik> ha. :)
15:18:06 <mboddu> nirik: :P
15:18:11 <siddharthvipul> #info mboddu is oncall for 2020-06-04 -> 2020-06-11
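The oncall slots above are consecutive one-week windows starting on the meeting day. A minimal sketch of that date arithmetic, assuming a hypothetical helper named `weekly_windows` (it is not part of zodbot or any Fedora tooling):

```python
from datetime import date, timedelta


def weekly_windows(start, count):
    """Return `count` consecutive (begin, end) one-week windows from `start`."""
    return [
        (start + timedelta(weeks=i), start + timedelta(weeks=i + 1))
        for i in range(count)
    ]


# The two slots assigned in the meeting, plus "the one after that".
for begin, end in weekly_windows(date(2020, 5, 28), 3):
    print(f"{begin.isoformat()} -> {end.isoformat()}")
```

The third window printed is the slot nirik offered to take, after the datacenter move.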
15:18:26 <siddharthvipul> #info Summary of last week: (from current oncall )
15:18:39 <siddharthvipul> mboddu: time to shine, go ahead :)
15:18:41 <mboddu> Nothing much other than yesterday's fire
15:18:46 <pingou> we could also throw tomas under that bus :)
15:19:12 <siddharthvipul> mboddu: was docs fire related?
15:19:14 <mboddu> But I didn't do much yesterday, it was all nirik pingou and smooge
15:19:19 <mboddu> Yes
15:19:44 <nirik> and cverna and abompard
15:19:47 <pingou> mboddu: the goal of oncall is to identify if the ping is worth a ticket or interrupting someone
15:19:58 <siddharthvipul> just a group of super amazing people *.*
15:20:00 * pingou didn't do much yesterday :(
15:20:25 <mboddu> Yeah, forgot about cverna and abompard
15:20:41 <mboddu> I guess I forgot cverna intentionally :P
15:20:43 <cverna> one thing to remember is if everything is slow or broken it is most likely rabbitmq
15:20:46 <cverna> :)
15:20:50 <siddharthvipul> cverna: hahaha
15:21:10 <pingou> it's the second time this year
15:21:18 <nirik> I really wonder if we couldn't improve the koji fedora-messaging plugin for rabbitmq being down
15:21:20 <smooge> honestly I have no idea how our rabbitmq ever worked
15:21:20 <pingou> we may want/need to document howto fix that though :(
15:21:41 <nirik> yeah, but note we are moving to the much newer version...
15:21:43 <mboddu> Seems like bodhi is still having issues with creating updates? https://status.fedoraproject.org/
15:21:52 <pingou> nirik: short of it having itself a queueing mechanism, it'll be a little hard
15:21:58 <mkonecny> nirik: There are issues with it?
15:22:27 <nirik> pingou: well, it could timeout/error and continue gracefully. right now, it just hangs and koji builds get in all kinds of bad states.
15:22:39 <nirik> like the build worked, but tagging it at the end failed.
15:22:50 <nirik> or the build failed but the parent task is still running
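The "timeout/error and continue gracefully" idea nirik describes can be sketched roughly as follows. This is not the koji fedora-messaging plugin's code: `publish_with_timeout` and the `publish` callable are hypothetical stand-ins, and a message that times out is simply dropped here, which is the gap pingou points at when he mentions a queueing mechanism.

```python
import threading


def publish_with_timeout(publish, message, timeout=5.0):
    """Call publish(message), but stop waiting after `timeout` seconds.

    `publish` is a hypothetical callable standing in for whatever sends
    the message to the broker. Returns True if it completed, False if it
    raised or was still hanging when the timeout expired.
    """
    result = {}

    def worker():
        try:
            publish(message)
            result["ok"] = True
        except Exception as exc:  # broker down, connection refused, ...
            result["error"] = exc

    # Daemon thread: a publish that hangs forever cannot block shutdown.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)

    if thread.is_alive():
        return False  # still hanging: give up waiting and let the caller go on
    return result.get("ok", False)
```

A caller would check the return value, log the failure, and carry on with the rest of the task (for example tagging the build) instead of blocking on the broker.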
15:22:54 <cverna> mboddu I think it is better but I have not updated status yet
15:23:08 <pingou> if only we had a dev environment where we could test koji's behavior more easily :-p
15:23:15 <mboddu> cverna: Ah okay, thanks
15:23:17 <nirik> cverna: I think it's probably ok for now... as we continue to investigate?
15:23:39 <nirik> pingou: if only we had less files. They cause all the problems
15:24:07 <cverna> Yeah
15:24:35 <pingou> nirik: not more than one open file at a time!
15:24:54 <pingou> we should know that by now ^^
15:25:04 <nirik> I hope we can setup a nicer staging after the move, but it turns out to be surprisingly hard to do. ;(
15:25:42 <siddharthvipul> fire fire fire.. if any of you think I can help, let me know.. I am around on weekends as well (if it's needed)
15:26:09 <pingou> siddharthvipul: watch out, we may call you on that!
15:26:09 <siddharthvipul> nirik: seems this needs a longer decision and planning.. are we ready to move to next topic for now?
15:26:11 <mboddu> Same here, I am happy to help if needed even during weekends
15:26:17 <siddharthvipul> pingou: I will be counting on you for that :)
15:26:44 <nirik> sure, we can move on. I appreciate help... we are going to likely need it the week of the 8th. ;)
15:27:00 <siddharthvipul> nirik: oh, that's the week I am away
15:27:01 <siddharthvipul> LOL
15:27:02 <siddharthvipul> jk
15:27:16 <siddharthvipul> #topic Monitoring discussion [nirik]
15:27:25 <siddharthvipul> nirik: please do your thing :)
15:27:32 <nirik> let's take a look here...
15:27:58 <nirik> we have 3 hosts down. 2 of them are taskotron hosts that are no more... I guess we need to just run the noc playbook?
15:28:14 <nirik> the other is an aarch64 box that had problems and I rebooted it and it never came back up. ;)
15:28:24 <nirik> we can try power cycling it.
15:28:42 <pingou> there is still one taskotron related PR pending for our ansible repo
15:28:52 <mboddu> siddharthvipul: "do your thing" reminded me of https://www.youtube.com/watch?v=ojhTu9aAa_Y
15:29:11 <nirik> pingou: yeah, I think tflink may have already done that one via direct push, but I am not sure.
15:29:21 <nirik> proxy05 seems out/low on disk space again...
15:29:43 <nirik> pagure01 and torrent02 also
15:30:08 * mboddu will take a look at torrent02
15:30:16 <mboddu> But I did clean up for f32 release
15:30:19 <nirik> mboddu: can you nuke the f30 images if they are still there?
15:30:20 <mboddu> Maybe I missed something
15:30:31 <nirik> since we are after eol
15:30:42 <mboddu> nirik: Yes, I didn't clean up f30 yet and sure, I will take care of it
15:30:44 <nirik> pingou: you can look at pagure01? is that just normal usage? or ?
15:31:24 <smooge> proxy05 is at 87% versus 100%
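The "87% versus 100%" reading above is the kind of figure a simple threshold check produces. A minimal sketch using only the Python standard library; the 85/95 thresholds and the function names are assumptions for illustration, not the values or checks in the Fedora Nagios configuration:

```python
import shutil


def disk_used_percent(path="/"):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def classify(percent, warn=85.0, crit=95.0):
    """Map a usage percentage to a monitoring-style state (assumed thresholds)."""
    if percent >= crit:
        return "CRITICAL"
    if percent >= warn:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    used = disk_used_percent("/")
    print(f"/ is at {used:.0f}%: {classify(used)}")
```

With these assumed thresholds a host at 87% would report WARNING rather than CRITICAL. The percentage computed here can differ slightly from what `df` reports, since `df` accounts for reserved blocks differently.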
15:31:25 <nirik> that's about it, some low swap...
15:31:29 <pingou> nirik: I'll check disk space on pagure01
15:31:44 <siddharthvipul> #action mboddu to clean f30 images from torrents
15:31:44 <nirik> we can probably move along unless anyone sees anything else.
15:31:45 <smooge> currently docs takes up a lot of space
15:31:51 <nirik> smooge: yeah. ;(
15:32:02 <siddharthvipul> #action pingou to investigate disk space on pagure01
15:32:17 <smooge> but hey we got that report that proxy05 had an open http port .. so maybe we can look elsewhere
15:32:36 <nirik> smooge: I think that was one of the amazon ones...
15:33:00 <nirik> but I could have misread. In any case it was a stupid report. :)
15:33:11 <smooge> no that was how i read it also
15:33:42 <nirik> "Your server is serving things!!!!!! OMG QTF BBQ"
15:34:00 <siddharthvipul> :)
15:34:04 <siddharthvipul> next topic?
15:34:20 <nirik> yep
15:34:23 <siddharthvipul> #topic Data-Center Move update
15:34:35 <siddharthvipul> smooge: nirik, hope it's not *that* bad :)
15:34:49 <nirik> hopefully not. ;)
15:34:56 <nirik> so, let's see:
15:35:22 <nirik> We have a few more bare metal machines to install/configure... power8/9 aarch64 and qa virthosts.
15:35:46 <nirik> There's lots of various breakage I was going to fix yesterday, but got sucked into fires. ;(
15:36:14 <nirik> my plan is to keep working on checking/fixing things today. Possibly get new x86_64 builders installed.
15:36:22 <nirik> smooge is working on the baremetal installs.
15:36:42 <nirik> Tomorrow I think we will do a mass reboot... and make sure things all come back up ok.
15:37:12 <nirik> I also hope to finish working on the plan for the migration week.
15:37:17 <nirik> what moves when, etc...
15:37:31 <pingou> nirik: we don't have builders yet, right?
15:37:32 <nirik> next week is testing testing, fixing anything we can see that we can fix.
15:37:53 <nirik> then it's move week. ;) I'll buy some deathwish coffee next store trip. ;)
15:38:20 <nirik> pingou: nope, but I have all I need now (they setup an iscsi volume for buildvm-01-32), so I just need to install some
15:38:41 <nirik> we need to take the staging koji adjustment sql script and re-work it for the migration
15:39:00 <nirik> if someone wants to take that on, that would be great (I can file a ticket)
15:39:50 <nirik> migration week is gonna be fun. It's hard to split things up... so a few days are just going to be really long.
15:39:59 <nirik> any questions?
15:40:00 <pingou> nirik: is that the one we use to sync to stg?
15:40:10 <pingou> if so I've adjusted it not long ago for the latest sync
15:40:29 <pingou> but it may need some adjustments still as this is prod -> prod vs prod -> stg
15:40:35 <nirik> pingou: yeah... but it might need some adjustments for iad2 instead of stg
15:40:38 <nirik> yeah
15:40:51 <nirik> I was just going to make all new builders.
15:41:39 <nirik> but it adjusts things like krb for people, and it should adjust them to iad2 not stg
15:42:17 <nirik> hopefully puiterwijk will have our iad2 ipa working soon.
15:43:10 <pingou> hopefully it won't be too many adjustments
15:44:34 <siddharthvipul> there are a lot of moving pieces it seems.. :) let's move to openfloor and see if someone has something to discuss :)
15:44:49 <siddharthvipul> #topic Open Floor
15:45:19 <nirik> https://pagure.io/fedora-infrastructure/issue/8960
15:45:34 * tflink will look at nagios again, thought he fixed it already :-/
15:46:12 <nirik> tflink: ah, it's still in inventory/backups
15:46:25 * jednorozec would like to become infra padawan
15:46:41 <pingou> welcome young padawan jednorozec
15:46:45 <siddharthvipul> jednorozec: same here *nods head* same here
15:46:50 <mboddu> I cleaned up torrent02.fp.o
15:46:57 <tflink> nirik: cool, I'll get it fixed when I do the HW machines
15:47:08 <pingou> siddharthvipul: come on, you're already an old padawan
15:47:27 <siddharthvipul> pingou: haha, let's say.. a "useful padawan"
15:47:34 <siddharthvipul> or "more useful" :P
15:47:49 * mboddu is still a padawan too
15:48:21 <siddharthvipul> mboddu: sigh, you all keep raising the bar and you will scare me off :P
15:48:23 <nirik> siddharthvipul / jednorozec: sure! happy to add you, I thought we already had a long time ago? or do you mean more than the apprentice group?
15:48:40 * tflink will also deal with the rest of kparal's PR for the taskotron stuff - was just waiting for everything to be gone
15:48:48 <siddharthvipul> nirik: I am not in apprentice program if it helps
15:49:22 <siddharthvipul> but I would like to shadow someone and learn more (I did cverna once but that week was awfully quiet) (I am not sad about that)
15:49:40 <siddharthvipul> s/apprentice program/apprentice group
15:50:22 <jednorozec> nirik, not sure I may have a lot of rights to do stuff, but would really appreciate someone guiding me through some ticket.
15:50:23 <nirik> siddharthvipul: your account is siddharthvipul right?
15:50:31 <siddharthvipul> nirik: siddharthvipul1
15:50:50 <siddharthvipul> apologies about that last 1.. sigh! I wish, I wish we could change it easily
15:51:30 <siddharthvipul> nirik: thank youuu
15:51:37 <nirik> added you both.
15:52:08 <nirik> cool. Yeah, after dc move hopefully we can have time to start cross training...
15:52:19 <jednorozec> awesome
15:52:23 <siddharthvipul> nirik: looking forward to that :)
15:52:36 * nirik wants to give mobrien[m] more fun things to do too...
15:52:39 <nirik> :)
15:53:02 <siddharthvipul> Do we have any other topic/openfloorworthy discussions?
15:53:17 <siddharthvipul> I will wait for a few more minutes and then close the meeting if there is nothing :)
15:53:22 <mobrien[m]> nirik mobrien is looking forward to the fun :)
15:53:55 <siddharthvipul> mobrien[m]: uh ohh.. it was a warning I guess :P
15:55:00 <nirik> and if you all want to hang out in #fedora-noc I am sure we will have dc move stuff going on all the time now. ;)
15:55:14 <siddharthvipul> 2 minutes more before meeting ends.
15:55:26 <mobrien[m]> maybe after lots of training we can help enough to let nirik and smooge work only 12 hour days
15:55:35 * nirik plays 2 minutes to midnight
15:55:39 <siddharthvipul> nirik: I am there but whenever I open it, it's a lot of logs and I get lost immediately
15:55:52 <nirik> mobrien[m]: I look forward to it!
15:56:11 <siddharthvipul> mobrien[m]: long shot but I can see that's possible in a couple of years
15:56:41 <siddharthvipul> 1 minute
15:56:58 <mobrien[m]> until there is another dc move next year....
15:57:15 <siddharthvipul> Thank you everyone for coming, it was a good meeting :)
15:57:23 <siddharthvipul> #endmeeting