infrastructure
LOGS
15:00:05 <nirik> #startmeeting Infrastructure (2019-12-12)
15:00:05 <zodbot> Meeting started Thu Dec 12 15:00:05 2019 UTC.
15:00:05 <zodbot> This meeting is logged and archived in a public location.
15:00:05 <zodbot> The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:05 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:00:05 <zodbot> The meeting name has been set to 'infrastructure_(2019-12-12)'
15:00:05 <nirik> #meetingname infrastructure
15:00:05 <zodbot> The meeting name has been set to 'infrastructure'
15:00:05 <nirik> #chair nirik pingou relrod smooge tflink cverna mizdebsk mkonecny abompard bowlofeggs
15:00:05 <zodbot> Current chairs: abompard bowlofeggs cverna mizdebsk mkonecny nirik pingou relrod smooge tflink
15:00:05 <nirik> #info Agenda is at: https://board.net/p/fedora-infra
15:00:05 <nirik> #topic aloha
15:00:52 * pingou waves
15:01:23 * nirik will wait a min or two for folks to come in and the coffee to take hold.
15:01:30 <mkonecny> .hello zlopez
15:01:30 <zodbot> mkonecny: zlopez 'Michal Konečný' <michal.konecny@packetseekers.eu>
15:01:44 <smooge> hello
15:02:01 <ks3> hello
15:02:11 <nils> .hello nphilipp
15:02:11 <zodbot> nils: nphilipp 'Nils Philippsen' <nphilipp@redhat.com>
15:02:21 * jlanda waves
15:02:58 <nirik> ok, let's go ahead then...
15:03:10 <nirik> #topic Next chair
15:03:10 <nirik> #info magic eight ball says:
15:03:10 <nirik> #info    2019-12-05 -smooge
15:03:10 <nirik> #info    2019-12-12 -nirik
15:03:10 <nirik> #info    2019-12-19 -mizdebsk
15:03:12 <nirik> #info    2019-12-26 - NO MEETING
15:03:13 <nirik> #info    2020-01-02 - NO MEETING
15:03:15 <nirik> #info    2020-01-09 - ??
15:03:26 <nirik> does anyone want to take the 2020-01-09 meeting? ;)
15:03:46 <dustymabe> .hello2
15:03:47 <zodbot> dustymabe: dustymabe 'Dusty Mabe' <dusty@dustymabe.com>
15:03:50 <nirik> it's next year tho, so that's a long time away. :)
15:04:40 <nirik> ok, we should likely figure it out by next week.
15:04:56 <nirik> #topic New folks introductions
15:04:56 <nirik> #info This is a place where people who are interested in Fedora Infrastructure can introduce themselves
15:04:56 <nirik> #info Getting Started Guide: https://fedoraproject.org/wiki/Infrastructure/GettingStarted
15:05:08 <nirik> any new folks today that would like to give a quick introduction?
15:06:13 <nirik> ok, moving along then...
15:06:17 <nirik> #topic announcements and information
15:06:17 <nirik> #info No meetings coming up due to various holidays. Many Fedora people will be on time off until sometime in January
15:06:17 <nirik> #info work in progress in improving the script syncing assignee and CC from dist-git to bugzilla
15:06:17 <nirik> #info work done on improving the error messages sent by fedscm-admin when processing scm requests
15:06:18 <nirik> #info builder reinstalls/upgrades to f31 continuing. Should be done before holiday break - kevin
15:06:19 <nirik> #info ops folks are trying a 30min ticket triage every day at 19UTC in #fedora-admin
15:06:23 <nirik> any other announcements or info?
15:06:50 <nirik> #info kevin on PTO after friday until 2020-01-06. ;)
15:06:56 <pingou> the distgit-bz-sync script is running in production, but not doing anything there yet
15:07:07 <pingou> quick check on its announced changes, it looks fine
15:07:19 <nirik> cool. when would it go live?
15:07:29 <pingou> that's what I was about to ask :)
15:07:59 <pingou> #info bodhi has been misbehaving for a couple of days, we're still not sure what's going on and are trying to figure it out
15:08:29 <nirik> sooner the better, IMHO. ;)
15:08:29 <pingou> any preference for the distgit-bz script?
15:08:35 <pingou> early next week?
15:08:52 <pingou> I'd love to have bodhi fixed before, 1 fire at a time sounds better :)
15:09:00 <nirik> yeah, makes sense.
15:09:20 <pingou> let's aim for next week Monday or Tuesday
15:09:30 <pingou> and otherwise I guess it'll wait for 2020 :)
15:09:30 <nirik> sounds good.
15:10:12 <nirik> ok, if no more announcements, moving on...
15:10:18 <nirik> #topic Oncall
15:10:18 <nirik> #info https://fedoraproject.org/wiki/Infrastructure/Oncall
15:10:18 <nirik> #info smooge is on call 2019-11-28->2019-12-05
15:10:18 <nirik> #info kevin is oncall 2019-12-05 -> 2019-12-12
15:10:18 <nirik> #info cverna is oncall 2019-12-12 -> 2019-12-19
15:10:19 <nirik> #info NOONE is oncall 2019-12-19 -> 2020-01-02
15:10:21 <nirik> #info ??? is oncall 2020-01-02 -> 2020-01-09
15:10:53 <nirik> #info Summary of last week: (from current oncall )
15:10:56 <pingou> if no one picks it up, I guess we'll have two weeks with NOONE
15:10:58 <pingou> .fasinfo noone
15:11:01 <zodbot> pingou: User "noone" doesn't exist
15:11:13 <pingou> we'll just need to fix that first though ^ :)
15:11:23 <nirik> yeah, I think that might be best. Note that I will be around and watching tickets and alerts... just not sitting on irc.
15:11:48 <smooge> we did NO-ONE last year and for when we all were in IE
15:11:49 <nirik> so, this last week there were a number of oncall pings, many of them outside my time tho...
15:12:06 <nirik> but nothing too different from normal.
15:13:04 <jlanda> .fasinfo someone
15:13:05 <zodbot> jlanda: User: someone, Name: None, email: somehowpossible@gmail.com, Creation: 2009-01-14, IRC Nick: None, Timezone: None, Locale: None, GPG key ID: None, Status: inactive
15:13:08 <zodbot> jlanda: Approved Groups: cla_fedora
15:13:10 <smooge> #info smooge on PTO after 2019-12-17 -> 2020-01-06
15:13:25 <jlanda> s/noone/someone and fixed pingou :P
15:13:30 <pingou> jlanda: perfect!!
15:13:49 <nirik> heh. we don't know who that is tho, do we?
15:14:00 <nirik> #topic Monitoring discussion [nirik]
15:14:00 <nirik> #info https://nagios.fedoraproject.org/nagios
15:14:00 <nirik> #info Go over existing out items and fix
15:14:26 <nirik> let's see... 3 down hosts are osbs aarch64 vms
15:14:31 <nirik> likely firewall rules.
15:15:29 <nirik> proxy01.stg disk space keeps flopping around.
15:15:45 <nirik> I think perhaps stuff is syncing without hardlinking, but needs someone to investigate.
15:16:21 <nirik> odcs has 2 alerts, and I opened a ticket for them, but nothing fixed yet
15:16:57 <nirik> fedora planet is not emitting fedmsgs... I have no idea why. Would love someone to track that down
15:17:34 <nirik> and qa09 is alerting on a raid check because it has python3 for /usr/bin/python and the raid check is python2.
15:18:05 <smooge> i was going to fix that last one but got sidelined with an audit
15:18:20 <pingou> nirik: we can put this on the CFT board, do you have an infra ticket for it already?
15:18:23 <smooge> or whatever you call "what hardware do you have and what is moving?"
15:18:51 <nirik> pingou: which one? ;)
15:19:28 <nirik> smooge: we may be able to change that plugin to call python2 instead of python and keep it working everywhere.
15:19:35 <pingou> nirik: the planet
15:19:59 <nirik> I don't, but I can file one.
15:20:17 <pingou> I'll link to it from the CFT ticket
15:22:02 <nirik> https://pagure.io/fedora-infrastructure/issue/8462
15:22:12 <pingou> thanks
15:22:22 <pingou> https://teams.fedoraproject.org/project/community-fire-team/us/33?kanban-status=357
15:22:29 <nirik> #topic Tickets discussion [nirik]
15:22:29 <nirik> #info https://pagure.io/fedora-infrastructure/report/Meetings%20ticket
15:22:29 <nirik> #link https://pagure.io/fedora-infrastructure/issue/7456
15:22:39 <nirik> .ticket 7456
15:22:40 <zodbot> nirik: Issue #7456: erlang group in src.fpo - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/7456
15:23:26 <nirik> I like the 3rd way there...
15:23:43 <nirik> but I guess we should hear from bowlofeggs
15:24:26 <nirik> #info Fedora CoreOS Team requests [dustymabe]
15:24:26 <nirik> #link https://hackmd.io/5uB7hOJKSjGUt65iLgPnbA#Existing-requests-for-Fedora-Infra
15:24:30 <pingou> I was wondering that
15:24:32 <nirik> dustymabe: anything to bring up today?
15:25:58 <nirik> This is from last week, but we can see if there's any update:
15:26:00 <nirik> #topic ResultsDB Ownership Going Forward
15:26:00 <nirik> #link https://pagure.io/fedora-infrastructure/issue/8415
15:26:13 <nirik> pingou: you were going to talk to some folks? had a chance yet?
15:26:17 <pingou> yes
15:26:26 <pingou> so Josef is still ok to keep an eye on the code
15:26:44 <pingou> I've approached the Factory2 and the CI folks
15:27:07 <pingou> the outcome was basically that we need to wait for next year to investigate this further
15:27:14 <nirik> ok.
15:27:28 <pingou> I'll keep tabs on it
15:27:34 <nirik> sounds good.
15:27:39 <pingou> but for the moment no-one can step up for this
15:28:33 <nirik> yeah, everyone's plates are a bit full.
15:29:48 <nirik> pingou: did you want to give an update from firefighting team? there's a section for it on the board...
15:30:02 <smooge> ok changed all the /usr/bin/env python to /usr/bin/python2 in our nagios_client scripts since all of them are using 'print " ' and other 2isms.
15:30:22 <smooge> qa09 should be reporting disks soon
15:30:34 <nirik> smooge: that may also mess up tho for any that don't have python2 installed. ;)
15:30:36 <smooge> and any other f31 system with nagios
15:30:38 <nirik> but we can see.
15:30:42 <pingou> nirik: let's skip for this week, I forgot to prepare something :(
15:31:01 <smooge> nirik, if they didn't have python2 installed the scripts didn't work
15:31:14 <smooge> and I checked on el6 and python2 is python
15:31:16 <nirik> some of them may have been python3 compatible
15:31:26 <smooge> all of them had print "foo"
15:31:30 <nirik> not the raid one for sure tho
15:31:46 <smooge> which is what is breaking the raid one
15:32:25 <nirik> I guess we should move all of them to python3, but we should also move all the rhel7 to rhel8, which is gonna take a bit.
15:32:53 <nirik> anyhow.
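
A minimal sketch of what one of these check scripts could look like once written to run under both python2 and python3, as a possible alternative to pinning the shebang to /usr/bin/python2. The log does not show the actual RAID plugin, so everything here is an assumption: the function name `check_mdstat` is hypothetical, and the check is assumed to parse software-RAID status text in the style of /proc/mdstat. Only the exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) follow the standard Nagios plugin convention.

```python
from __future__ import print_function  # print() behaves as a function on python2 too

import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_mdstat(text):
    """Return (exit_code, message) for mdstat-style text.

    Hypothetical check: the real plugin discussed in the meeting is not shown.
    """
    arrays = {}
    current = None
    for line in text.splitlines():
        # An array header looks like "md0 : active raid1 sda1[0] sdb1[1]".
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The member status looks like [UU] (all up) or [U_] (one missing).
        status = re.search(r"\[([U_]+)\]", line)
        if status and current:
            arrays[current] = status.group(1)
            current = None
    if not arrays:
        return UNKNOWN, "UNKNOWN: no md arrays found"
    # .items() works on both python2 and python3 (unlike .iteritems()).
    degraded = sorted(name for name, state in arrays.items() if "_" in state)
    if degraded:
        return CRITICAL, "CRITICAL: degraded arrays: " + ", ".join(degraded)
    return OK, "OK: %d arrays healthy" % len(arrays)
```

A wrapper would read the status text, call `check_mdstat`, `print()` the message, and pass the code to `sys.exit()`. Sticking to `print()` with the `__future__` import and to `.items()` avoids the kinds of 2isms mentioned above without needing python2 on the host.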
15:32:58 <nirik> #topic Open Floor
15:33:24 <smooge> I tried changing the print " to print(" ") and found that they also had a bunch of other 2isms in split, dict and other lines
15:33:26 <nirik> anyone have anything else? comments, concerns, favorite christmas cookie?
15:33:28 <pingou> So bodhi has been misbehaving for a couple of days now
15:33:36 <pingou> it just hangs
15:33:52 <pingou> and after 30 seconds, openshift kills the process (leading to a 504 being returned)
15:33:54 <nirik> pingou: yeah. ;( And that wasn't right after an upgrade, right?
15:34:03 <pingou> nirik: not really
15:34:30 <pingou> we first thought the consumers were the issue, but we're no longer so sure
15:34:49 <pingou> so we've scaled down the web pod to 1
15:35:02 <pingou> that's the latest/current attempt to see if that helps
15:36:06 <nirik> it wouldn't be anything with the cluster would it? I guess everything else is behaving ok?
15:36:51 <pingou> afawct
15:37:21 <pingou> we saw the number of locks on the db increase to above 100
15:38:39 <nirik> huh, weird.
15:39:51 <nirik> well, I guess we keep looking. :) Thanks for digging into it!
15:40:00 <nirik> anything else? if not will close out in a few.
15:41:00 <pingou> fingers crossed
15:41:00 <nirik> ok, thanks for coming everyone! Enjoy the extra 20min of your life back.
15:41:04 <nirik> #endmeeting