fedora_infrastructure_ops_daily_standup_meeting
LOGS
18:00:30 <nirik> #startmeeting Fedora Infrastructure Ops Daily Standup Meeting
18:00:30 <zodbot> Meeting started Mon Aug  3 18:00:30 2020 UTC.
18:00:30 <zodbot> This meeting is logged and archived in a public location.
18:00:30 <zodbot> The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:30 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
18:00:30 <zodbot> The meeting name has been set to 'fedora_infrastructure_ops_daily_standup_meeting'
18:00:30 <nirik> #chair mboddu nirik smooge pingou
18:00:30 <zodbot> Current chairs: mboddu nirik pingou smooge
18:00:30 <nirik> #meetingname fedora_infrastructure_ops_daily_standup_meeting
18:00:30 <zodbot> The meeting name has been set to 'fedora_infrastructure_ops_daily_standup_meeting'
18:00:30 <nirik> #info meeting is 30 minutes MAX. At the end of 30, it stops
18:00:31 <nirik> #info agenda is at https://board.net/p/fedora-infra-daily
18:00:32 <nirik> #topic Tickets needing review
18:00:33 <nirik> #info https://pagure.io/fedora-infrastructure/issues?status=Open&priority=1
18:00:49 <smooge> HELLO VIET_ADMIN
18:00:58 <nirik> good morning
18:02:01 <smooge> ok tickets
18:02:08 <nirik> you want to mod tickets?
18:02:14 <smooge> ok will do so
18:02:20 <nirik> .ticket 9189
18:02:21 <zodbot> nirik: Issue #9189: Fedora docs not being built anymore - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/9189
18:02:28 <nirik> so, let's fix this now. ;)
18:02:51 <nirik> I am going to run:
18:03:12 <nirik> ansible -a 'pkill -9 rsync' proxies
18:03:26 <nirik> unless there's a more elegant way?
18:03:40 <fm-admin> pagure.issue.tag.added -- smooge tagged ticket fedora-infrastructure#9189: groomed, high-gain, and 2 others https://pagure.io/fedora-infrastructure/issue/9189
18:03:41 <fm-admin> pagure.issue.assigned.added -- smooge assigned ticket fedora-infrastructure#9189 to kevin https://pagure.io/fedora-infrastructure/issue/9189
18:03:42 <fm-admin> pagure.issue.edit -- smooge edited the priority fields of ticket fedora-infrastructure#9189 https://pagure.io/fedora-infrastructure/issue/9189
18:03:43 <fm-admin> pagure.issue.comment.added -- smooge commented on ticket fedora-infrastructure#9189: "Fedora docs not being built anymore" https://pagure.io/fedora-infrastructure/issue/9189#comment-668882
18:04:01 <smooge> nirik, sounds good. I don't know of a better way
18:04:31 * mboddu is here
18:04:58 <nirik> done and I am running a sync. I think we can close that ticket now? or do we want to wait until the sync finishes?
18:06:03 <smooge> let's wait until the sync finishes
18:06:17 <smooge> I don't like closing 'fixed' and finding out it wasn't
18:06:19 <nirik> .ticket 9091
18:06:20 <zodbot> nirik: Issue #9091: Very slow s390x builder - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/9091
18:06:26 <nirik> wait, that's wrong
18:06:34 <nirik> .ticket 9191
18:06:35 <zodbot> nirik: Issue #9191: ELN bodhi updates closing Fedora rawhide bugzillas - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/9191
18:06:45 <smooge> no, it's right.. just not the ticket you wanted :P
18:07:13 <nirik> IMHO, this should be refiled upstream... it's bodhi behavior that was just added, and it needs adjusting.
18:07:28 <nirik> which of course brings up... who is bodhi upstream anymore.
18:07:35 <smooge> we are
18:08:19 <nirik> anyhow, we could ask them to refile upstream or do so for them and close? or ?
18:09:28 <smooge> ok filing upstream
18:10:21 <nirik> I don't think that change was properly thought out... but perhaps it's just me
18:11:28 <mboddu> I asked the same question myself couple of days :D
18:11:35 <mboddu> days back*
18:12:10 <nirik> .ticket 9193
18:12:11 <zodbot> nirik: Issue #9193: Review-stats pod stuck - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/9193
18:12:18 <fm-admin> pagure.issue.edit -- smooge edited the close_status and status fields of ticket fedora-infrastructure#9191 https://pagure.io/fedora-infrastructure/issue/9191
18:12:19 <fm-admin> pagure.issue.tag.added -- smooge tagged ticket fedora-infrastructure#9191: bodhi https://pagure.io/fedora-infrastructure/issue/9191
18:12:20 <fm-admin> pagure.issue.comment.added -- smooge commented on ticket fedora-infrastructure#9191: "ELN bodhi updates closing Fedora rawhide bugzillas" https://pagure.io/fedora-infrastructure/issue/9191#comment-668885
18:12:25 <nirik> I can likewise do this one right now, it's just an oc delete away
18:12:54 <smooge> can you walk me through this.. my openshift knowledge is crap
18:13:05 <nirik> pod "review-stats-make-html-pages-1596013200-bvgct" deleted
18:13:09 <nirik> sure.
18:13:19 <nirik> log in to os-master01.iad2 (or any os-master)
18:13:24 <nirik> oc project review-stats
18:13:27 <fm-admin> pagure.issue.assigned.added -- smooge assigned ticket fedora-infrastructure#9193 to smooge https://pagure.io/fedora-infrastructure/issue/9193
18:13:35 <nirik> (note that you can use tab for the project, it will complete it)
18:13:53 <nirik> oc delete pod/review-stats-make-html-pages-1596013200-bvgct
18:14:01 <smooge> Already on project "review-stats" on server "https://os-masters.iad2.fedoraproject.org:443".
18:14:19 <smooge> Error from server (NotFound): pods "review-stats-make-html-pages-1596013200-bvgct" not found
18:14:21 <nirik> yeah, it will be on whichever project the last person to run oc as root on there was using
18:14:29 <smooge> oh you did this already
18:14:30 <nirik> right, because I already deleted it
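The walkthrough above amounts to two `oc` invocations once you are logged in to an os-master. A minimal sketch follows, assuming you only want to assemble and review the command lines rather than run them; the helper name `stuck_pod_cleanup_commands` is hypothetical, and the project and pod names are the ones quoted in the log.

```python
import shlex


def stuck_pod_cleanup_commands(project, pod):
    """Build (but do not run) the oc command lines from the walkthrough.

    Hypothetical helper: it only assembles argument lists so they can be
    reviewed before being run by hand on an os-master.
    """
    return [
        ["oc", "project", project],      # switch to the project first
        ["oc", "delete", f"pod/{pod}"],  # then delete the stuck pod
    ]


if __name__ == "__main__":
    commands = stuck_pod_cleanup_commands(
        "review-stats",
        "review-stats-make-html-pages-1596013200-bvgct",
    )
    for argv in commands:
        print(shlex.join(argv))
```

Selecting the project explicitly matters because, as noted above, the shell may still be on whatever project the previous user selected.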
18:14:43 <nirik> I did look at logs and the last thing in it was:
18:14:46 <nirik> 2020-07-29 09:00:22,025 review_stats.py INFO     Quering Bugzilla for the blockers list...
18:14:54 <nirik> so I guess it got stuck talking to bz
18:15:09 <fm-admin> pagure.issue.edit -- smooge edited the close_status and status fields of ticket fedora-infrastructure#9193 https://pagure.io/fedora-infrastructure/issue/9193
18:15:10 <fm-admin> pagure.issue.comment.added -- smooge commented on ticket fedora-infrastructure#9193: "Review-stats pod stuck" https://pagure.io/fedora-infrastructure/issue/9193#comment-668888
18:15:20 <mboddu> bz has become a huge pain lately, the ftbfs filing is also failing randomly with 502s
18:15:20 <nirik> that's all in needs-review in infra
18:15:33 <nirik> mboddu: huh, I thought it was fixed now....
18:15:40 <fm-admin> pagure.issue.edit -- smooge edited the close_status and status fields of ticket fedora-infrastructure#9182 https://pagure.io/fedora-infrastructure/issue/9182
18:15:41 <fm-admin> pagure.issue.comment.added -- smooge commented on ticket fedora-infrastructure#9182: "Build hosts trying to connect to port 9940 to ci.centos.org and others" https://pagure.io/fedora-infrastructure/issue/9182#comment-668890
18:16:06 <nirik> oh, and BTW:  F32F-testing   :   2 updates (failed)
18:16:08 <mboddu> nirik: Sorta, it's just random: ftbfs files a few tickets and then hits a 502, then I resume the script, it files some bugs and then 502 again...
18:16:09 <nirik> :(
18:16:20 <mboddu> nirik: I resumed it this morning
18:16:26 <mboddu> And that's another thing I want to discuss
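Resuming the script by hand after each 502 is the kind of thing a retry with backoff is commonly used for. The sketch below is an illustration only: the real FTBFS filing script is not shown in this log and may behave differently. `BadGateway`, `file_with_retry`, and the `file_bug` callable are all hypothetical stand-ins, not part of any Bugzilla client.

```python
import time


class BadGateway(Exception):
    """Hypothetical stand-in for whatever error a client raises on HTTP 502."""


def file_with_retry(file_bug, item, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call file_bug(item), retrying on BadGateway with exponential backoff.

    file_bug is an assumed callable that files one bug and raises
    BadGateway on a transient 502. The last failure is re-raised.
    """
    for attempt in range(attempts):
        try:
            return file_bug(item)
        except BadGateway:
            if attempt == attempts - 1:
                raise  # out of attempts, let the caller see the error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky(item):
        # Fails twice with a simulated 502, then succeeds.
        calls["n"] += 1
        if calls["n"] <= 2:
            raise BadGateway()
        return f"filed {item}"

    # sleep is stubbed out so the demo does not actually wait
    print(file_with_retry(flaky, "example-package", sleep=lambda s: None))
```

Passing `sleep` as a parameter keeps the backoff testable without real delays.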
18:16:34 <nirik> well, apparently it's broken. ;)
18:17:05 <nirik> fire away with releng stuff. ;)
18:17:28 <mboddu> There is a ticket filed by kalev about it (https://pagure.io/fedora-infrastructure/issue/9186) but I am not sure if it's a duplicate of
18:17:33 <mboddu> https://pagure.io/fedora-infrastructure/issue/9177
18:18:02 <mboddu> Anyway, there is only 1 releng ticket
18:18:12 <nirik> yeah, it could well be. needs investigation.
18:18:15 <mboddu> .releng 9650
18:18:16 <zodbot> mboddu: Issue #9650: No logs attached to FTBFS bugzilla - releng - Pagure.io - https://pagure.io/releng/issue/9650
18:18:30 <mboddu> The logs are missing for s390x builds
18:18:45 <mboddu> "BuildrootError: Requested repo (1785390) is DELETED"
18:18:58 <nirik> so... let me try and explain.
18:19:28 <nirik> 1. the FTBFS bugzilla bug filer is attaching the logs from the FIRST mass rebuild pass. Not the second one.
18:19:48 <nirik> I don't know if that can easily be fixed.
18:20:12 <nirik> There's a number of s390x failures that happened at different times.
18:20:15 <mboddu> huh...
18:21:00 <nirik> a) The repo is DELETED thing. That was monday morning. I had set kojira to expire repos really quickly to free up space. I fixed it monday. It should only affect some range of packages monday morning and shouldn't affect anything after that
18:21:50 <nirik> b) There was a problem with the varnish cache on wed night. It messed up a bunch of builds, it's fixed and shouldn't have affected any after
18:22:31 <nirik> c) The random transitory 'can't read header' or 'failed to download package' cache issues. Those still affected some builds even after both passes, but decathorpe is resubmitting them.
18:23:37 <mboddu> Thanks nirik
18:24:02 <nirik> so, I'd say close that ticket... it's just the way koji works... sometimes if builds fail before it makes any logs... we don't have any logs to attach
18:24:08 <mboddu> I am looking at the ftbfs script, but there are no latest failed builds in the tag :(
18:24:09 <mboddu> https://pagure.io/releng/blob/master/f/scripts/find_failures.py#_67
18:25:08 <nirik> you could change the start date to the start date of the second pass?
18:25:40 <nirik> because all the ones that failed should have failed after that?
18:25:55 <mboddu> nirik: Yeah, but what about the failed ones from the first pass?
18:26:17 <nirik> if they worked after that, we don't want to file them or treat them as failed do we?
18:26:40 <nirik> since we redid all the failed ones...
18:26:49 <nirik> if we only did some of them that wouldn't work, but we did all of them...
18:27:03 <mboddu> Yeah, but there might be failed ones from the first pass as well, if we change the date, the failed ones from the first pass won't be picked
18:27:25 <mboddu> Oh right
18:27:28 <nirik> but if it's still failed we would pick them from the second pass
18:27:30 <mboddu> I understand what you mean
18:27:39 <nirik> and if not... all's well
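The reasoning above (start from the second pass, since every first-pass failure was rebuilt) can be sketched as a small filter. This is not the logic of find_failures.py, which is not shown in this log. The function name, the `(package, state, completed)` tuple layout, and all package names and dates below are hypothetical.

```python
from datetime import datetime


def still_failing(builds, second_pass_start):
    """Return packages whose latest build since second_pass_start failed.

    builds: iterable of (package, state, completed) tuples, an assumed
    layout for illustration. This relies on the premise from the
    discussion that ALL first-pass failures were resubmitted in the
    second pass; if only some were, failures would be missed.
    """
    latest = {}
    for package, state, completed in builds:
        if completed < second_pass_start:
            continue  # first-pass results are ignored entirely
        if package not in latest or completed > latest[package][1]:
            latest[package] = (state, completed)
    return sorted(p for p, (state, _) in latest.items() if state == "FAILED")


if __name__ == "__main__":
    second_pass = datetime(2020, 7, 30)  # hypothetical date
    builds = [
        ("foo", "FAILED", datetime(2020, 7, 27)),    # failed in pass 1...
        ("foo", "COMPLETE", datetime(2020, 7, 31)),  # ...fixed in pass 2
        ("bar", "FAILED", datetime(2020, 7, 27)),    # failed in pass 1...
        ("bar", "FAILED", datetime(2020, 7, 31)),    # ...and again in pass 2
        ("baz", "COMPLETE", datetime(2020, 7, 27)),  # fine, never rebuilt
    ]
    print(still_failing(builds, second_pass))
```

With this sample data only `bar` is reported: `foo` recovered in the second pass and `baz` never failed.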
18:28:24 <mboddu> jednorozec: ^ can you change it if the ftbfs filing fails again with a bz 502?
18:28:50 <nirik> anything else from releng land?
18:28:56 <mboddu> Nope, that's all
18:29:24 <nirik> ok, a few infos:
18:29:49 <nirik> #info mass update/reboots next week: monday - staging, tuesday - non-outage-causing stuff, wed - the rest
18:30:09 <nirik> mboddu: oh, do we need to do any prep work for branching? stuff we meant to fix last time?
18:30:46 <mboddu> nirik: Nope, it should be all good, but I will recheck, there is a rust change that is needed, iirc
18:31:08 <nirik> how long did it take last time?
18:31:39 <nirik> #info koji update / outage later this week
18:31:48 <nirik> (open to when... tomorrow? wed?)
18:33:38 <nirik> anyhow, I think that's all I had.
18:33:45 <darknao> nirik: docs.fp.o is up to date now :) thanks
18:33:51 <nirik> darknao: great!
18:34:19 <mboddu> nirik: How long as in mass rebuild?
18:34:40 <nirik> mboddu: no, how long to do branching?
18:35:16 <mboddu> nirik: Just a day + mass rebuild (branching) of modules which took a couple of days
18:35:40 <nirik> ok
18:35:47 <fm-admin> pagure.issue.edit -- kevin edited the close_status and status fields of ticket fedora-infrastructure#9189 https://pagure.io/fedora-infrastructure/issue/9189
18:35:48 <fm-admin> pagure.issue.comment.added -- kevin commented on ticket fedora-infrastructure#9189: "Fedora docs not being built anymore" https://pagure.io/fedora-infrastructure/issue/9189#comment-668897
18:35:54 <nirik> #endmeeting