16:02:58 <pingou> #startmeeting Infrastructure (2021-01-28)
16:02:58 <zodbot> Meeting started Thu Jan 28 16:02:58 2021 UTC.
16:02:58 <zodbot> This meeting is logged and archived in a public location.
16:02:58 <zodbot> The chair is pingou. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:02:58 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:02:58 <zodbot> The meeting name has been set to 'infrastructure_(2021-01-28)'
16:03:00 <pingou> #meetingname infrastructure
16:03:00 <zodbot> The meeting name has been set to 'infrastructure'
16:03:02 <pingou> #chair nirik pingou smooge cverna mizdebsk mkonecny abompard siddharthvipul mobrien
16:03:02 <zodbot> Current chairs: abompard cverna mizdebsk mkonecny mobrien nirik pingou siddharthvipul smooge
16:03:04 <pingou> #info Agenda is at: https://board.net/p/fedora-infra
16:03:06 <pingou> #info About our team: https://docs.fedoraproject.org/en-US/cpe/
16:03:08 <pingou> #topic aloha
16:03:10 <pingou> ó/
16:03:30 <nirik> morning
16:03:32 <Zlopez[m]> .hello zlopez
16:03:32 <zodbot> Zlopez[m]: zlopez 'Michal Konečný' <michal.konecny@psmail.xyz>
16:04:14 <austinpowered__> .hello2
16:04:15 <mobrien> .hello
16:04:15 <zodbot> austinpowered__: Sorry, but you don't exist
16:04:18 <zodbot> mobrien: (hello <an alias, 1 argument>) -- Alias for "hellomynameis $1".
16:04:23 <bodanel> hello
16:04:25 <mobrien> .hi
16:04:26 <zodbot> mobrien: mobrien 'Mark O'Brien' <markobri@redhat.com>
16:05:45 <pingou> let's move on
16:05:47 <pingou> #topic New folks introductions
16:05:49 <pingou> #info This is a place where people who are interested in Fedora Infrastructure can introduce themselves
16:05:51 <pingou> #info Getting Started Guide: https://fedoraproject.org/wiki/Infrastructure/GettingStarted
16:05:55 <pingou> any new folks who would like to introduce themselves?
16:06:55 <pingou> looks pretty quiet :)
16:06:58 <pingou> #topic Next chair
16:06:59 <pingou> #info magic eight ball says:
16:07:01 <pingou> #info chair 2021-02-04 - siddharthvipul
16:07:02 <darknao> .hi
16:07:03 <zodbot> darknao: darknao 'Francois Andrieu' <naolwen@gmail.com>
16:07:07 <pingou> so next week siddharthvipul will chair
16:07:14 <pingou> do we have a volunteer for Feb 11th?
16:07:32 <Zlopez[m]> I could take it
16:07:41 <pingou> it's yours!
16:07:47 <pingou> #info chair 2021-02-11 - zlopez
16:07:47 <smooge> thanks Zlopez[m]
16:07:56 <pingou> any volunteer for the 18th?
16:07:59 <Zlopez[m]> Updating my TODO list :-)
16:08:01 <smooge> ok me
16:08:08 <pingou> #info chair 2021-02-18 - smooge
16:08:09 <smooge> Zlopez[m], I need a .org training session
16:08:12 <pingou> thanks smooge
16:08:22 <smooge> I am on call now I think right?
16:08:29 <pingou> it's me this week :)
16:08:40 <pingou> we can keep the 25th for next week :)
16:08:51 <pingou> let's move to announcements
16:08:53 <smooge> I thought yours ended today and I started
16:08:55 <smooge> ok sorry
16:08:56 <pingou> #topic announcements and information
16:08:58 <pingou> #info CPE Infra&Releng EU-hours team has a Monday through Friday 30 minute meeting going through tickets at 1030 Europe/paris in #centos-meeting
16:09:00 <pingou> #info CPE Infra&Releng NA-hours team has a Monday through Friday 30 minute meeting going through tickets at 1800 UTC in #fedora-admin
16:09:02 <pingou> #info Datacenter move is over, but some items still need to be done: see https://fedoraproject.org/wiki/Infrastructure/2020-post-datacenter-move-known-issues
16:09:04 <pingou> #info Anitya (release-monitoring.org) 1.0.0 is now running in production
16:09:18 <pingou> any other announcement someone would like to make?
16:09:35 <nirik> #info mass rebuild is progressing along...
16:09:49 <pingou> any news on the AAA ?
16:10:13 <bodanel> what is AAA ?
16:10:17 <bodanel> newbie here
16:10:19 <pingou> the new account system
16:10:24 <bodanel> ok
16:10:30 <pingou> AAA stands for: Account, Authorization and Authentication
16:10:42 <dtometzki> hello together
16:11:18 <nirik> you could just say noggin too. But not sure it would be _less_ confusing. :)
16:11:26 <pingou> :)
16:11:36 <nirik> we have been working through ssh and 2fa issues in staging... making good progress I think
16:11:41 <mobrien> AAA is still not quite there but on the way :)
16:11:58 <pingou> #info the new AAA system is progressing along in stg, still some work to do though
16:12:07 * mboddu is here
16:12:08 <mobrien> There are a few loose ends to tie up with 2fa as well as finishing touches on 2 apps I think
16:12:10 <pingou> anything else to announce?
16:12:28 <bodanel> A short AAA presentation could be the topic for next week's learning section if it hasn't been done already
16:12:38 * pingou puts it on the list
16:13:04 <austinpowered> Next week is set for IPA. Is that what AAA is using?
16:13:06 <mobrien> I think it may be best to move it to the week after as it should be more fully formed at that stage
16:13:10 <nirik> we need to figure out src.stg.fp.o auth...
16:13:23 <mobrien> austinpowered, IPA is part of AAA
16:13:25 <pingou> austinpowered: it's part of it yes
16:14:10 <bodanel> could be moved next week, just wanted to add it to the agenda
16:14:14 <austinpowered> I know IPA can handle many roles. Looking forward to next week.
16:14:24 <pingou> it's on the ideas box
16:14:29 <pingou> ;-)
16:14:31 <bodanel> austinpowered: me too
16:14:38 <nirik> me too!
16:14:46 <pingou> ok, let's move to oncall and we'll get to the learning topic at the end
16:14:48 <pingou> #topic Oncall
16:14:50 <pingou> #info https://fedoraproject.org/wiki/Infrastructure/Oncall
16:14:51 <nirik> oh wait, I'm giving that one... I better read up. :)
16:14:52 <pingou> #info smooge is oncall for 2021-01-28 to 2021-02-04
16:15:03 <pingou> anyone to take up oncall after smooge?
16:15:12 <pingou> Feb 4th to 11th
16:15:29 <dtometzki> what does being oncall involve?
16:15:36 <smooge> .takeoncallus
16:15:43 <smooge> .oncalltakeus
16:15:43 <zodbot> smooge: Error: You don't have the alias.add capability. If you think that you should have this capability, be sure that you are identified before trying again. The 'whoami' command can tell you if you're identified.
16:15:53 <pingou> the oncall person is the designated person to be interrupted by questions on IRC
16:16:00 <nirik> I'm happy to take it if no one else...
16:16:05 <pingou> so other folks can remain focused on their work
16:16:34 <pingou> if the oncall person cannot answer or doesn't know the answer to the issue, they simply ask that the issue be logged in a ticket
16:16:46 <pingou> nirik: looks like it'll be yours
16:16:52 <pingou> #info nirik is oncall for 2021-02-04 to 2021-02-11
16:16:59 <pingou> anyone for the week after?
16:17:12 <dtometzki> i will try it
16:17:17 <pingou> cool, thanks
16:17:23 <pingou> #info dtometzki is oncall for 2021-02-11 to 2021-02-18
16:17:27 <smooge> .oncalltakeus
16:17:27 <zodbot> smooge: Kneel before zod!
16:17:33 <pingou> #info Summary of last week: (from current oncall )
16:17:36 <pingou> so that was me
16:17:41 <pingou> I've got a few pings
16:17:59 <pingou> one on the releng side about an eln build consistently running out of space on koji
16:18:13 <pingou> I've sent them to the releng tracker where it got closed as a temp issue
16:18:29 <pingou> I've also helped a couple of people on IRC
16:18:47 <pingou> one had an issue pushing to pagure.io over ssh, which turned out to be a permission issue on their side on their .ssh folder
16:19:09 <pingou> and one had a wrong host on .ssh/config to bounce via bastion to an internal host
16:19:16 <pingou> iirc that's about it
16:19:29 <pingou> #topic Monitoring discussion [nirik]
16:19:31 <pingou> #info https://nagios.fedoraproject.org/nagios
16:19:33 <pingou> #info Go over existing out items and fix
16:19:39 <pingou> nirik: if you want to take it :)
16:19:43 <mobrien> dtometzki, if you have any questions when on call you can ping me directly.
16:19:51 <nirik> I don't think there's much change here from last week... let me check tho
16:20:15 <pingou> I've seen greenwave going above its threshold and then back down a few minutes after
16:20:19 <nirik> yeah, pretty much exactly the same as last week...
16:20:19 <dtometzki> mobrien: many thanks
16:20:27 <pingou> I wonder if something changed in greenwave that made it a little slower
16:20:46 <nirik> pingou: well, we added some openqa stuff... but not sure it would cause this
16:20:50 <pingou> so maybe we can ask greenwave's folks for this and potentially tweak the nagios threshold?
16:21:07 <Zlopez[m]> I noticed a few warnings about the greenwave queue
16:21:10 <pingou> nirik: could be that as well if openqa sends more results
16:21:56 <nirik> sure, seems reasonable
16:22:02 <Zlopez[m]> And I fixed the rabbitmq queue for the-new-hotness
16:22:27 <Zlopez[m]> Found some bug after moving Anitya 1.0.0 to production
16:25:04 <nirik> anyhow, nothing else nagios wise...
16:25:19 <pingou> let's move to the learning topic of the day
16:25:25 <pingou> #topic Learning topic
16:25:27 <pingou> #info ansible setup [nirik/mobrien] on 2021-01-28
16:25:31 <pingou> #info IPA [nirik] on 2021-02-04
16:25:41 <pingou> mobrien: nirik: the floor is yours
16:25:47 <mobrien> I'll start
16:25:56 <mobrien> This is a relatively large topic, so if I miss anything or there are any questions, please speak up as I go. I will be talking on this topic again, so all feedback is welcome
16:26:13 <mobrien> Ansible is an automation tool which we use for app deployment and config as well as infrastructure as code
16:26:13 <mobrien> The repo is available here https://pagure.io/fedora-infra/ansible
16:26:16 * nirik is happy to answer questions/expand on anything.
16:26:34 <mobrien> With a few minor exceptions all our infra is controlled by ansible and the hosts can be seen in the inventory dir.
16:26:56 <mobrien> This directory is where most of the host setup lives, it contains the inventory files of all the hosts as well as group_vars and host_vars
16:27:14 <mobrien> The inventory uses an ini file format. The group_vars directly relate to the groups specified in the inventory files and the same for host vars
16:27:22 <mobrien> For example, if a host exists called test.fedoraproject.org in the test group in the inventory. The vars specified in the `host_vars/test.fedoraproject.org` and in `group_vars/test` will automatically apply to that host.
16:27:55 <mobrien> So if you have some variable confusion this is a good place to check
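
The host_vars/group_vars behaviour described above can be sketched in a few lines of Python. This is only an illustration, not how Ansible itself is implemented: the variable names (`env`, `ssh_port`) and their values are invented, the inventory is reduced to one group and one host, and only the rule that host_vars override group_vars is modelled (the full precedence list is much longer).

```python
import configparser

# Hypothetical inventory snippet in the INI style described above.
INVENTORY = """
[test]
test.fedoraproject.org
"""

# Stand-ins for group_vars/test and host_vars/test.fedoraproject.org.
# The variable names and values are invented for illustration.
GROUP_VARS = {"test": {"env": "staging", "ssh_port": 22}}
HOST_VARS = {"test.fedoraproject.org": {"ssh_port": 2222}}


def groups_for(inventory_text):
    """Map each inventory group to the hosts listed under it."""
    parser = configparser.ConfigParser(allow_no_value=True)
    parser.read_string(inventory_text)
    return {group: list(parser[group]) for group in parser.sections()}


def effective_vars(host, groups, group_vars, host_vars):
    """Apply group vars first, then let host vars override them."""
    merged = {}
    for group, hosts in groups.items():
        if host in hosts:
            merged.update(group_vars.get(group, {}))
    merged.update(host_vars.get(host, {}))
    return merged
```

Calling `effective_vars("test.fedoraproject.org", groups_for(INVENTORY), GROUP_VARS, HOST_VARS)` keeps `env` from the group and takes `ssh_port` from the host, which is the kind of override worth checking when a variable has an unexpected value.
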
16:28:01 <bodanel> do you use ansible tower or AWX
16:28:02 <bodanel> ?
16:28:05 <mobrien> The precedence for these can be seen here https://docs.ansible.com/ansible/latest/user_guide/playbooks_variables.html#understanding-variable-precedence
16:28:15 <mobrien> bodanel, not at the moment
16:28:24 <mobrien> possibly in the future
16:28:51 <mobrien> Secret variables are held in a separate ansible repo which only members of the sysadmin-main group have access to.
16:29:05 <mobrien> This repo is hosted on the batcave and is explicitly referenced in the required playbooks.
16:29:21 <mobrien> All the playbooks are in the playbooks dir, there are a few admin type playbooks in the base of this directory with most of the rest in subdirectories depending on type.
16:29:35 <mobrien> For example there is an openshift-apps directory, no prizes for guessing what's housed there.
16:30:04 <mobrien> All the roles are in the roles dir, which is where the bulk of the work actually gets done. Roles are often used by multiple different playbooks, so they must be written in as generic a way as possible, and care must be taken to tag tasks correctly.
16:30:42 <mobrien> tags can be specified at run time to only run certain tasks, hence their importance
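
To illustrate why tagging matters, here is a rough Python model of tag selection at run time. The task names and tags are made up, and the sketch deliberately ignores Ansible's special tags (`always`, `never`, `tagged`, `untagged`); it only shows that once tags are requested, tasks without a matching tag are skipped.

```python
# Hypothetical tasks; names and tags are invented for illustration.
TASKS = [
    {"name": "install packages", "tags": ["packages"]},
    {"name": "deploy config file", "tags": ["config"]},
    {"name": "restart service", "tags": ["config", "service"]},
    {"name": "untagged cleanup"},
]


def select_tasks(tasks, requested_tags=None):
    """Return the names of tasks that would run for the requested tags."""
    if not requested_tags:
        # No tags requested: every task runs.
        return [task["name"] for task in tasks]
    wanted = set(requested_tags)
    # Tags requested: only tasks sharing at least one tag run.
    return [task["name"] for task in tasks if wanted & set(task.get("tags", []))]
```

With `select_tasks(TASKS, ["config"])` only the two config-tagged tasks are selected, so a missing or wrong tag on a task means it is silently left out of a tagged run.
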
16:30:57 <bodanel> what's the process of making a change: branch and pull request ?
16:31:01 <mobrien> There are a number of templates in a number of places in the repo which contain the bulk of our server configuration, usually in the related role. It is best to do a git grep if you are looking for something specific in these.
16:31:19 <mobrien> bodanel, to make PRs just create a fork to make changes and create a PR against the main branch.
16:31:57 <Zlopez[m]> bodanel: Here is the link to repository https://pagure.io/fedora-infra/ansible
16:32:02 <bodanel> thks
16:32:34 <ComputerKid[m]> Do you guys ever use ad-hoc ansible commands?
16:32:37 <mobrien> Ansible is the source of truth for our infra so a lot of questions about the infra can be answered here.
16:32:55 <nirik> yep. all the time. :)
16:33:32 <mobrien> ComputerKid[m], for one-off things, but generally we like to keep everything in playbooks for idempotence
16:33:52 <ComputerKid[m]> That makes sense
16:34:17 <mobrien> if it is to gather info, ansible adhoc is the go-to
16:34:28 <nirik> I often use ad-hoc things when gathering information or doing things to large groups (like our builders)
16:34:54 <mobrien> That is most of the config covered, any questions on that or will I move onto how to run them?
16:35:10 <mobrien> or nirik if I left anything important out?
16:35:34 <ComputerKid[m]> Do you ever deal with needing to run actions on different platforms with playbooks? And if so, how?
16:35:43 <ComputerKid[m]> I imagine it's all fedora...
16:35:58 <mobrien> ComputerKid[m], we have a lot of el8 hosts
16:36:14 <mobrien> So in the playbooks you'll see when statements to cover this
16:36:21 <mobrien> I'll find an example
16:36:41 <ComputerKid[m]> :D
16:36:59 <nirik> yeah, fedora, el7, el8, lots of different arches
16:37:07 <mobrien> https://pagure.io/fedora-infra/ansible/blob/main/f/roles/basessh/tasks/main.yml#_46
16:38:02 <mobrien> This is where the gather_facts in ansible really comes to the fore
16:38:37 <mobrien> there are a number of different setups you can cater for that way
16:38:55 <ComputerKid[m]> So does that act as an if statement on `make sure python3-libselinux is installed`?
16:39:20 <mobrien> Yep, `when` is `if` in ansible language
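
As a rough picture of how a `when` clause uses gathered facts, the Python below treats the facts as a dictionary and the condition as an ordinary `if`. The fact names `ansible_distribution` and `ansible_distribution_major_version` are real Ansible facts, but the condition itself (Fedora, or EL 8 and later) is an assumption made for illustration and is not copied from the linked basessh task.

```python
def should_install_python3_libselinux(facts):
    """Hypothetical `when` condition: Fedora, or EL 8 and later."""
    distro = facts.get("ansible_distribution")
    # Ansible reports the major version as a string, e.g. "8".
    major = int(facts.get("ansible_distribution_major_version", 0))
    if distro == "Fedora":
        return True
    return distro in ("RedHat", "CentOS") and major >= 8


# Example fact sets, shaped like what fact gathering might report.
FEDORA_HOST = {"ansible_distribution": "Fedora", "ansible_distribution_major_version": "33"}
EL7_HOST = {"ansible_distribution": "CentOS", "ansible_distribution_major_version": "7"}
```

Here the task would run on `FEDORA_HOST` and be skipped on `EL7_HOST`, which is how one playbook can cover fedora, el7 and el8 hosts.
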
16:39:45 <ComputerKid[m]> That's cool. Thanks for explaining
16:40:14 <mobrien> no problem, any other questions don't hesitate to ask
16:40:37 <ComputerKid[m]> Who uses these playbooks?
16:40:48 <bodanel> everything is written in house or do you also use roles from galaxy ?
16:41:02 <mobrien> All in house as far as I know
16:41:19 <bodanel> nice
16:41:34 <ComputerKid[m]> 1+ nice
16:41:42 <mobrien> Most people involved in the infra will run a playbook at some stage, or at least contribute and have someone else run it for them
16:41:48 <dtometzki> cool
16:42:28 <ComputerKid[m]> How do you become part of infra?
16:42:31 <mobrien> So on to how they are run ...
16:42:43 <mobrien> The playbooks must be run from the batcave as there are some hardlinks such as the private repo mentioned earlier.
16:43:00 <mobrien> The batcave is like a bastion server in our infra
16:43:04 <bodanel> batcave i suppose is a hardened server
16:43:14 <mobrien> bodanel, exactly
16:43:28 <austinpowered> The master.yml playbook says it has all playbooks and gets run with tags. Do you have scripts that call master.yml with given tags for certain tasks?
16:43:31 <nirik> it also has the access to everything else via ssh that ansible needs.
16:43:57 <ComputerKid[m]> I suppose Infra SSHes into it?
16:44:04 <nirik> austinpowered: we renamed it recently actually, it's 'main.yml' now. ;)
16:44:32 <mobrien> austinpowered, it's more that it is run with the desired tag specified, based on what the runner would like to do
16:44:37 <nirik> there's a cron job that runs nightly that runs that with --check --diff mode... so we can see what it would have changed if it changed things.
16:44:38 <austinpowered> I guess I need to do another pull. ;)
16:44:43 <bodanel> any issues when moving from master to main? I've heard some horror stories
16:45:16 <mobrien> bodanel, I wasn't involved in it personally but seemed well handled
16:46:15 <bodanel> thks
16:46:15 <mobrien> To access the batcave you would need some permissions in fas, I can never remember which ones but nirik will be able to answer that
16:46:19 <nirik> not too much hassle
16:46:38 <nirik> you have to be in a sysadmin-something group or fi-apprentice
16:47:05 <mobrien> The playbooks can be run using the rbac-playbook command (Role Based Access Control https://bitbucket.org/tflink/ansible_utils/src/master/)
16:47:22 <mobrien> This command assumes that you are in the playbooks directory of the ansible repo so for example to run the sundries playbook which is located in `playbooks/groups/sundries.yml` you would run:
16:47:22 <mobrien> sudo -i rbac-playbook groups/sundries.yml
16:47:39 <mobrien> Permissions to run specific playbooks are controlled by which FAS groups a user is in. They are normally sysadmin-*, e.g. sysadmin-mbs to run the mbs playbook.  Although not always that obvious :)
16:48:04 <mobrien> rbac-playbook accepts flags such as (-t) to specify to run only tasks with the tag specified or (-l) to limit to certain hosts.
16:48:04 <mobrien> It does not however allow for extra vars to be passed at run time for security reasons.
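
The access model and flags just described could be pictured roughly as follows. This is a sketch under stated assumptions and not the real rbac-playbook code: the playbook-to-group mapping, the `groups/mbs.yml` path, the comma-joined tag list and the position of the flags before the playbook path are all hypothetical. Only the `-t` and `-l` flags and the `sudo -i rbac-playbook groups/sundries.yml` form come from the discussion above.

```python
# Hypothetical mapping of playbooks to the FAS groups allowed to run them.
PLAYBOOK_GROUPS = {
    "groups/sundries.yml": {"sysadmin-main"},
    "groups/mbs.yml": {"sysadmin-main", "sysadmin-mbs"},
}


def can_run(user_groups, playbook):
    """True if the user is in at least one group allowed for the playbook."""
    allowed = PLAYBOOK_GROUPS.get(playbook, set())
    return bool(allowed & set(user_groups))


def build_command(playbook, tags=None, limit=None):
    """Assemble an rbac-playbook invocation as an argument list."""
    cmd = ["sudo", "-i", "rbac-playbook"]
    if tags:
        cmd += ["-t", ",".join(tags)]  # assumed: tags joined with commas
    if limit:
        cmd += ["-l", limit]  # restrict the run to certain hosts
    cmd.append(playbook)
    return cmd
```

For example, `build_command("groups/sundries.yml", tags=["config"])` yields the argument list for a tag-limited run of the sundries playbook. There is intentionally no parameter for extra vars, mirroring the restriction mentioned above.
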
16:48:26 <dtometzki> what is the process to get the group permissions?
16:49:06 <mobrien> If you require permissions to run a playbook raise a ticket on the fedora-infra tracker https://pagure.io/fedora-infrastructure/issues and specify a reason you need the permission
16:49:34 <mobrien> that goes for permissions in general in the infra
16:49:51 <dtometzki> ah ok
16:50:08 <bodanel> with the fi-apprentice group can you run the ad-hoc setup module to look at certain facts ?
16:50:27 <nirik> nope...
16:50:43 <nirik> fi-apprentice has read-only access. they can't run playbooks or anything.
16:51:19 <bodanel> running setup is not considered read-only
16:51:21 <bodanel> I got it
16:52:07 <bodanel> i was thinking since it only runs the setup module it gathers facts
16:52:15 <bodanel> and just displays them
16:52:27 <bodanel> but I understand the logic behind denying that
16:52:54 <mobrien> That's about all I have. Any questions?
16:53:02 <mobrien> Or additions nirik?
16:53:44 <bodanel> not from me. If ansible is considered the source of truth, it is a good place to start getting your hands dirty
16:53:56 <nirik> yeah, but it means you can ssh to the host and gather those things... so it's hard to allow that and prevent anything else. ;)
16:53:59 <Zlopez[m]> We could switch to "Learning topic discussion" topic
16:54:37 <mobrien> Zlopez[m], haha, sorry my bad
16:54:53 <mobrien> should have done that at the start
16:54:55 <ComputerKid> Thanks guys, this has been educational
16:55:30 <austinpowered> I know there is an ansible command to list tags. But how do you run that against this repo?
16:55:39 <mobrien> no prob ComputerKid, feel free to reach out on #fedora-admin if you have any questions in the future
16:56:05 <nirik> austinpowered: you should just be able to check out the repo and run it against that...
16:56:35 <Zlopez[m]> mobrien: We are in correct topic for the learning topic, but for the discussion we have another, so it's easier to read in log
16:56:46 <mobrien> austinpowered, generally it's a good idea to look at some of the tasks that you wish to run to ensure the tag does what you think it should
16:56:47 <austinpowered> I have the repo. I'll look at using the command against a directory
16:57:03 <mobrien> #topic Learning topic discussion
16:57:21 <austinpowered> out of time - thanks again
16:58:05 <mobrien> I guess we ran over so we don't really have time for open floor. does anyone have anything quick they wish to discuss?
16:58:11 <mobrien> #topic Open Floor
16:58:34 <darknao> I have, but I'll ask in the main channel instead
16:58:44 <mobrien> thanks darknao
16:58:52 <mobrien> #endmeeting