fedora_coreos_meeting
LOGS
16:29:52 <jlebon> #startmeeting fedora_coreos_meeting
16:29:52 <zodbot> Meeting started Wed Jun 30 16:29:52 2021 UTC.
16:29:52 <zodbot> This meeting is logged and archived in a public location.
16:29:52 <zodbot> The chair is jlebon. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:29:52 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:29:52 <zodbot> The meeting name has been set to 'fedora_coreos_meeting'
16:30:05 <lucab> .hi
16:30:06 <zodbot> lucab: lucab 'Luca Bruno' <lucab@redhat.com>
16:30:06 <dustymabe> .hi
16:30:09 <zodbot> dustymabe: dustymabe 'Dusty Mabe' <dusty@dustymabe.com>
16:30:12 <jlebon> #topic roll call
16:30:16 <jbrooks> .hello jasonbrooks
16:30:18 <zodbot> jbrooks: jasonbrooks 'Jason Brooks' <jbrooks@redhat.com>
16:30:20 <fifofonix> .hi
16:30:24 <jlebon> #chair lucab dustymabe jbrooks fifofonix
16:30:24 <zodbot> Current chairs: dustymabe fifofonix jbrooks jlebon lucab
16:30:24 <zodbot> fifofonix: fifofonix 'Fifo Phonics' <fifofonix@gmail.com>
16:30:31 <jaimelm> .hello2
16:30:31 <bgilbert> .hi
16:30:31 <zodbot> jaimelm: jaimelm 'Jaime Magiera' <jaimelm@umich.edu>
16:30:34 <zodbot> bgilbert: bgilbert 'Benjamin Gilbert' <bgilbert@backtick.net>
16:30:43 <miabbott> .hello miabbott
16:30:44 <zodbot> miabbott: miabbott 'Micah Abbott' <miabbott@redhat.com>
16:30:56 <jlebon> #chair jaimelm bgilbert miabbott
16:30:56 <zodbot> Current chairs: bgilbert dustymabe fifofonix jaimelm jbrooks jlebon lucab miabbott
16:31:01 <davdunc> .hello2
16:31:02 <zodbot> davdunc: davdunc 'David Duncan' <davdunc@amazon.com>
16:31:02 <skunkerk> .hello sohank2602
16:31:04 <zodbot> skunkerk: sohank2602 'Sohan Kunkerkar' <skunkerk@redhat.com>
16:31:35 <jlebon> #chair davdunc skunkerk
16:31:35 <zodbot> Current chairs: bgilbert davdunc dustymabe fifofonix jaimelm jbrooks jlebon lucab miabbott skunkerk
16:32:26 <travier> .hello siosm
16:32:29 <zodbot> travier: siosm 'Timothée Ravier' <travier@redhat.com>
16:33:09 <jlebon> #chair travier
16:33:09 <zodbot> Current chairs: bgilbert davdunc dustymabe fifofonix jaimelm jbrooks jlebon lucab miabbott skunkerk travier
16:33:17 <jlebon> ok, 30s more :)
16:34:12 <darkmuggle> .hello2
16:34:13 <zodbot> darkmuggle: darkmuggle 'None' <me@muggle.dev>
16:34:20 <jlebon> #chair darkmuggle
16:34:20 <zodbot> Current chairs: bgilbert darkmuggle davdunc dustymabe fifofonix jaimelm jbrooks jlebon lucab miabbott skunkerk travier
16:34:25 <jlebon> alrighty, welcome all! let's get started
16:34:33 <jlebon> #topic Action items from last meeting
16:34:41 <jlebon> * dustymabe to create ticket about making single node optimizations that don't enhance kubernetes and possible ways to integrate better with k8s distributions
16:34:53 <jlebon> #info dustymabe filed https://github.com/coreos/fedora-coreos-tracker/issues/880
16:35:00 <dustymabe> 👆
16:35:05 <jlebon> and that's all!
16:35:10 <travier> 👍
16:35:32 <jlebon> ok, we've got quite a few tickets today, so it's not likely we'll get through all of them
16:35:35 <jlebon> let's start with...
16:35:49 <jlebon> #topic Potential security vulnerability related to metadata.google.internal usage on GCP
16:35:53 <jlebon> #link https://github.com/coreos/fedora-coreos-tracker/issues/885
16:36:04 <jlebon> travier: want to introduce this one?
16:36:42 <travier> Sure
16:37:41 <travier> This is a report about a potential issue with the GCP internal metadata domain name
16:38:18 <travier> The original report does not mention nor concern FCOS but this might impact us wrt Afterburn or Ignition
16:38:39 <travier> Discussion is ongoing regarding whether or not we are impacted and if we should make changes
16:39:34 <travier> This would only be of concern if an attacker has local network access in your cluster (compromised host or container with local network access) and only on node first boot.
16:40:08 <travier> EOI (end of introduction)
16:40:42 <bgilbert> (or if the attacker has remote network access and you're not using the GCE firewall to block DHCP packets)
16:41:02 * jaimelm will use that in conversation "Hi Bob, this is Jane. EOI."
16:41:19 <jlebon> should we switch afterburn and ignition to use IPs anyway?
16:42:04 <travier> If they are guaranteed to be stable then I guess this would remove any potential DNS issue from first boot which could be a good thing?
16:42:12 <bgilbert> (also, do we not run Afterburn SSH key fetching on every boot on GCP?)
16:42:38 <lucab> FWIW, I'd like to see GCP exposing an HTTPS endpoint and us using whatever valid SAN is in there
16:43:08 <lucab> the canonical endpoint is the DNS name: https://cloud.google.com/compute/docs/storing-retrieving-metadata
16:44:03 <darkmuggle> Missing in this conversation is that the vulnerability is specific to ISC DHCP Client
16:44:11 <lucab> yes we run Afterburn SSH key fetching on each boot on GCP
16:44:11 <jaimelm> I agree with darkmuggle's comment
16:44:12 <darkmuggle> We're using NetworkManager, which used a different XID calculation.
16:44:21 <jaimelm> in the ticket
16:45:07 <bgilbert> there are multiple factors mitigating the vulnerability for FCOS.  it may still be worth switching to using the IP for defense in depth.
16:46:30 <darkmuggle> Do we have any guarantee that the IP will not change?
16:48:12 <bgilbert> currently unknown, I think.  we should probably not expect to come to any conclusions in this meeting.
16:48:46 <darkmuggle> If a takeover is less likely with NM than with ISC DHCP, then the prudent course would be to see what GCP does.
16:48:56 <jlebon> ok, so it sounds like we should just stand by for now
16:49:11 <jlebon> we should also see what the Security Team says
16:49:21 <travier> +1
16:49:39 <lucab> let's track the "switch to IP" as a ticket on either Afterburn or Ignition?
16:50:13 <jlebon> or maybe another tracker ticket since we'd probably want to do the same to both?
16:50:19 <jaimelm> ^^
16:50:30 <jlebon> bgilbert: could you file that?
16:50:59 <bgilbert> I could, but I was assuming we'd use the existing ticket for that
16:51:26 <bgilbert> either we should switch to IP to mitigate this type of vulnerability, or we should not switch because it isn't worth doing to mitigate this type of vulnerability
16:52:17 <bgilbert> actually, I've just talked myself out of that, nvm
16:52:17 <jlebon> ahh ok, wasn't sure if lucab was implying it'd be something worth doing on its own
16:52:31 <lucab> fair, although I would have assumed this ticket to be closed in a few days once we confirm that NM is unaffected
16:52:39 <bgilbert> #action bgilbert to file ticket to consider switching Ignition/Afterburn to reference GCE metadata server by IP
16:52:48 <jlebon> ok cool :)
16:52:54 <jlebon> let's move on
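The "reference GCE metadata server by IP" idea from the action item above can be sketched as follows. This is a minimal illustration only, not how Afterburn or Ignition are implemented: the helper name `build_metadata_request` is hypothetical, and the link-local address 169.254.169.254 is an assumption taken from GCE's documentation rather than from this discussion, where its long-term stability was left as an open question. The request is only built, never sent, so nothing here depends on DNS or on the DHCP-provided resolver.

```python
from urllib.request import Request

# Assumption: documented link-local address of the GCE metadata server.
# The meeting did not confirm that this address is guaranteed to stay stable.
METADATA_IP = "169.254.169.254"
# Canonical DNS name mentioned in the discussion.
METADATA_HOST = "metadata.google.internal"


def build_metadata_request(path: str, use_ip: bool = True) -> Request:
    """Build (but do not send) a GCE metadata request.

    Hypothetical helper for illustration. With use_ip=True the URL
    contains the literal IP, so no DNS lookup is needed to reach it.
    """
    authority = METADATA_IP if use_ip else METADATA_HOST
    url = f"http://{authority}/computeMetadata/v1/{path.lstrip('/')}"
    # The metadata server expects this header on every request.
    return Request(url, headers={"Metadata-Flavor": "Google"})


if __name__ == "__main__":
    by_ip = build_metadata_request("instance/hostname")
    by_name = build_metadata_request("instance/hostname", use_ip=False)
    print(by_ip.full_url)
    print(by_name.full_url)
```

The trade-off raised in the meeting still applies: the IP form avoids name resolution during first boot, while the DNS form follows whatever endpoint the platform documents as canonical.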
16:53:09 <jlebon> #topic tracker: Fedora 35 changes considerations
16:53:12 <jlebon> #link https://github.com/coreos/fedora-coreos-tracker/issues/856
16:53:38 <jlebon> dustymabe: want to discuss that one?
16:54:37 <jlebon> hmm, he might be AFK.  i'll take it
16:54:44 <jaimelm> he appears to be missing in action. let's move on?
16:54:56 <jlebon> dustymabe filed a separate tracker issue for each f35 change we should look into
16:55:22 <dustymabe> here
16:55:23 <jlebon> so now we need to actually look at each of them and assess the situation
16:55:39 <dustymabe> yep. I filed sub tickets for each item we wanted to discuss further
16:55:56 <jlebon> we were thinking of just assigning them all to different people to share the load
16:56:00 <dustymabe> can we get volunteers to own running the investigations/discussion to ground?
16:56:27 <dustymabe> #link https://github.com/coreos/fedora-coreos-tracker/issues?q=is%3Aissue+is%3Aopen+label%3AF35-changes
16:57:16 <jlebon> dustymabe: there's so many, maybe it's easier to build a pool of people, and then we randomly assign them?
16:57:38 <darkmuggle> I'll take https://github.com/coreos/fedora-coreos-tracker/issues/873
16:57:40 <dustymabe> yeah, though I do think some of us here are experts in some topics more than others
16:57:56 <dustymabe> i.e. rpm related things - clearly someone who works on rpm-ostree would be more qualified
16:58:18 <dustymabe> jlebon: whatever you think works best
16:58:45 <jlebon> ok i took the RPM 4.17 and the "Fedora Linux" one
16:58:51 * jaimelm will take 872
16:59:00 * dustymabe will take #879
16:59:00 <darkmuggle> I'll take the SSSD one 875 too
16:59:09 <jlebon> bgilbert, lucab, darkmuggle, walters, travier: could each of you at least take one?
16:59:30 <jlebon> darkmuggle, jaimelm: +1
17:00:05 <dustymabe> spoken for #873, 877, 874, 872, 879, 875
17:00:39 <dustymabe> left: 876, 878
17:00:51 <jlebon> jaimelm: what's your GitHub handle again?
17:01:27 <dustymabe> #action jaimelm to investigate F35: CHANGE: CompilerPolicy Change #872
17:01:37 <lucab> I'm taking #876
17:01:37 <jaimelm> JaimeMagiera
17:01:44 <dustymabe> #action darkmuggle to investigate F35: CHANGE: DNS Over TLS #873
17:02:03 <dustymabe> #action jlebon to investigate F35: CHANGE: "Fedora Linux" in /etc/os-release #874
17:02:23 <dustymabe> #action darkmuggle to investigate F35: CHANGE: More flexible use of SSSD fast cache for local users #875
17:02:46 <dustymabe> #action lucab to investigate F35: CHANGE: OpenSSL3.0 #876
17:03:05 <dustymabe> #action jlebon to investigate F35: CHANGE: RPM 4.17 #877
17:03:21 <dustymabe> #action dustymabe to investigate F35: CHANGE: Remove nscd #879
17:03:31 <dustymabe> i think the only one not spoken for is #878
17:03:34 <jlebon> walters: i assigned you #878. feel free to unassign yourself if you're not interested :)
17:03:39 <dustymabe> F35: CHANGE: DNF/RPM Copy on Write enablement for all variants
17:04:03 <dustymabe> #action walters to investigate F35: CHANGE: DNF/RPM Copy on Write enablement for all variants #878
17:04:04 <jlebon> i think that's all of them
17:04:10 <dustymabe> \o/
17:04:17 <dustymabe> there's probably new ones we need to add to the list
17:04:20 <dustymabe> but that's a good start
17:04:27 <dustymabe> thanks all for volunteering
17:04:30 <jlebon> thank you all!
17:04:32 <jlebon> ok, let's move on
17:04:44 <jlebon> #topic Differing behavior for aarch64 vs x86_64 disk images
17:04:49 <jlebon> #link https://github.com/coreos/fedora-coreos-tracker/issues/855
17:05:05 <jlebon> dustymabe: want to (re-)introduce this one?
17:05:25 <dustymabe> mostly just read https://github.com/coreos/fedora-coreos-tracker/issues/855#issuecomment-867131221
17:05:41 <dustymabe> and then please vote here in chat with A or B
17:06:13 <dustymabe> A => match partition numbers, enhance documentation
17:06:19 <dustymabe> B => don't make any changes, enhance documentation
17:06:36 <bgilbert> B
17:06:41 <jaimelm> B
17:06:52 <darkmuggle> B
17:06:54 <dustymabe> A
17:06:59 <jlebon> B
17:07:30 <travier> A
17:07:59 * dustymabe sets timer to 30 seconds
17:08:20 <miabbott> B
17:08:36 <jlebon> i'd be open to revisit if multiple users get sufficiently confused by this
17:08:55 <lucab> (I'm skipping as I don't have a good idea)
17:09:24 <dustymabe> #agreed we will keep our partitioning setup for different architectures as it currently is and enhance our documentation for #855
17:09:24 <jaimelm> B: 5 A:2 Abstain:1
17:09:41 <travier> 👍
17:09:41 <dustymabe> all good with the agreed? otherwise I'll undo
17:09:56 <jlebon> +1
17:10:01 * dustymabe yields floor back to jlebon
17:10:13 <jlebon> thanks dustymabe!
17:10:56 <jlebon> #topic policy: setting single node defaults that don't enhance kubernetes
17:10:59 <jlebon> #link https://github.com/coreos/fedora-coreos-tracker/issues/880
17:11:38 <jlebon> dustymabe: do you want to introduce this one too, or should I? i think most of us are familiar since we discussed it last week already
17:11:48 * jaimelm is familiar
17:12:49 <dustymabe> jlebon: if you don't mind, you
17:13:07 <jlebon> ack ok
17:14:43 <jlebon> so I think at this point, it boils down to: "do we want to try to adapt services which conflict with k8s dynamically, or do we leave it to the k8s stack to configure FCOS accordingly"
17:15:56 <jlebon> to take the example of systemd-oomd, should we set it up so that e.g. if we detect k8s is in use, then we default to disabled? and also, how does that detection work?
17:16:54 <dustymabe> i think from our previous discussion we were leaning towards allowing ourselves to make changes that might not be useful for k8s
17:17:16 <dustymabe> the open question in my mind was how exactly (as jlebon mentioned)
17:17:27 <travier> I'd prefer if we don't do things for use cases we don't control as it will be really hard for k8s users to follow that (as we don't know which version of k8s is run)
17:17:31 <dustymabe> do we basically enable them by default and provide docs for kube integrators
17:17:32 <darkmuggle> I'd vote that we do containers really well, but provide the sugar for the Kube world.
17:17:35 <jlebon> bgilbert raised a good point last time which was that if we default to possibly breaking k8s changes, then we need a policy for rolling out those changes
17:18:09 <dustymabe> or do we try to auto detect and behave differently according to "kube or not"
17:18:13 <darkmuggle> We will never be able to engineer well against the unknowable versions and packages.
17:18:14 <travier> yes, I think that potentially k8s breaking changes should be new install
17:18:29 <travier> new install only*
17:19:13 <jlebon> right agreed re. version skew. we're just not tightly coupled enough to make sure we don't mess up
17:19:14 <travier> Documenting the unwanted features for k8s users with the versions they were introduced for operators should be reasonably doable
17:19:29 <travier> "unwanted"
17:20:37 <jaimelm> I don't want to speak for the engineering folks for OKD, but we do already diverge and plan to diverse where necessary from the standard FCOS. That's essentially what the machine-os repo does.
17:20:46 <jaimelm> diverge&
17:20:52 <travier> (I'm not 100% convinced yet we should focus no single node however)
17:21:08 <walters> darkmuggle: +1 to that
17:21:09 <travier> on*
17:21:28 <davdunc> yea I think the value is at scale.
17:21:35 <darkmuggle> I'd thought of another bad idea -- Butane plugins akin to Terraform's -- to allow for the Kube world to make it easy for their users.
17:21:48 <jaimelm> heh
17:21:53 <travier> If we take the reverse argument, it's also easy to document which options would improve single node use cases in a doc page
17:22:19 <jaimelm> darkmuggle: you have good bad ideas (at first blush)
17:22:24 <darkmuggle> With Butane plugins the Cloud vendors and Kube vendors could write their own "sugar" for seeting defaults.
17:22:37 <dustymabe> travier: I think my counter to that is:
17:22:39 <jlebon> travier: but then, we're essentially neither defaulting to single or k8s
17:22:47 <darkmuggle> s/seeting/setting/
17:22:48 <jlebon> so it's the worst of both worlds
17:22:55 <travier> true
17:22:56 <dustymabe> for kube usually people are running some advanced ignition config, so it's not as hard to bring in a few extra bits
17:23:20 <dustymabe> for single node, they're usually starting much more "from scratch"
17:23:26 <jaimelm> much more
17:23:40 <bgilbert> the problem with the Butane approach is version skew.  there's no relationship between Butane versions and FCOS versions.  so the output of particular Butane sugar really can't change over time.
17:24:19 <bgilbert> I'd still lean toward an "API version" kind of interface
17:24:33 <jlebon> bgilbert: right, I don't think we should do sugar. but having helper butane configs we keep up to date for the latest FCOS should be fine, right?
17:24:52 <bgilbert> jlebon: sort of?
17:25:05 <travier> I don't think we should do sugar either. This would make Butane configs unknowable without the corresponding Butane binary
17:25:12 <darkmuggle> the plugin would support an "API"
17:25:40 <bgilbert> jlebon: if some k8s distro is using an old Ignition config, how do we know not to start enabling some conflicting behavior?
17:25:46 <bgilbert> darkmuggle: how does an API help?
17:26:05 <travier> Having Butane snippets in the doc alongside the notes about features introduced in version X would cover that better from my point of view
17:26:07 <dustymabe> maybe we should limit discussion to.. allow single node specific defaults to be applied? yes or no
17:26:22 <dustymabe> then we can probably talk about the best way to help the kube case
17:26:27 <jlebon> bgilbert: anytime you rebase your bootimages, you have to be aware of deprecation windows you're burning through
17:26:39 <jaimelm> travier: that can get messy
17:26:52 <darkmuggle> bgilbert: let's table the butane plugin idea as out of scope for this meeting, and I'll propose something in the tracker or discuss in higher bandwidth
17:27:03 <jaimelm> dustymabe: that would rein things in a bit. Probably a good idea.
17:27:17 <jaimelm> we're almost at time.
17:27:19 <jlebon> dustymabe: good idea
17:27:32 <bgilbert> jlebon: you should always be rebasing your bootimages.  this isn't RHCOS :-)
17:27:39 <jaimelm> ouch
17:27:42 <travier> +1 for single node focus if we carefully document things to disable for k8s
17:27:53 <miabbott> i think the version skew complexity for k8s has already forced us to be more supportive of the single node case, so let's stop pretending we can support both single node/k8s reliably and focus on single node
17:28:36 <bgilbert> given that we're almost at time, I'd propose tabling.  seems like we have more discussion to do in the ticket
17:28:37 <jlebon> bgilbert: heh sure :)  but then that implies that your old Ignition config is not guaranteed to work forever on new machines
17:29:00 <jlebon> yeah ok, let's keep chatting in the tracker!
17:29:01 <dustymabe> any takeaways from this meeting?
17:29:04 <jaimelm> bgilbert +1
17:29:40 <jaimelm> we need to maybe prioritize meeting items differently. Something more than just the label. Seems like we keep pushing the edge.
17:29:54 <jlebon> dustymabe: hard to tell at this point :|
17:30:01 <jlebon> ok, let's do a quick open floor
17:30:14 <jlebon> #topic Open Floor
17:30:29 <jlebon> anything anyone wants to bring up?
17:30:32 <dustymabe> jaimelm: is there a topic you want to see that hasn't been discussed (starved by other topics)?
17:31:46 <dustymabe> jlebon: maybe discuss the video meeting. i.e. not having one next week because of people on vacations and such
17:31:47 <jaimelm> I'll add the item (documentation style) for next meeting.
17:31:57 <jlebon> dustymabe: ahhh yes yes
17:32:02 <dustymabe> we can still have an IRC meeting
17:32:07 <dustymabe> but probably not video
17:32:10 <jaimelm> move the video meeting or skip?
17:32:20 <jlebon> move one week
17:32:23 <jaimelm> (for this coming month)
17:32:30 <jaimelm> support
17:33:07 <dustymabe> that's all I had for open floor
17:33:14 <jlebon> dustymabe: that sounds reasonable to me, thanks for bringing it up!
17:33:20 <jlebon> ok, closing in 30s
17:33:49 <jlebon> #endmeeting