fedora_coreos_meeting
LOGS
16:30:29 <dustymabe> #startmeeting fedora_coreos_meeting
16:30:29 <zodbot> Meeting started Wed May 10 16:30:29 2023 UTC.
16:30:29 <zodbot> This meeting is logged and archived in a public location.
16:30:29 <zodbot> The chair is dustymabe. Information about MeetBot at https://fedoraproject.org/wiki/Zodbot#Meeting_Functions.
16:30:29 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:30:29 <zodbot> The meeting name has been set to 'fedora_coreos_meeting'
16:30:33 <dustymabe> #topic roll call
16:30:35 <dustymabe> .hi
16:30:37 <zodbot> dustymabe: dustymabe 'Dusty Mabe' <dusty@dustymabe.com>
16:30:49 <travier> .hello siosm
16:30:50 <zodbot> travier: siosm 'Timothée Ravier' <travier@redhat.com>
16:30:54 <mnguyen> .hello mnguyen
16:30:55 <zodbot> mnguyen: mnguyen 'Michael Nguyen' <mnguyen@redhat.com>
16:31:06 <dustymabe> #chair travier mnguyen
16:31:06 <zodbot> Current chairs: dustymabe mnguyen travier
16:31:12 <jlebon> .hello2
16:31:13 <zodbot> jlebon: jlebon 'None' <jonathan@jlebon.com>
16:32:34 <dustymabe> #chair jlebon
16:32:34 <zodbot> Current chairs: dustymabe jlebon mnguyen travier
16:33:02 <bgilbert> .hi
16:33:03 <zodbot> bgilbert: bgilbert 'Benjamin Gilbert' <bgilbert@backtick.net>
16:33:30 <dustymabe> #chair bgilbert
16:33:30 <zodbot> Current chairs: bgilbert dustymabe jlebon mnguyen travier
16:33:38 <dustymabe> ok let's get started soon
16:34:25 <dustymabe> #topic Action items from last meeting
16:34:31 <dustymabe> * dustymabe to invite grub and kernel folks to discuss this topic further
16:34:37 <dustymabe> #info dusty is working to get something scheduled - probably for June
16:34:46 <dustymabe> turns out some vacations made this hard to schedule for May
16:34:49 <spresti[m]> .hello2
16:34:50 <zodbot> spresti[m]: Sorry, but user 'spresti [m]' does not exist
16:35:10 <dustymabe> #chair spresti[m]
16:35:10 <zodbot> Current chairs: bgilbert dustymabe jlebon mnguyen spresti[m] travier
16:35:32 <ravanelli> .hi
16:35:32 <zodbot> ravanelli: ravanelli 'Renata Ravanelli' <renata.ravanelli@gmail.com>
16:35:38 <dustymabe> if anyone thinks the dbx update topic is urgent enough that we can't wait til june, let me know
16:36:33 <dustymabe> #topic New Package Request: ipcalc
16:36:39 <dustymabe> #link https://github.com/coreos/fedora-coreos-tracker/issues/1460
16:37:00 <dustymabe> jlebon: did we get the necessary information from the folks to unblock discussion here?
16:37:47 <jlebon> dustymabe: maybe
16:38:18 <jlebon> i see this more as a nice-to-have. it makes writing some network scripts easier
16:38:32 <jlebon> ideally, you don't have complex network scripts, but that's the reality we live in :)
16:39:42 <dustymabe> jlebon: any particular direction you lean?
16:39:57 <jlebon> leaning towards baking it in
16:40:02 <dustymabe> anybody else with opinions?
16:40:29 <jlebon> there's a bit of a soft bootstrapping problem where non-bash bits are usually fetched from the network (though it's possible to drop e.g. go binaries via Ignition too)
16:40:55 <jlebon> so i can certainly see how they got there
16:42:08 <dustymabe> I see colin just commented:
16:42:19 <dustymabe> cw: I'm +0.5 to this...it's small. But I think again ultimately we should be doing more of this stuff from container images, and we're already running containers from bootkube.sh.
16:42:24 <dustymabe> cw: The dispatcher script case...well, ultimately I think anything nontrivial like this needs to do "two phase" initialization anyways where we get a basic network setup going, enough to pull the container image, then re-setup with the dispatcher script in place.
16:42:33 <bgilbert> bootkube.sh isn't an FCOS thing, right?
16:42:57 <spresti[m]> I tend to agree with jlebon's point of view. We can add this, but should we? It feels like it could just exist in bash and not be part of the shipped product?
16:43:37 <jlebon> bgilbert: correct, it's an OCP thing but the pattern isn't uncommon (drop bash script that does the full setup)
16:44:10 <dustymabe> I think I'm squarely in the middle - or maybe +0.1 :)
16:44:28 <jlebon> though TBH I don't know how e.g. Typhoon bootstraps
16:44:54 <dustymabe> I think typhoon focuses more on Cloud environments - so probably not too much custom networking required (but I could be wrong)
16:45:34 <dustymabe> #proposed we'll include ipcalc to aid in some advanced custom networking calculations.
16:45:46 <dustymabe> ^^ to move the conversation forward
16:45:48 <bgilbert> +0.5
16:45:58 <jlebon> +0.75
16:46:03 <quentin9696[m]> or the tool needs to use the interface name instead of the IP address
16:46:19 <dustymabe> #chair quentin9696[m]
16:46:19 <zodbot> Current chairs: bgilbert dustymabe jlebon mnguyen quentin9696[m] spresti[m] travier
16:46:43 <dustymabe> +.25 from me
16:46:49 <travier> how nmstate on this?
16:46:52 <travier> how's*
16:47:02 <dustymabe> walters: was +.5
16:47:09 <dustymabe> travier: I don't understand the question
16:47:16 <jlebon> quentin9696[m]: i don't think that's their issue in this case
16:47:28 <travier> would nmstate help ?
16:47:58 <dustymabe> travier: I don't know. presumably they know about nmstate
16:48:00 <jlebon> the specific use case presented is not exactly about configuring the host network
16:48:17 <jlebon> it's about knowing which of the host IPs to report to the cluster as "the host IP"
16:49:19 <jlebon> based on CIDR configuration provided by the user at installation time
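The use case jlebon describes (picking which host IP to report, based on a user-supplied CIDR) can be sketched as below. This is only an illustration in Python using the standard `ipaddress` module, not the ipcalc tool itself and not the requester's actual script; the function name `pick_host_ip` and all addresses are hypothetical.

```python
import ipaddress


def pick_host_ip(host_ips, cidr):
    """Return the first host address that falls inside the given CIDR, or None."""
    # strict=False accepts a CIDR whose host bits are set, e.g. 192.168.122.40/24.
    network = ipaddress.ip_network(cidr, strict=False)
    for candidate in host_ips:
        addr = ipaddress.ip_address(candidate)
        # Only compare addresses of the same IP version as the network.
        if addr.version == network.version and addr in network:
            return candidate
    return None


# Hypothetical addresses configured on a host with several interfaces.
host_ips = ["10.0.2.15", "192.168.122.40", "fd00::5"]

# Hypothetical CIDR provided by the user at installation time.
print(pick_host_ip(host_ips, "192.168.122.0/24"))
```

A shell script without such a helper has to do the prefix and netmask arithmetic by hand, which is the kind of calculation the package request is about.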
16:49:26 <dustymabe> any further discussion? any "no" votes?
16:50:23 <dustymabe> #agreed we'll include ipcalc to aid in some advanced custom networking calculations.
16:51:03 * dustymabe moves on to the next topic in 30s
16:51:25 <dustymabe> #topic increase size of our /boot partition for new installs
16:51:30 <dustymabe> #link https://github.com/coreos/fedora-coreos-tracker/issues/1465
16:52:05 <dustymabe> so this one is so we can discuss what we think the new size of our /boot partition should be
16:52:17 <dustymabe> and also the steps we need to take (and the challenges to overcome) to get there
16:52:38 <dustymabe> bgilbert: also brings up that we should maybe consider increasing the size of our ESP
16:53:05 <bgilbert> anyone with a separate /var is going to be inconvenienced no matter what size we pick
16:53:09 <dustymabe> there is a Fedora Change to increase the minimum size of the ESP to 500m
16:53:42 <bgilbert> so something like 512 MB ESP and 1 GB /boot seems reasonable for future-proofing IMO
16:54:32 <jlebon> 1G /boot matches anaconda too
16:54:52 <jlebon> (and interestingly, my ESP is 600M)
16:54:53 <dustymabe> I think you're probably right
16:55:00 <copperi> +1
16:55:09 <dustymabe> though it does feel like a large change
16:55:43 <dustymabe> right now ESP is 127 and /boot is 384
16:56:10 <dustymabe> so we'd be going to 3x the current size
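The "3x" figure can be checked with a few lines of arithmetic. The current sizes are the ones dustymabe quotes; the units are assumed to be MiB and "1G" is assumed to mean 1024 MiB, neither of which is stated in the discussion.

```python
# Sizes in MiB (assumed unit). "1G" is assumed to mean 1024 MiB.
current = {"ESP": 127, "/boot": 384}
proposed = {"ESP": 512, "/boot": 1024}

current_total = sum(current.values())    # 127 + 384 = 511
proposed_total = sum(proposed.values())  # 512 + 1024 = 1536
growth = proposed_total / current_total

print(f"current:  {current_total} MiB")
print(f"proposed: {proposed_total} MiB")
print(f"growth:   {growth:.2f}x (+{proposed_total - current_total} MiB)")
```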
16:56:18 <bgilbert> for context, an ESP larger than 257 MB lets us fix the longstanding issue that we're breaking the UEFI spec by using FAT16 in the ESP
16:56:32 <dustymabe> bgilbert: yes, a big win
16:57:25 <dustymabe> anybody think that 512 and 1G are too large ?
16:57:56 <dustymabe> one thing to consider is that anaconda is keeping dual boot use cases in mind
16:58:06 <dustymabe> where we explicitly haven't
16:58:20 <bgilbert> do you mean for the ESP or /boot?
16:58:30 <dustymabe> both I would think
16:58:42 <dustymabe> the change proposal for making the ESP larger I think includes some language about it
16:58:43 <bgilbert> do people share /boot between OSes?
16:58:58 <dustymabe> As the ESP is often shared between Windows and Linux, and also used for firmware updates, and soon to be used by UKIs it's not enough to just allocate a few hundreds of megabytes.
16:59:34 <bgilbert> I'd think firmware update size is a larger factor than bootloader size, but that's a guess
16:59:44 <dustymabe> bgilbert: +1
16:59:47 <bgilbert> also, does the UKI part affect us?  would we start putting kernels in the ESP?
17:00:15 <dustymabe> it's honestly been a while since I dual-booted so not sure on the "is /boot shared" question
17:00:42 <dustymabe> though I doubt a windows+linux dual boot would have any sharing in /boot
17:00:58 <bgilbert> travier walters: ^ UKI question
17:01:26 <travier> (I'm sorry, I'm dual meeting, can not answer)
17:01:48 <dustymabe> bgilbert: good question though :)
17:02:09 <dustymabe> I think I'm good with 512 for ESP
17:02:19 <dustymabe> (at least that gets us in line with the fedora change proposal)
17:02:50 <dustymabe> and I guess 1G for boot is OK too
17:03:24 <dustymabe> should we try to ink that before we move on to other discussion (challenges)?
17:03:31 <bgilbert> we should confirm the UKI implications before committing to that sizing
17:03:53 <dustymabe> bgilbert: i.e. we may want a larger ESP and smaller /boot than the proposed ?
17:03:54 <bgilbert> wouldn't want to allocate 512 MB for kernels/initrds and 1 GB for nothing at all
17:03:58 <bgilbert> yeah
17:04:30 <jlebon> maybe we should revisit this after the change proposal has settled
17:04:31 <dustymabe> I feel like UKI is a bit of a ways off, but I agree we should probably try to get as much information now
17:04:45 <bgilbert> dustymabe: +1
17:04:45 <dustymabe> jlebon: the UKI change proposal?
17:05:21 <jlebon> dustymabe: the ESP size proposal. i think they're also talking about UKIs in that thread.
17:05:28 <bgilbert> wfm
17:05:36 <dustymabe> jlebon: ok
17:05:47 <dustymabe> for now we #info where we are and punt til next time?
17:05:55 <jlebon> SGTM
17:07:15 <dustymabe> #info For now the proposal is a 512M ESP and 1G /boot partition but we are following discussions about UKI to determine if that will change where (ESP or /boot) we put the large files (kernel+initramfs) in the future.
17:07:48 <dustymabe> ok so that can close off that piece of the discussion
17:08:08 <dustymabe> what about the challenges of implementing such a change? should we try to enumerate them?
17:09:47 <dustymabe> bgilbert: you mentioned /var earlier
17:09:59 <bgilbert> I think /var is the main one
17:10:21 <bgilbert> we do need to update the Butane templates, but strictly speaking that doesn't need to be synced with the OS image changes
17:11:15 <dustymabe> can we unpack that one? is this the "reprovision in place an existing system and try to persist data" problem?
17:11:15 <jlebon> bgilbert: because the Butane template throws out everything anyway?
17:11:17 <bgilbert> (probably it's better UX if the Butane changes happen before the image changes, but Butane isn't versioned with the OS, so not everyone will get the changes at once anyway)
17:11:27 <bgilbert> jlebon: right
17:11:48 <bgilbert> dustymabe: yup
17:12:09 <bgilbert> it's a supported use case, and (if people are actually reprovisioning to make config changes) possibly a common one
17:12:15 <dustymabe> so what are the steps involved there?
17:12:32 <bgilbert> my ~/TODO still has an entry to write up a bug for this
17:12:41 <dustymabe> ahh
17:12:44 <bgilbert> but AIUI we need to teach coreos-installer to:
17:12:51 <bgilbert> 1. detect that it's overwriting an existing CoreOS installation
17:13:37 <bgilbert> 2. compare the new image's partition table to the existing one
17:13:45 <bgilbert> 3. fail if we're about to clobber a non-OS partition
17:14:03 <bgilbert> that's the safety part.  but it leaves the user unable to reprovision
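The three steps bgilbert lists can be sketched roughly as follows. This is a Python illustration of the idea only: coreos-installer is not implemented this way, and the label set, the `Partition` type, the function names, and every partition size below are hypothetical.

```python
from dataclasses import dataclass

# Assumption: partition labels that belong to the OS itself.
OS_LABELS = frozenset({"BIOS-BOOT", "EFI-SYSTEM", "boot", "root"})


@dataclass(frozen=True)
class Partition:
    label: str
    start_mib: int  # first MiB of the partition (inclusive)
    end_mib: int    # one past the last MiB (exclusive)


def overlaps(a, b):
    """True if the two half-open ranges share at least one MiB."""
    return a.start_mib < b.end_mib and b.start_mib < a.end_mib


def is_existing_install(existing):
    """Step 1: treat the disk as an existing install if it has boot and root."""
    return {"boot", "root"} <= {p.label for p in existing}


def find_clobbered(existing, new_image):
    """Step 2: existing non-OS partitions that overlap the new image's partitions."""
    return [
        old for old in existing
        if old.label not in OS_LABELS
        and any(overlaps(old, new) for new in new_image)
    ]


def check_safe_to_install(existing, new_image):
    """Step 3: refuse to continue if a data partition would be overwritten."""
    if not is_existing_install(existing):
        return
    clobbered = find_clobbered(existing, new_image)
    if clobbered:
        names = ", ".join(p.label for p in clobbered)
        raise RuntimeError(f"refusing to install: would overwrite {names}")


# Hypothetical disk provisioned with the old layout plus a separate /var.
existing_disk = [
    Partition("BIOS-BOOT", 1, 2),
    Partition("EFI-SYSTEM", 2, 129),
    Partition("boot", 129, 513),
    Partition("root", 513, 3000),
    Partition("var", 3000, 20000),
]

# Hypothetical new image with a 512 MiB ESP and a 1024 MiB /boot.
new_image = [
    Partition("BIOS-BOOT", 1, 2),
    Partition("EFI-SYSTEM", 2, 514),
    Partition("boot", 514, 1538),
    Partition("root", 1538, 4025),
]

try:
    check_safe_to_install(existing_disk, new_image)
except RuntimeError as err:
    print(err)
```

In this made-up layout the larger ESP and /boot push the image's root partition past the start of the existing /var, so the check fails rather than overwriting it. As noted in the discussion, that protects the data but still leaves the user unable to reprovision in place.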
17:14:31 <dustymabe> bgilbert: at least unable to reprovision a newer version of the OS
17:14:39 <bgilbert> we don't support older versions of the OS
17:14:57 <bgilbert> we'll need to give lots of warning
17:15:18 <jlebon> if it's a non-shrinkable filesystem (i.e. it can't be shrunk so that it could be moved down), they're pretty much stuck. they'd have to e.g. copy out/copy in manually if the data is really valuable
17:15:39 <bgilbert> jlebon: are there prominent non-shrinkable ones?  we don't need online shrink, since it's a data partition
17:15:50 <jlebon> bgilbert: XFS :)
17:16:17 <bgilbert> 😱
17:16:36 <bgilbert> I'm not a huge fan of that
17:16:46 <dustymabe> :)
17:17:08 <bgilbert> well
17:17:11 <dustymabe> either way. I think the minimum we need to do is make sure that we don't clobber anyone's data (safe)
17:17:19 <jlebon> this has been brought up many times before, but AFAIK there are no plans to implement it
17:17:34 <bgilbert> I guess we have to tell ~everyone to copyout/copyin then
17:18:16 <dustymabe> bgilbert: OR they decide they want to use the old sizes of partitions?
17:18:33 <dustymabe> they can update their ignition configs to do that, right?
17:18:53 <bgilbert> no
17:19:07 <jlebon> note there could also be people that don't use coreos-installer but their own tooling to write the image
17:19:12 <bgilbert> resizing happens on first boot, which means they need to write a complete OS image
17:19:39 <dustymabe> ahh I see
17:19:43 <bgilbert> we could conceivably ship a new "small.metal" image, but that adds a different kind of complexity
17:19:52 <bgilbert> and small.4k.metal
17:20:11 <dustymabe> 👎
17:20:13 <bgilbert> jlebon: yeah, not much we can do about
17:20:17 <bgilbert> *about that
17:20:51 <jlebon> bgilbert: if we were to create new artifacts, a better way to make use of that is to deprecate the old one and create a new one
17:21:02 <dustymabe> ok so let me pull it back to the top level a bit
17:21:12 <jlebon> but agreed that'd be very undesirable
17:21:37 <bgilbert> jlebon: yup, fair
17:21:41 <dustymabe> I'm reading: A. update the butane templates B. update coreos-installer to be aware and not clobber data
17:22:00 <dustymabe> anything else?
17:22:13 <jlebon> messaging
17:22:20 <dustymabe> for people's existing Ignition configs, anything we need to do?
17:22:56 <jlebon> we'd need a really long (and loud) deprecation process
17:23:35 <jlebon> Ignition configs need to make sure to use size-based specifications rather than offset-based
17:23:39 <dustymabe> jlebon: a Fedora Change Proposal (Fedora 40??) would help us get the word out
17:24:31 <bgilbert> dustymabe: if the user reprovisions the rootfs, none of this matters.  if they don't, but just set a starting offset for /var which is too early, Ignition will properly fail
17:24:54 <bgilbert> actually, yeah, a Fedora Change proposal would be appropriate
17:25:09 <bgilbert> (and then we get to explain our deployment model to devel@, fun)
17:25:14 <jlebon> i like it too
17:25:43 <bgilbert> "dustymabe: anything else?" > I'm not 100% confident we're not missing something
17:26:28 <dustymabe> bgilbert: jlebon: so if I read this right we don't actually need to modify Ignition to detect anything. It will already fail. Now we just need to get the user to know what change to make to their configs.
17:26:42 <bgilbert> +1
17:26:48 <jlebon> agreed
17:27:10 <dustymabe> ok cool
17:27:18 <dustymabe> I'll try to summarize this and put it in the ticket
17:27:26 <dustymabe> anything else before moving to open floor?
17:27:54 <dustymabe> bgilbert: do you want an #action for opening that more detailed ticket? it's already in your TODO so not sure if that would be useful
17:28:06 <bgilbert> nah :-)
17:28:12 <dustymabe> #topic open floor
17:28:42 <dustymabe> #info I created the F39 changes ticket https://github.com/coreos/fedora-coreos-tracker/issues/1491
17:29:17 <dustymabe> maybe a few of us can meet before next meeting to do a pre-screening like we have in the past and then discuss each change next wednesday?
17:29:26 <jlebon> dustymabe: can't even give us the luxury of a few months without looking at change proposals again :)
17:29:36 <dustymabe> jlebon: it's a vicious cycle
17:29:40 <jlebon> WFM
17:30:46 <dustymabe> reminder if you want something to be discussed during the meeting, please add the `meeting` label to the ticket
17:30:55 <Nemric> Hi, I'm not really familiar with logical volumes and/or volume groups, but my company asked me about them for trying FCOS
17:31:06 <dustymabe> if you don't have permissions to do that just mention it in #fedora-coreos and we'll get it added
17:31:21 <bgilbert> Nemric: you can write scripts to set up LVM by hand, but Ignition doesn't support it
17:31:31 <bgilbert> intentionally
17:31:53 <Nemric> ok :/
17:32:24 <bgilbert> LVM makes less sense if you're not planning to make changes to the node after it's originally provisioned
17:32:36 <bgilbert> (and FCOS emphasizes "reprovision the node" as the way to make changes)
17:32:39 <Nemric> is there a short answer for "intentionally" ?
17:32:42 <bgilbert> ^
17:33:08 <Nemric> ;)
17:33:38 <dustymabe> i was thinking there was a ticket where we'd talked about this in the past
17:33:41 <jlebon> i think there's nuances there that might be worth teasing out, but we're already over :)
17:33:59 <bgilbert> https://github.com/coreos/ignition/issues/1289
17:34:08 <dustymabe> bgilbert: +1
17:34:34 <bgilbert> yeah, for data volumes it's arguable
17:34:40 <dustymabe> i'll close out the meeting in 30s
17:34:53 <jlebon> bgilbert: that issue doesn't really reflect your above stance it seems?
17:35:45 <bgilbert> jlebon: yeah, sorry, there's more nuance than I put here
17:36:08 <dustymabe> Nemric: you can at least subscribe to https://github.com/coreos/ignition/issues/1289 :)
17:36:22 <dustymabe> #endmeeting