<@tflink:fedora.im>
17:30:36
!startmeeting fedora-ai-ml-sig
<@meetbot:fedora.im>
17:30:37
Meeting started at 2024-11-07 17:30:36 UTC
<@meetbot:fedora.im>
17:30:37
The Meeting name is 'fedora-ai-ml-sig'
<@trix:fedora.im>
17:32:00
quick someone else add something to meeting agenda before i start talking!
<@man2dev:fedora.im>
17:32:41
!hi
<@tflink:fedora.im>
17:32:42
yeah, I noticed that they were both from you :)
<@zodbot:fedora.im>
17:33:04
Mohammadreza Hendiani (man2dev)
<@tflink:fedora.im>
17:33:12
that being said, shall we get started?
<@tflink:fedora.im>
17:33:22
!topic ROCm APU
<@trix:fedora.im>
17:34:00
APUs are looking good.
<@tflink:fedora.im>
17:34:19
which ones? gfx1103?
<@trix:fedora.im>
17:34:29
I believe we are on track for a Fedora feature saying 'yeah pytorch on laptops'
<@trix:fedora.im>
17:34:41
1035 - M680
<@trix:fedora.im>
17:34:49
1103 - M780
<@trix:fedora.im>
17:35:01
1151 - M880 ??
<@trix:fedora.im>
17:35:14
i don't have the last one yet, it builds..
<@trix:fedora.im>
17:36:06
it's this one https://www.amd.com/en/products/processors/laptop/ryzen/300-series/amd-ryzen-ai-9-hx-370.html
<@trix:fedora.im>
17:37:17
i am also trying to get ollama spun up so people with laptops can get LLMs to do their jobs.
<@trix:fedora.im>
17:37:21
or whatever.
<@trix:fedora.im>
17:37:30
so fun stuff .
<@man2dev:fedora.im>
17:37:43
Oh btw
<@tflink:fedora.im>
17:37:48
cool, thanks for the update
<@man2dev:fedora.im>
17:38:41
Vulkan is beyond unstable on Nvidia since it relies on Vulkan layers
<@tflink:fedora.im>
17:39:38
!info progress has been made on enabling more mobile AMD GPUs to work with the pytorch stack - everything appears to be working
<@trix:fedora.im>
17:39:39
i am only working on ROCm things because that's what i get paid for, but if someone wants vulkan, we can do that
<@man2dev:fedora.im>
17:40:47
And for whatever reason I'm getting way more crashes, and I believe it's a cgroup issue because Nvidia bypasses cgroup protection and every other kind of protection and sets the limits themselves
<@man2dev:fedora.im>
17:41:36
Which can easily cause an overflow and a crash or freeze of the system
<@trix:fedora.im>
17:42:00
no surprise, i have no nvidia hw to help out with that problem 😊
<@man2dev:fedora.im>
17:42:20
Especially with LLMs that use a lot of memory.
<@trix:fedora.im>
17:42:44
if you want to use vulkan, try it on amd, i can help there.
<@tflink:fedora.im>
17:43:17
anyhow, anything else on AMD APUs?
<@trix:fedora.im>
17:43:27
nope. it's looking great!
<@tflink:fedora.im>
17:44:08
ok, moving on to ...
<@tflink:fedora.im>
17:44:11
!topic ROCm bundled llvm
<@tflink:fedora.im>
17:44:30
ah, this topic is coming out of the fun of F40 and F41
<@trix:fedora.im>
17:44:36
yes.
<@tflink:fedora.im>
17:44:57
!info late releases of llvm have caused problems for ROCm in both F40 and F41
<@tflink:fedora.im>
17:45:12
have you been able to get ROCm working in F41 now?
<@trix:fedora.im>
17:45:16
i am trying to mitigate the risk of another very late llvm drop that breaks F42
<@trix:fedora.im>
17:46:25
F41, i am not sure. Jeremy Newton had some low parts of the stack for 6.2.1 that i +1'd with you today.
<@trix:fedora.im>
17:46:39
so maybe they will be dribbling in soon.
<@tflink:fedora.im>
17:46:41
oh, that update is still sitting in testing?
<@trix:fedora.im>
17:46:46
yes.
<@tflink:fedora.im>
17:47:30
we should poke jeremy about that outside the meeting, he's the only one who can move that forward
<@trix:fedora.im>
17:47:35
lld fix going in at the last day before the freeze screwed rocm over.
<@trix:fedora.im>
17:48:16
apu == happy tom, llvm == mad tom.
<@tflink:fedora.im>
17:48:32
have you brought the topic up with FPC or is this more of a "we'll be prepared if stuff is broken at the last minute again" kind of thing
<@man2dev:fedora.im>
17:48:42
😂
<@trix:fedora.im>
17:49:12
fpc ?
<@tflink:fedora.im>
17:49:20
fedora packaging committee
<@mystro256:fedora.im>
17:49:37
!hi
<@zodbot:fedora.im>
17:49:37
None (mystro256)
<@tflink:fedora.im>
17:49:39
I suspect that bundling llvm will need a waiver
<@mystro256:fedora.im>
17:49:57
Is that needed anymore?
<@mystro256:fedora.im>
17:50:03
I thought policy changed
<@tflink:fedora.im>
17:50:20
note - the bundling is not turned on right now but it can be enabled in the spec files
<@man2dev:fedora.im>
17:50:24
Tom, we do have the infra set up to work with upstream, I believe. For example, what if the bundle and build were based off whatever upstream is using at the time?
<@tflink:fedora.im>
17:50:43
there have been changes around bundling but I don't remember the details. I don't think that all bundling was allowed, though
<@man2dev:fedora.im>
17:50:53
But I don't know if this would follow the packaging guidelines
<@trix:fedora.im>
17:51:17
a problem Fedora and all the distros have is clang forks.
<@tflink:fedora.im>
17:51:33
yeah, there are "just a few" of those
<@tflink:fedora.im>
17:51:48
and by "just a few" I mean a ton
<@mystro256:fedora.im>
17:52:00
https://docs.fedoraproject.org/en-US/packaging-guidelines/#bundling
<@mystro256:fedora.im>
17:52:10
"Fedora packages SHOULD make every effort to avoid having multiple..."
<@trix:fedora.im>
17:52:19
When i asked about triton, i was told bundling was fine, and no, toolchains would not handle it.
<@mystro256:fedora.im>
17:52:21
not a MUST
<@tflink:fedora.im>
17:52:38
"All packages whose upstreams allow them to be built against system libraries MUST be built against system libraries"
<@tflink:fedora.im>
17:53:07
but there's an argument in this case that we can't build against system llvm
<@mystro256:fedora.im>
17:53:16
You MUST set Provides: bundled(llvm) = 18.0.0 (and same for clang, lld, compiler-rt etc)
<@trix:fedora.im>
17:53:42
things like triton are built on a snapshot and plain don't work with system clang.
<@mystro256:fedora.im>
17:53:57
Yeah, the problem is that upstream doesn't intend for system linking, but it can be done
<@mystro256:fedora.im>
17:54:11
so they don't "allow" it in an official way
<@mystro256:fedora.im>
17:54:26
but it's trivial to allow
<@mystro256:fedora.im>
17:54:56
rocm-llvm is a light fork
<@mystro256:fedora.im>
17:55:04
of llvm 18 (for ROCm 6.2)
<@tflink:fedora.im>
17:55:10
so forking of llvm is discouraged but using it for anything outside the upstream project is only not disallowed?
<@mystro256:fedora.im>
17:55:42
Maybe we should ask Fesco?
<@man2dev:fedora.im>
17:56:03
I believe there are certain exceptions that can be set. For example, I think it was syncthing that has an exemption and is bundling basically a bunch of Go language libraries inside of it?
<@tflink:fedora.im>
17:56:14
yeah, having a conversation with FESCo about this might be wise - either stop the last minute drops with late bugfixes or let us bundle
<@mystro256:fedora.im>
17:56:44
yeah honestly they need to come down hard on LLVM
<@mystro256:fedora.im>
17:56:51
or we need to bundle
<@trix:fedora.im>
17:57:35
i don't think we really have a choice, there are other clang forks, llvm is just a crappy project for allowing forks
<@mystro256:fedora.im>
17:58:06
Approaching FESCo will allow us to codify it then
<@tflink:fedora.im>
17:58:17
I can put together a ticket for FESCo unless someone else wants to do it
<@mystro256:fedora.im>
17:58:20
having it in writing that llvm is a forktastic project
<@mystro256:fedora.im>
17:58:28
Please
<@tflink:fedora.im>
17:58:57
!action tflink to write up FESCo ticket about the LLVM late landing problem
<@tflink:fedora.im>
18:00:40
anything else on this topic?
<@trix:fedora.im>
18:00:46
tflink: could you also include the f40 blender problem in that ?
<@trix:fedora.im>
18:01:12
F40 is going to be eol-ed before that thing is fixed.
<@mystro256:fedora.im>
18:01:26
Well the blender issue could be easily resolved if they allowed static linking
<@mystro256:fedora.im>
18:01:32
not sure if the symbol change fixed it
<@trix:fedora.im>
18:02:06
llvm guys are not really testing their dependent packages.
<@tflink:fedora.im>
18:02:10
I'll talk to you about the blender problem, not sure I understand that one well enough to make a coherent ticket about it
<@tflink:fedora.im>
18:02:42
to be fair, they really don't have the bandwidth to do all that testing. it's not an excuse to toss the hand grenade and walk away, though. IMHO
<@man2dev:fedora.im>
18:03:09
I think I'm getting why the FFmpeg people really love optimizing their code base by just writing it in assembly.
<@trix:fedora.im>
18:03:35
no one has the time to test. but late drops invalidate all the testing i did as i rolled out F40 and F41.
<@trix:fedora.im>
18:04:09
i really only test blender once or twice in a cycle, it's a pain to set up.
<@trix:fedora.im>
18:04:39
but it is part of my normal build test.
<@tflink:fedora.im>
18:05:03
anything else on this topic?
<@trix:fedora.im>
18:05:18
sorry mad tom needs a smoke break ..
<@tflink:fedora.im>
18:06:35
no worries, it got rather crazy and it's a shame that ROCm wasn't quite working in the F41 release
<@man2dev:fedora.im>
18:06:58
Tom, if the blender people, or any of the packages have a build script already made inside the repo I recall seeing something about having support in the newer RPM specs.
<@man2dev:fedora.im>
18:07:12
Same story for tests
<@mystro256:fedora.im>
18:08:00
A suggestion
<@mystro256:fedora.im>
18:08:10
we should have Fedora model llvm after debian
<@mystro256:fedora.im>
18:08:19
where all llvm packages are versioned
<@tflink:fedora.im>
18:08:27
what does debian do with llvm? I'm not familiar with that
<@mystro256:fedora.im>
18:08:30
llvm is a metapackage instead of an actual package
<@trix:fedora.im>
18:08:40
you mean like suse / tumbleweed ?
<@mystro256:fedora.im>
18:08:54
right now we have llvm and llvm18. I propose we add llvm19, and llvm is a metapackage requiring latest
<@mystro256:fedora.im>
18:09:01
Exactly
<@trix:fedora.im>
18:09:04
not that i have been looking at suse, i love you guys, really i do.
<@mystro256:fedora.im>
18:09:24
no need to last minute change anything, new llvm's are an addition of a new package instead of an update
<@trix:fedora.im>
18:09:50
yes, that would be better.
<@mystro256:fedora.im>
18:09:51
it would save us a lot of grief
<@tflink:fedora.im>
18:09:53
I think that topic came up in a bz thread or on devel@, didn't it?
<@mystro256:fedora.im>
18:10:13
I think we need to include this in the FESCo "demands" :)
<@tflink:fedora.im>
18:10:24
if I'm remembering correctly, there was resistance to doing that in Fedora but I don't remember the details
<@trix:fedora.im>
18:10:36
fwiw, suse has no lld-devel, so we need rocm-llvm there to get basic comgr going.
<@trix:fedora.im>
18:11:28
i would like rocm-llvm to be general enough that a fedora-like distro could use it as is.
<@tflink:fedora.im>
18:11:55
I really hate my mail provider's web interface. I'll see if I can find the thread on devel@ (or wherever that was) after the meeting
<@mystro256:fedora.im>
18:12:00
sure, you can keep the logic in the spec file even if fedora doesn't use it
<@trix:fedora.im>
18:12:43
it's working on tumbleweed now, you can see a few things need to be handled. but not much.
<@trix:fedora.im>
18:13:11
this is similar to getting fedora things working on rhel.
<@mystro256:fedora.im>
18:13:35
I'm assuming RHEL is probably fine with bundling LLVM
<@trix:fedora.im>
18:14:02
yes. i think not asking for 1/2 engineer to do the work would be a win for them.
<@tflink:fedora.im>
18:15:33
!info late llvm releases for Fedora continue to cause problems for ROCm, there are proposals as to how to deal with this problem but for now, the plan is to submit a ticket to FESCo and go from there once that conversation has happened
<@tflink:fedora.im>
18:15:45
anything else?
<@tflink:fedora.im>
18:15:51
on this topic
<@trix:fedora.im>
18:16:14
it is mostly working, i am up to the hip libs
<@trix:fedora.im>
18:16:36
and plan on having it functionally working for 6.3
<@tflink:fedora.im>
18:16:47
it?
<@trix:fedora.im>
18:16:49
as i think 6.3 will be the cutoff in F42
<@trix:fedora.im>
18:16:55
bundled llvm.
<@tflink:fedora.im>
18:17:17
ah. it's still disabled by default, though. right?
<@trix:fedora.im>
18:17:23
yes.
<@tflink:fedora.im>
18:18:10
ok, moving on to ...
<@tflink:fedora.im>
18:18:14
!topic open floor
<@tflink:fedora.im>
18:18:20
any other topics that folks wanted to bring up?
<@trix:fedora.im>
18:19:28
question in ai/ml.
<@trix:fedora.im>
18:19:49
heavy builders for fedora, anything you can speak of ?
<@man2dev:fedora.im>
18:20:12
I don't think I understand the question?
<@tflink:fedora.im>
18:20:19
nothing at the moment, I don't think that the budget for 25 has been fully decided yet
<@trix:fedora.im>
18:21:30
pytorch takes a long time to build, in a past life, i asked for hw to alleviate that .. hw == heavy builders, something much better than our basic builders
<@tflink:fedora.im>
18:22:08
there are heavy builders in koji but those are going EOL in 25 AFAIK. the proposal was to get some machines to replace some of the heavy builders that are going away
<@trix:fedora.im>
18:23:26
if we had a proposal for testing infra, it would be something i could shop around at amd for support.
<@man2dev:fedora.im>
18:23:45
https://dvprogram.state.gov
<@trix:fedora.im>
18:23:53
atm saying a bunch of machines on / under my desk doesn't cut it.
<@tflink:fedora.im>
18:24:03
I proposed it but I don't know what's happening with it after everyone moved around organizationally
<@tflink:fedora.im>
18:24:28
would it help to have machines in my basement?
<@tflink:fedora.im>
18:24:47
:-D
<@man2dev:fedora.im>
18:24:52
Universal-blue.org
<@trix:fedora.im>
18:25:07
if we could hook them up so others could use them or report the results publicly that would be good.
<@tflink:fedora.im>
18:25:27
working on it but progress is slow now that it's only in my spare time
<@trix:fedora.im>
18:25:28
i don't have the bw to do that for my own machines.
<@tflink:fedora.im>
18:26:01
Mohammadreza Hendiani: I don't understand what you're getting at with the link
<@man2dev:fedora.im>
18:27:26
A bunch of DevOps people run this project and they do fairly big builds because they build Fedora from the ground up. I don't remember what their build system uses but I know they have a lot of infrastructure set up for big builds
<@tflink:fedora.im>
18:28:08
I've talked with them a bit in the past but never really got into the details of their buildsystem
<@trix:fedora.im>
18:29:24
i try not to get involved in other people's buildsystems, just dealing with fedora's is enough, everyone solves the same problems but in different 'best' ways.
<@tflink:fedora.im>
18:30:04
anyhow, we're pretty much out of time unless there are any last minute topics
<@trix:fedora.im>
18:30:28
good meeting guys!
<@tflink:fedora.im>
18:30:35
thanks for coming, everyone
<@tflink:fedora.im>
18:30:44
!endmeeting