#5 NVIDIA akmods doesn't build on kernel-xanmod-exptl
Closed 2 years ago by rmnscnce. Opened 2 years ago by tayler.

Both the automatic build and manually invoking sudo akmods fail to build the NVIDIA kernel module. Here's what I believe is the relevant part of the build log:

2021/07/04 13:37:52 akmodsbuild: make[2]: *** [scripts/Makefile.build:272: /tmp/akmodsbuild.L4FuK4wh/BUILD/nvidia-kmod-470.42.01/_kmod_build_5.13.0-xm1.1e20210630.fc34.x86_64/nvidia/nv.o] Error 1
2021/07/04 13:37:52 akmodsbuild: make[2]: *** Waiting for unfinished jobs....
2021/07/04 13:37:52 akmodsbuild: cc: error: unrecognized command-line option '-Qunused-arguments'
2021/07/04 13:37:52 akmodsbuild: make[2]: *** [scripts/Makefile.build:272: /tmp/akmodsbuild.L4FuK4wh/BUILD/nvidia-kmod-470.42.01/_kmod_build_5.13.0-xm1.1e20210630.fc34.x86_64/nvidia/nv-pci.o] Error 1
2021/07/04 13:37:52 akmodsbuild: cc: error: unrecognized command-line option '-Qunused-arguments'
2021/07/04 13:37:52 akmodsbuild: make[2]: *** [scripts/Makefile.build:272: /tmp/akmodsbuild.L4FuK4wh/BUILD/nvidia-kmod-470.42.01/_kmod_build_5.13.0-xm1.1e20210630.fc34.x86_64/nvidia/nv-acpi.o] Error 1
2021/07/04 13:37:52 akmodsbuild: cc: error: unrecognized command-line option '-Qunused-arguments'
2021/07/04 13:37:52 akmodsbuild: make[2]: *** [scripts/Makefile.build:272: /tmp/akmodsbuild.L4FuK4wh/BUILD/nvidia-kmod-470.42.01/_kmod_build_5.13.0-xm1.1e20210630.fc34.x86_64/nvidia/nv-cray.o] Error 1
2021/07/04 13:37:52 akmodsbuild: make[1]: *** [Makefile:1856: /tmp/akmodsbuild.L4FuK4wh/BUILD/nvidia-kmod-470.42.01/_kmod_build_5.13.0-xm1.1e20210630.fc34.x86_64] Error 2
2021/07/04 13:37:52 akmodsbuild: make[1]: Leaving directory '/usr/src/kernels/5.13.0-xm1.1e20210630.fc34.x86_64'
2021/07/04 13:37:52 akmodsbuild: make: *** [Makefile:80: modules] Error 2
2021/07/04 13:37:52 akmodsbuild: error: Bad exit status from /var/tmp/rpm-tmp.DnAqHf (%build)

Looks like the patched top-level Makefile hasn't been enforcing the LLVM toolchain properly.

A new build is coming; please confirm whether the issue persists after updating.

Metadata Update from @rmnscnce:
- Issue assigned to rmnscnce

2 years ago

The new build still fails to build the module; attaching the log:
470.42.01-1-for-5.13.0-xm2.0e20210705.fc34.x86_64.failed.log

It looks like you don't have the LLVM toolchain installed. Please install it (clang, lld, and llvm) first, retry the akmods build, and then confirm whether the issue persists.

All of those packages were correctly installed as dependencies of the -devel package. I tried again just now but it still failed to build the akmod.

Weird. It correctly uses the LLVM toolchain for me, but I got hit by #6 instead, which is a much more serious issue.

I think I also ran into that after passing some env variables to the akmods command: I got some permission denied errors for ld.lld.

Could you please try the latest build of exptl (5.13.1-xm1.0e20210708) which uses full LTO instead of ThinLTO? It seems to fix the akmods building issue

I've installed the newest build. The akmod still would not build automatically or when invoking sudo akmods, as the LLVM build environment still doesn't seem to be set up properly, but I did build it successfully with sudo HOSTCC=clang CC=clang akmods
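
For anyone who wants to reuse this workaround, it can be wrapped in a small helper. This is only a sketch: with_clang is a made-up name, and it assumes clang is on the PATH of whatever shell ends up running the build.

```shell
# with_clang CMD [ARGS...]
# Run CMD with HOSTCC and CC set to clang, the two variables used above.
# (Hypothetical helper name; not part of akmods or kernel-xanmod-exptl.)
with_clang() {
    env HOSTCC=clang CC=clang "$@"
}
```

Defined in a root shell, `with_clang akmods` should behave like the sudo command above.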

That's really weird. The top-level Makefile provided by kernel-xanmod-exptl-devel should make the build system exclusively use the LLVM toolchain.

Could you please send the output of this command:

grep CC /usr/src/kernels/5.13.1-xm1.0e20210708.fc34.x86_64/Makefile | head -n 7

Example output:

#         cmd_cc_o_c       = $(CC) $(c_flags) -c -o $@ $<
HOSTCC  = clang
HOSTCC  = clang
# Make variables (CC, etc...)
CPP     = $(CC) -E
CC      = clang
CC      = clang

That's exactly what I see as well:

#         cmd_cc_o_c       = $(CC) $(c_flags) -c -o $@ $<
HOSTCC  = clang
HOSTCC  = clang
# Make variables (CC, etc...)
CPP     = $(CC) -E
CC      = clang
CC      = clang

I'm trying to think of any customizations on my system that might be relevant, but it's a pretty fresh installation. I use zsh as my session shell but I don't think that would affect anything.

any customizations on my system that might be relevant

Pretty much what I'm thinking too. Looks like you have something that "hardcodes" CC=gcc or CC=ccache gcc in your environment variables. You may try opening a shell and looking at the output of the command env
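
A quick way to check for such overrides is sketched below. The helper name is made up for this example, and the variable list is an assumption (CC and HOSTCC plus the usual LLVM-related kernel build variables); adjust it as needed.

```shell
# Print any toolchain-related variables set in the current environment.
# check_toolchain_env is a hypothetical helper; the variable list is an assumption.
check_toolchain_env() {
    env | grep -E '^(CC|HOSTCC|LD|LLVM|LLVM_IAS)=' \
        || echo "no toolchain overrides set"
}
```

Since akmods is run through sudo, it may also be worth comparing against the output of `sudo env`, because sudo can hand a different environment to the command it starts.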

There's nothing in my login environment referencing either HOSTCC or CC... Is your patched Makefile supposed to set those variables for the entire desktop session? Or is it specific to building the akmods, and/or only when you're booted into the exptl kernel?

There's still nothing in the env referencing those variables if I start up bash instead of zsh to bypass anything in my .zshrc that might be interfering.

As an additional data point, I just did a fresh installation on a VM, and the user environment there does not have CC or HOSTCC set either.

The Makefile patch is there to force make to use the LLVM toolchain. I'll spin up a Fedora 34 VM to test this behavior and see what happens

So, I've spun up a Fedora 34 LXDE Spin VM on VirtualBox, and the akmods build system worked well with the patched Makefile from kernel-xanmod-exptl-devel.

Screenshot-20210712193557-744x596.png

Hmm, I just tested in that fresh f34 VM and the nvidia akmods did not build.

Screenshot_from_2021-07-12_10-48-21.png

Same errors in the log as my first messages here where it's not using the LLVM toolchain. I also confirmed that Makefile exists and has the right contents with grep CC /usr/src/kernels/5.13.1-xm1.0e20210708.fc34.x86_64/Makefile | head -n 7

This is a stab in the dark, but one salient difference in our screenshots is that my VM is using the default GNOME session. Not sure how that would affect anything, but this VM is otherwise completely vanilla. Have you tried building the NVIDIA kmod?

I'll try now. Will update when the results are out

Alright, update:
Screenshot-20210713002130-744x596.png

After some digging, I can say it's because of NVIDIA forcing CC through its own top-level Makefile and Kbuild in the module source package (xorg-x11-nvidia-drv-kmodsrc). Unfortunately, this issue cannot be (legally) solved even by RPMFusion. NVIDIA has to solve this issue themselves. You will have to use CC=clang HOSTCC=clang LD=ld.lld LLVM=1 LLVM_IAS=1 for the time being to use NVIDIA on systems running Linux with LLVM+LTO.
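
The general mechanism can be reproduced outside the NVIDIA sources. A minimal sketch, assuming GNU make is installed; the Makefile below is a throwaway stand-in written for this example, not the kernel's or NVIDIA's. It shows that a CC pinned inside a Makefile survives a CC set in the environment, but not a CC passed on the make command line, which is presumably the kind of override described above.

```shell
# Demonstrate GNU make variable precedence for CC.
# demo_cc_precedence and its Makefile are hypothetical stand-ins.
demo_cc_precedence() {
    demo_dir=$(mktemp -d) || return 1
    # A Makefile that pins CC, loosely like the patched top-level Makefile.
    printf 'CC = clang\nshow:\n\t@echo $(CC)\n' > "$demo_dir/Makefile"
    # 1. No override: the Makefile's assignment is used.
    echo "plain:        $(make -s --no-print-directory -C "$demo_dir" show)"
    # 2. Environment override: the Makefile's assignment still wins.
    echo "environment:  $(CC=cc make -s --no-print-directory -C "$demo_dir" show)"
    # 3. Command-line override: this beats the Makefile's assignment.
    echo "command line: $(make -s --no-print-directory -C "$demo_dir" show CC=cc)"
    rm -rf "$demo_dir"
}
```

Running `demo_cc_precedence` should report clang for the first two cases and cc for the third.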

I suggest reporting this to NVIDIA in their forums and waiting for any updates from them. Linux officially supports the LLVM toolchain -- it's not random patchwork from people on the internet, so it makes sense for NVIDIA to support it for the kernel modules, too.

Also, I will close this issue and put a notice on the Copr page in the installation instructions for users. Thank you for filing this ticket.

rmnscnce

Ticket status:

OPEN → CLOSED NOTABUG

Metadata Update from @rmnscnce:
- Issue status updated to: Closed (was: Open)

2 years ago

Metadata Update from @rmnscnce:
- Assignee reset

2 years ago

Metadata Update from @rmnscnce:
- Issue tagged with: notabug

2 years ago