Problems adding nvidia proprietary drivers to initrd file

Spork_Schivago · January 31, 2018, 1:55am

Hello,

I’m trying to add the nVidia proprietary drivers and a modprobe.conf file to the initrd file.

I run dracut -f to create the initrd file. It fails.

These are the two files that I’ve created:


#/etc/dracut.conf.d/nvidia.conf

add_drivers+="nvidia nvidia_modeset nvidia_uvm nvidia_drm"
install_items+="/etc/modprobe.d/nvidia.conf"


#/etc/modprobe.d/nvidia.conf

options nvidia_drm modeset=1

The lines starting with the #'s aren’t actually in the file, that’s just the filenames. And there’s no new blank lines at the top of the files, I just added that so it’s a bit easier to read.

Anyway, I create these two files. Then I execute dracut -f and this is the output:


eugene:/mnt/initrd # dracut -f
dracut: Executing: /usr/bin/dracut -f
dracut: *** Including module: bash ***
dracut: *** Including module: systemd ***
dracut: *** Including module: warpclock ***
dracut: *** Including module: systemd-initrd ***
dracut: *** Including module: i18n ***
dracut: *** Including module: drm ***
dracut: *** Including module: plymouth ***
dracut: *** Including module: kernel-modules ***
dracut: *** Including module: resume ***
dracut: *** Including module: rootfs-block ***
dracut: *** Including module: suse-xfs ***
dracut: *** Including module: terminfo ***
dracut: *** Including module: udev-rules ***
dracut: Skipping udev rule: 40-redhat.rules
dracut: Skipping udev rule: 50-firmware.rules
dracut: Skipping udev rule: 50-udev.rules
dracut: Skipping udev rule: 91-permissions.rules
dracut: Skipping udev rule: 80-drivers-modprobe.rules
dracut: *** Including module: dracut-systemd ***
dracut: *** Including module: haveged ***
dracut: *** Including module: usrmount ***
dracut: *** Including module: base ***
dracut: *** Including module: fs-lib ***
dracut: *** Including module: shutdown ***
dracut: *** Including module: suse ***
dracut: *** Including modules done ***
dracut: *** Installing kernel module dependencies and firmware ***
dracut: *** Installing kernel module dependencies and firmware done ***
**dracut-install: ERROR: installing '/usr/lib/udev/rules.d/11-dm-parts.rules/etc/modprobe.d/nvidia.conf'**
**dracut: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.3ipW9s/initramfs -a /usr/lib/udev/rules.d/11-dm-parts.rules/etc/modprobe.d/nvidia.conf**
dracut: *** Resolving executable dependencies ***
dracut: *** Resolving executable dependencies done***
dracut: *** Hardlinking files ***
dracut: *** Hardlinking files done ***
dracut: *** Stripping files ***
dracut: *** Stripping files done ***
dracut: *** Generating early-microcode cpio image ***
dracut: *** Constructing GenuineIntel.bin ****
dracut: *** Store current command line parameters ***
dracut: Stored kernel commandline:
dracut:  resume=UUID=ac985b5e-fb17-442f-b6ac-24c1c53e5bff
dracut:  root=UUID=3c6e7faf-093a-49df-83db-ca247620f093 rootfstype=ext4 rootflags=rw,relatime,data=ordered
dracut: *** Creating image file '/boot/initrd-4.14.15-1-default' ***
dracut: *** Creating initramfs image file '/boot/initrd-4.14.15-1-default' done ***

Not sure what I’m doing wrong here. For some reason, it seems like it’s looking for some /usr/lib/udev/rules.d/11-dm-parts.rules/etc/modprobe.d/nvidia.conf file.
/usr/lib/udev/rules.d/11-dm-parts.rules is a file that exists, but it appears there’s no space between that and the /etc/modprobe.d/nvidia.conf file I try adding to my image.
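
One plausible explanation for the fused path, offered as a sketch and not a confirmed diagnosis: dracut configuration fragments are sourced as shell code, and `+=` on a plain string variable appends with no separator. The snippet below simulates that in bash. The starting value of `install_items` is an assumption chosen to match the error message above; it was not read from a real system.

```shell
#!/usr/bin/env bash
# Simulate two config fragments appending to the same string variable.
# Assumption: an earlier fragment left install_items ending in this path.
install_items="/usr/lib/udev/rules.d/11-dm-parts.rules"

# Without a leading space the new path is glued onto the previous entry.
broken="$install_items"
broken+="/etc/modprobe.d/nvidia.conf"

# With a leading and trailing space the entries stay separate.
fixed="$install_items"
fixed+=" /etc/modprobe.d/nvidia.conf "

echo "broken: $broken"
echo "fixed:  $fixed"
```

The `broken` value is exactly the path dracut-install complained about, which is why padding the value with spaces avoids the error.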

If I add a space before /etc/modprobe.d/nvidia.conf in the install_items+= area, like this:


add_drivers+="nvidia nvidia_modeset nvidia_uvm nvidia_drm"
install_items+=" /etc/modprobe.d/nvidia.conf"

I don’t actually see where dracut adds my files. I’ve tried extracting the contents of the ASCII CPIO file, but there are only two files inside and nothing else (this is when I add the space and the initramfs file is created successfully).


early_cpio  
kernel
  kernel->x86
    kernel->x86->microcode
      kernel->x86->microcode->GenuineIntel.bin

Neither of the files is big: 8KB for the GenuineIntel.bin file, and early_cpio was 2 bytes.

Any ideas what I’m doing wrong here? I’m sure there’s more to the initrd file that I’m attempting to extract, but I couldn’t find any other ways to extract the files except for using:


cpio -id < /boot/initrd-4.14.15-1-default

I’m running the latest Linux proprietary drivers from nVidia, I’m also running kernel 4.14.15-1-default. I did a zypper dup the other night because I haven’t used this box in a while.

Any help would be greatly appreciated.

Thank you.

Spork_Schivago · January 31, 2018, 2:15am

I’d like to add that I also tried using dracut’s skipcpio and piping the output to cpio. I only just found out about this command; so far, it just errors out and complains about a premature end of file when I run it like this:


/usr/lib/dracut/skipcpio initrd-4.14.15-1-default | cpio -i -d -H newc --no-absolute-filenames

If I remove the -H newc, I get lots of garbage. I’ll read the man page on it and see what I can find out.

Thanks.

Spork_Schivago · January 31, 2018, 2:37am

I found a way to extract, using the following command:


/usr/lib/dracut/skipcpio initrd-4.14.15-1-default | unxz | cpio -ivd

Looks like the files are there, at least when I add the spaces in the install_items+=" <filename> " line.

I also fixed my modprobe nvidia.conf file so it has options for nvidia-drm, instead of nvidia_drm

The man page for dracut.conf shows spaces before and after the filename for install_items+=. I just didn’t think they were being added. I think we’re good now and I figured it out myself. If anything looks wrong, please feel free to let me know.

xorbe · January 31, 2018, 3:05am

What do you hope to achieve by doing this? I’m just curious. I don’t remember doing anything like this when I installed my nvidia driver manually the first time. I install the driver, and run mkinitrd (no idea if that’s actually needed though).

arvidjaar · January 31, 2018, 4:57am

Use /usr/lib/dracut/skipcpio to skip early cpio archive. Or lsinitrd to see content (file list) of initrd.

Spork_Schivago · January 31, 2018, 12:58pm

Loading the nVidia drivers from the initramfs image. It worked, but I had to include a blacklist_nouveau.conf modprobe configuration file.

Now I can set high resolutions from grub that I couldn’t set before and have a nice resolution during boot. Before, the nouveau kernel module was being loaded and used in the initramfs image. Now, the nVidia module is. That was my goal: to fully use the nvidia driver, during boot and after boot, which I have now accomplished.
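
The thread does not show the blacklist file itself, so the following is only a sketch of what such a setup might look like. The file contents are assumed (standard modprobe.d syntax), and `STAGE` is a hypothetical scratch directory so that nothing under /etc is touched until the files have been reviewed.

```shell
#!/bin/sh
# Write the files into a scratch directory first so they can be reviewed
# before being copied to /etc (STAGE is a hypothetical name).
STAGE="${STAGE:-$(mktemp -d)}"
mkdir -p "$STAGE/etc/modprobe.d" "$STAGE/etc/dracut.conf.d"

# Assumed contents: the thread does not show the actual file.
cat > "$STAGE/etc/modprobe.d/blacklist_nouveau.conf" <<'EOF'
blacklist nouveau
options nouveau modeset=0
EOF

# Ask dracut to copy the blacklist into the image; note the padding spaces.
cat > "$STAGE/etc/dracut.conf.d/blacklist_nouveau.conf" <<'EOF'
install_items+=" /etc/modprobe.d/blacklist_nouveau.conf "
EOF

echo "staged files under $STAGE:"
find "$STAGE" -type f | sort
```

After copying the staged files into place, the image would still need to be rebuilt with dracut -f, as earlier in the thread.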

From this site,


With the recent 364.x releases of the NVIDIA binary drivers, KMS (Kernel-Mode-Setting) support for Linux was added.

A couple things on that page are wrong, and I couldn’t find a place to leave comments to inform the gentleman who wrote it. But it got me pointed in the right direction. Bootup looks a lot nicer now than it did before.

My next step is to try and change the boot-up theme. I used plymouth-set-default-theme but it only shows during shutdown.

I’ll research more on how to change the boot-up theme.

Spork_Schivago · January 31, 2018, 1:00pm

Thank you. I had figured it out though. I also discovered the lsinitrd command but didn’t mention it.

At first, I had trouble with the skipcpio command. I was piping it to cpio and getting just garbage. So instead, I ran it, redirected the output to a file, ran file on that, discovered it was xz compressed, then piped it to unxz and then cpio, and I was able to extract the files.
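
The detection step described above (redirect to a file, inspect it, pick a decompressor) can be sketched as a small helper. This is a minimal sketch under assumptions: `detect_compressor` is a hypothetical name, and it checks only a few well-known magic numbers instead of calling file.

```shell
#!/bin/sh
# detect_compressor FILE: guess the format of FILE from its first bytes.
detect_compressor() {
    # First six bytes as one lowercase hex string, e.g. fd377a585a00.
    magic=$(head -c 6 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        fd377a585a00*) echo xz ;;       # xz stream header
        1f8b*)         echo gzip ;;     # gzip header
        28b52ffd*)     echo zstd ;;     # zstd frame header
        303730373031*) echo cpio ;;     # "070701", uncompressed newc cpio
        *)             echo unknown ;;
    esac
}

# Demo on a stand-in file that carries only the xz magic bytes.
sample=$(mktemp)
printf '\375\067\172\130\132\000' > "$sample"
detect_compressor "$sample"
rm -f "$sample"
```

With the real image, the output of /usr/lib/dracut/skipcpio would first be redirected to a file and that file passed to the function; for an xz result, the extraction is then the unxz | cpio -ivd pipeline used earlier in the thread.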

wolfi323 · January 31, 2018, 1:16pm

Well, I suppose the nouveau module was not loaded, and that probably was the reason why you couldn’t use a higher resolution (nouveau should support that).

AFAIK, a blacklist (that you need anyway to use the nvidia driver) should be added to the initrd automatically (at least if you use mkinitrd to create it, which is normally used by “the system”), and that will force a generic framebuffer to be used (unless the nvidia driver is in the initrd of course).

I think if nouveau would have been loaded, it would have also prevented the nvidia driver from working.

But anyway…

My next step is to try and change the boot-up theme. I used plymouth-set-default-theme but it only shows during shutdown.

Did you use the “-R” (--rebuild-initrd) option?
Only this will add the theme’s files to the initrd, which is necessary for them to be accessible during boot (plymouth is started very early, from the initrd).

Spork_Schivago · January 31, 2018, 2:34pm

nouveau was loaded, but it doesn’t work quite right with my card for some reason. It’s a bit of an older card, the nVidia GTX 670 FTW.

Yes, the nouveau was preventing the nVidia module from loading.

I did add the -R option to plymouth. Plymouth calls dracut, just like I call, to build the initrd file. Are you sure I should be using mkinitrd, instead of dracut? I was under the impression that dracut is what I’m supposed to be using. And plymouth-set-default-theme solar -R does in fact call dracut. I was looking at the source for it, but I checked anyways.

I noticed a few things since I can now boot into Linux with the nVidia drivers loaded. lsmod | grep nvidia shows that one module’s name is nvidia_drm. In my /etc/modprobe.d/nvidia.conf file, I have just the following line:


options nvidia-drm modeset=1

I’m thinking I need to change that nvidia-drm to nvidia_drm. Then perhaps my plymouth theme will load. From reading about Tumbleweed and plymouth though, it seems other people have had this issue in the past, with just Tumbleweed. I haven’t seen anything that says it’s fixed though. When the system boots up, I see a green screen with three little bars. The first bar is lit up. Then further along the boot process, the second bar lights up and the first turns off. Then finally the third lights up, then it’s all replaced with some text-mode [OK] messages, and finally I get to the gdm login screen.

Finally, nvidia_uvm doesn’t appear to get loaded unless it’s needed, according to the nVidia documentation and some googling. I would like it to load automatically though. From what I understand, I’m going to have to create some sort of init script that will load it automatically on boot though. Does that sound about right?

Spork_Schivago · January 31, 2018, 2:52pm

Too late to edit.

I believe changing the options nvidia_drm modeset=1 in the conf file was the proper way to go, because when I cat /sys/module/nvidia_drm/parameters/modeset, I see Y.

I didn’t check before I made the modifications and rebuilt the initrd using plymouth-set-default-theme solar -R.
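
The sysfs check above can be wrapped in a small helper for repeated use. This is a sketch: `modparam` is a hypothetical name, and the optional third argument exists only so the function can be tried against a fake directory tree. Per the modprobe.d man page, modprobe makes no distinction between - and _ in module names, whereas the sysfs directory uses the underscore spelling, so the helper normalises the name.

```shell
#!/bin/sh
# modparam MODULE PARAM [SYSFS_ROOT]
# Print a loaded module's parameter value as exposed under sysfs.
# SYSFS_ROOT defaults to /sys (the extra argument is for testing only).
modparam() {
    # nvidia-drm and nvidia_drm name the same module; sysfs uses underscores.
    mod=$(printf '%s' "$1" | sed 's/-/_/g')
    file="${3:-/sys}/module/$mod/parameters/$2"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "module $mod not loaded, or no parameter $2" >&2
        return 1
    fi
}

# Example on a live system: modparam nvidia-drm modeset
```

If that normalisation holds, either spelling in the options line should end up setting the same parameter.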

It’s not a green screen I see during boot, but a gray screen, with three green bars that are a bit dark. Then as the boot process moves along, one will light up, then it’ll move to the second, then the third, like a progress indicator showing how far along I am in the boot process.

wolfi323 · January 31, 2018, 3:15pm

No. mkinitrd is just a script that calls dracut in the end, mainly for compatibility with older openSUSE versions I think.
But it determines the dracut options automatically, and might use some additional config files, /etc/sysconfig/kernel comes to mind.

In the end it shouldn’t matter much, as long as you pass/set the right options.

When the system boots up, I see a green screen with three little bars. The first bar is lit up. Then further along the boot process, the second bar lights up and the first turns off. Then finally the third

That’s plymouth’s text mode splash screen, that indicates that it cannot switch to graphics mode for some reason.

I do remember statements about plymouth not working at all with nvidia, the official nvidia rpm packages even had a package conflict with plymouth for some time and explicitly exclude it from the initrd now.
I suppose you installed the driver “the hard way”? (using the .run installer)

Finally, nvidia_uvm doesn’t appear to get loaded unless it’s needed, according to the nVidia documentation and some googling. I would like it to load automatically though. From what I understand, I’m going to have to create some sort of init script that will load it automatically on boot though. Does that sound about right?

There was an option in /etc/sysconfig/kernel for that.
According to dracut’s man page, the ‘--force-drivers’ option should do that (according to its log, mkinitrd does use that option on my system for the kernel modules specified in /etc/sysconfig/kernel ).

xorbe · January 31, 2018, 5:08pm

Thank you for the detailed explanation! [only quoted a tiny bit for reference]

Spork_Schivago · February 1, 2018, 1:09am

I’ve just been passing -f to dracut. However, with my dracut .conf file, it adds the three drivers, and includes the modprobe.d .conf file which sets modeset to 1 for the nvidia_drm module. Plymouth has been like this for a long time, even with the nouveau module. In fact, I cannot remember it ever showing anything different, except for when I was running the non-rolling distro of OpenSuSE instead of Tumbleweed.

Yes, originally, I used the RPM package but wanted a newer version. I followed the installation instructions on nVidia’s site and OpenSuSE’s how-to for installing the hard way, and did what those sites said. I installed DKMS so the driver will (hopefully) recompile when a newer kernel gets installed; I’ve read the nVidia modules depend on the full source, and it might not work (although I read this in an old post, so I’m hoping it no longer holds true).

I could only find older statements on Plymouth not working with nVidia cards, but this is what I found about that:


...
In order for Plymouth to work though, we need DRM/KMS. The NVIDIA documentation notes that it can be enabled using the “modeset” option of the nvidia-drm kernel module.
...

Even with nouveau, I get this result though, which I find odd. And it seems other distros don’t have this problem. From the posts I’ve read, plymouth was working fine in the non-rolling distribution of OpenSuSE…and people were asking why this wasn’t fixed in the rolling distro and what was causing it. Someone suggested that plymouth was getting loaded too early or something. I want to say that post was from around 2012 though. I can’t imagine a problem like that would continue for this long. So the DRM / KMS stuff made sense to me. And with the 364.x nVidia modules, we should now be able to use plymouth, at least that was my understanding.

--force-drivers will load it in the initramfs, right? But once I’m booted, will it still stay loaded? You probably don’t know what the /etc/sysconfig/kernel file should look like, do you? I am currently looking for the file format but don’t see anything about persistent modules.

The Red Hat Portal mentions that for persistent module loading, you create a shell script (like the nvidia one that I thought was an init.d startup script) in the /etc/sysconfig/modules/ directory, with the extension .modules, and during startup, it’ll be treated as a startup script. If this process is the same with OpenSuSE, I should be able to just drop that script I found earlier into that directory, reboot, and be good to go for the nvidia_uvm module.
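
On a systemd-based system such as Tumbleweed, another documented mechanism for loading a module at every boot is a file under /etc/modules-load.d/, which systemd-modules-load.service reads with one module name per line. The sketch below writes such a file to a scratch directory so that it is harmless to run. The file name nvidia-uvm.conf and the `DEST` variable are assumptions, and nobody in the thread tried this approach.

```shell
#!/bin/sh
# DEST is a hypothetical variable: it defaults to a scratch directory so the
# script is harmless to run; the real location would be /etc/modules-load.d.
DEST="${DEST:-$(mktemp -d)}"
mkdir -p "$DEST"

# One module name per line; lines starting with # are comments.
cat > "$DEST/nvidia-uvm.conf" <<'EOF'
# Load the NVIDIA unified memory module at boot
nvidia_uvm
EOF

echo "wrote $DEST/nvidia-uvm.conf"
```

Unlike an entry in the initramfs, a file like this takes effect after the switch to the real root filesystem, so it answers the "will it stay loaded once booted" part of the question and not the early-boot part.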

With the --force-drivers option for dracut (I saw it in the manpage as well), I can add that to my dracut .conf file and see if it makes a difference, but the man page says, as I’m sure you’re aware:


...
       --force-drivers <list of kernel modules>
           See add-drivers above. But in this case it is ensured that the
           drivers are **tried** to be loaded early via modprobe.
...

I’ve read it’s nvidia.ko that will load nvidia_uvm.ko if it “thinks” it’s needed. Is there a way to get to a boot shell with just the initramfs to see what modules have actually been loaded? Maybe pause the boot-up somehow, check the modules and various settings, continue the boot process when I’m done, and see if Plymouth loads properly?

Thank you for the help.

Spork_Schivago · February 1, 2018, 1:27am

This newsgroup(?) posting sheds some light on the problem…

http://opensuse.14.x6.nabble.com/Graphical-Plymouth-Themes-not-working-in-Tumbleweed-Text-Mode-Themes-works-fine-td5035966.html

Almost three years old now. But it gives me an idea what the /etc/sysconfig/kernel should look like, however, one person exclaims NOT!!! to do that.

Raymond says:

I’d think the dracut --force-drivers would do the same though, wouldn’t it? I don’t really want to use the -H option if I don’t have to. I’ll try the equivalent of the --force-drivers option in my dracut config file and see if it makes any difference.

Spork_Schivago · February 1, 2018, 2:05am

I’ve been going over this:

https://doc.opensuse.org/documentation/leap/reference/html/book.opensuse.reference/cha.boot.html

Which describes the boot process: how the bootloader loads the kernel and the initrd file into memory, starts the kernel, and tells the kernel where the initrd file is located in memory; the kernel then extracts it if it’s compressed, mounts it, and executes init (in my initrd file, it’s a symbolic link to systemd) from the initrd file.

If so, could I create a script in <initrd>:/usr/lib/systemd/scripts that calls bash? Or maybe a <initrd>:/usr/lib/systemd/system/pause_initrd.service file, which in turn calls a script that calls bash and pauses the startup somehow, would be the best way to go. I’m just not sure what service file I’d want to modify. I’d like it to be before plymouth is started, but after the modules have loaded. Maybe I could just modify the plymouth-start.service and either change ExecStart to my script or add my pause_initrd.service file to the Before section in the plymouth-start.service file…

I see that plymouth-start.service file has Type=Forking though, so it probably wouldn’t work. It’d just continue the boot process and fork a bash shell script, not really pause the boot process and let me see what’s going on.

arvidjaar · February 1, 2018, 4:32am

Would rd.break on the kernel command line suffice? See man dracut.kernel.

arvidjaar · February 1, 2018, 4:38am

If you run a dracut -H … then dracut will recognize the video driver required and include it in the initrd file.

However, as with many things, for the sake of compatibility with other programs the mkinitrd wrapper seems to be the culprit. I haven’t checked, but I guess that it calls dracut in such a way that it no longer checks which video driver is required.

I like “I have not checked this but I guess”. --host-only is the default dracut setting on openSUSE, even if you call it directly, without using the mkinitrd wrapper. mkinitrd has an option to disable it but otherwise follows the default.

Spork_Schivago · February 1, 2018, 5:16am

Even with Tumbleweed, this doesn’t break stuff? I skimmed what he said and missed the part about him not checking this. So where does he get it from?

If it can break stuff though, perhaps this is the reason people are having trouble with Plymouth, nVidia cards, and OpenSuSE distributions…I’ll troubleshoot and try the rd.break that you were talking about.

I just want to get a shell prompt after drivers are loaded from initramfs, but before systemd in the initramfs starts Plymouth. Otherwise, it might be pointless, depending on where I stop. If drivers haven’t loaded yet and Plymouth hasn’t started, I cannot rule out that the nVidia drivers aren’t being loaded properly before Plymouth starts. Or I cannot check to see that the modeset = 1 was passed to the nvidia_drm module (that’s what should allow me to see a graphical Plymouth theme using the nVidia card, from my understanding).

Gonna try rd.break now. I’ll let you know how it worked out.

For what it’s worth, adding the force_drivers+= " <three nvidia modules> " to the /etc/dracut.conf.d/nvidia.conf file I created does in fact force nvidia_uvm to be loaded, so that’s one issue down. I owe thanks to wolfi323 for that suggestion.

arvidjaar · February 1, 2018, 5:43am

It is next to impossible. The very reason for the problem with the Plymouth splash screen is that drivers are loaded asynchronously by udev and there is no explicit synchronization with Plymouth. Which means that by the time you get a shell prompt, the drivers are almost certainly already present.

This is rather hard to debug race condition, as any change in environment would change the timing. E.g. I usually observed the problem first time I started QEMU with TW, and on subsequent launch splash screen would display normally. VM loads faster, different timing - works …

Spork_Schivago · February 2, 2018, 9:25pm

Yes, it is rather hard to debug.

I got a shell, but like you said, it changed the environment, but also, the drivers had already loaded, and Plymouth systemd service had been executed.

I was able to see what modules were loaded at the shell, at least, and create a list of them. The nVidia drivers were loaded. I compared it to the modules listed after the system started. The ACPI video module wasn’t loaded, but I don’t think this causes any issues. I believe that just gives the status of video devices. Would that not being loaded cause this issue?
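
The comparison described above can be sketched with awk and comm on two saved lsmod outputs. The function names are hypothetical, and the sample data is made up for illustration (the sizes are not from a real system); on a real machine the two files would come from running lsmod at the rd.break shell and again after boot.

```shell
#!/bin/sh
# module_names FILE: print the module column of saved lsmod output, sorted.
module_names() {
    awk 'NR > 1 { print $1 }' "$1" | LC_ALL=C sort
}

# modules_only_in_second EARLY LATE: modules listed in LATE but not in EARLY.
modules_only_in_second() {
    early=$(mktemp)
    late=$(mktemp)
    module_names "$1" > "$early"
    module_names "$2" > "$late"
    LC_ALL=C comm -13 "$early" "$late"
    rm -f "$early" "$late"
}

# Made-up sample data standing in for the two saved lists.
workdir=$(mktemp -d)
cat > "$workdir/initramfs.txt" <<'EOF'
Module                  Size  Used by
nvidia_drm             45056  0
nvidia              13914112  1 nvidia_drm
EOF
cat > "$workdir/booted.txt" <<'EOF'
Module                  Size  Used by
video                  45056  0
nvidia_drm             45056  0
nvidia              13914112  1 nvidia_drm
EOF

modules_only_in_second "$workdir/initramfs.txt" "$workdir/booted.txt"
```

With this sample input the only module reported is video, mirroring the ACPI video observation above.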

Could you go a bit more in depth about your statement about udev and Plymouth splash screen?

Are you saying that’s the reason I get the text mode splash screen? Do you mean that Plymouth is being started by systemd before the nVidia drivers are loaded? If so, couldn’t I modify the plymouth-start.service file, create a script that manually calls modprobe nvidia (and the other nvidia drivers I need), and then start Plymouth?

I wrote a udev script that I use, so when I pop in a very specific thumb drive, it mounts just that thumb drive as read-only. I can just yank out the thumb drive and it’ll auto-lock the gnome desktop. But when it’s in, it prevents the gnome desktop from being locked. The udev script just executes a bash script I wrote that handles everything.

If I’m understanding the problem correctly, although this wouldn’t be a proper fix, it’d at least allow me to see the splash screen until something better comes along, wouldn’t it? Or am I totally misunderstanding the issue here?

By adding the force_drivers+= to my dracut nvidia.conf file, the nVidia drivers appear to get loaded earlier, but I noticed by reading the journal that they’re tainting the kernel because they’re not in the kernel subdirectory. When I was at the shell though, I saw they were loaded…dmesg shows:


[    5.092835] nvidia: loading out-of-tree module taints kernel.
[    5.107163] nvidia: module license 'NVIDIA' taints kernel.
[    5.121416] Disabling lock debugging due to kernel taint
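
For context on these messages: as far as I know, the out-of-tree taint depends on whether the module was built as part of the kernel source tree, not on the directory it is installed in, and the license taint applies to any module that does not declare a GPL-compatible license. If so, moving the files would not clear either one. The sketch below decodes the module-related bits of the kernel's taint mask; `decode_module_taint` is a hypothetical helper, and only three of the kernel's taint bits are checked.

```shell
#!/bin/sh
# decode_module_taint VALUE: name the module-related bits set in a taint mask.
decode_module_taint() {
    value=$1
    out=""
    if [ $(( value & 1 )) -ne 0 ]; then out="$out P"; fi            # bit 0: proprietary module
    if [ $(( (value >> 12) & 1 )) -ne 0 ]; then out="$out O"; fi    # bit 12: out-of-tree module
    if [ $(( (value >> 13) & 1 )) -ne 0 ]; then out="$out E"; fi    # bit 13: unsigned module
    if [ -z "$out" ]; then echo "none"; else echo "${out# }"; fi
}

# On a live system: decode_module_taint "$(cat /proc/sys/kernel/tainted)"
# 4097 = 4096 + 1, i.e. the out-of-tree and proprietary bits together.
decode_module_taint 4097
```

A proprietary driver loaded from any directory would be expected to set at least the P bit.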

I think I should probably fix that before continuing any further. I have to figure out how. I ran the installer with no options. I see if I run it with the --advanced-options, there’s this parameter that can be passed to the installer:


  --kernel-install-path=KERNEL-INSTALL-PATH
      The directory in which the NVIDIA kernel module should be
      installed.  The default value is either
      '/lib/modules/`uname -r`/kernel/drivers/video' (if
      '/lib/modules/`uname -r`/kernel' exists) or
      '/lib/modules/`uname -r`/video'.

The default values should have been used, which would have installed the nVidia modules in the /lib/modules/`uname -r`/kernel/drivers/video directory, because the /lib/modules/`uname -r`/kernel directory exists. For whatever reason though, my nVidia modules were installed in the /lib/modules/4.14.15-1-default/updates directory…4.14.15-1-default is the kernel I’m running right now. I’m wondering if the nVidia installer looks at /lib/modules/`uname -r`/modules.dep at all to see if previous drivers were installed, and if so, installs them to the same directory? And maybe the repo-based package installed them to the …/updates directory?

I wonder if first enabling the repo-based proprietary drivers created some configuration file that was never removed and that the installer is now reading, which makes it install them to the /lib/modules/`uname -r`/updates directory…
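
To see where the module files actually live, instead of inferring it from the installer's defaults, a small search over the modules tree is enough. This is a sketch: `list_module_files` is a hypothetical helper, and the optional second argument exists only so it can be pointed at a test directory. On a live system, modinfo -n nvidia also prints the file that modprobe would load.

```shell
#!/bin/sh
# list_module_files PATTERN [MODULES_DIR]
# Print every file under MODULES_DIR whose name matches PATTERN, sorted.
# MODULES_DIR defaults to the running kernel's module directory.
list_module_files() {
    dir="${2:-/lib/modules/$(uname -r)}"
    find "$dir" -type f -name "$1" 2>/dev/null | LC_ALL=C sort
}

# Matches nvidia.ko, nvidia_drm.ko and compressed variants such as .ko.xz.
list_module_files 'nvidia*.ko*'
```

If the same module shows up in more than one directory, that would be worth sorting out before rebuilding the initrd.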

Thanks for helping.