Hi,
the amdgpu driver won’t load at boot time. The error is:
3.054338] [drm] amdgpu kernel modesetting enabled.
3.109109] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/topaz_mc.bin failed with error -2
3.109110] cik_mc: Failed to load firmware "amdgpu/topaz_mc.bin"
3.109161] [drm:gmc_v7_0_sw_init [amdgpu]] *ERROR* Failed to load mc firmware!
3.109179] [drm:amdgpu_device_init [amdgpu]] *ERROR* sw_init of IP block <gmc_v7_0> failed -2
3.109180] amdgpu 0000:01:00.0: amdgpu_init failed
3.109181] amdgpu 0000:01:00.0: Fatal error during GPU init
3.109183] [drm] amdgpu: finishing device.
3.252124] amdgpu: probe of 0000:01:00.0 failed with error -2
However, if I run the following after boot has finished:
rmmod amdgpu && insmod /lib/modules/4.14.11-2.gc36893f-default/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko
the driver loads successfully:
2474.117342] [drm] amdgpu kernel modesetting enabled.
2474.117378] vga_switcheroo: detected switching method \_SB_.PCI0.GFX0.ATPX handle
2474.117661] ATPX version 1, functions 0x00000033
2474.117866] ATPX Hybrid Graphics
2474.118424] [drm] initializing kernel modesetting (TOPAZ 0x1002:0x6900 0x1028:0x0767 0xC3).
2474.118569] [drm] register mmio base: 0xD0200000
2474.118571] [drm] register mmio size: 262144
2474.118597] [drm] probing gen 2 caps for device 8086:9d10 = 1724843/e
2474.118599] [drm] probing mlw for device 8086:9d10 = 1724843
2474.118629] vga_switcheroo: enabled
2474.150152] ATOM BIOS: BR46858.006
2474.150165] [drm] GPU post is not needed
2474.150166] [drm] Changing default dispclk from 0Mhz to 600Mhz
2474.150273] [drm] vm size is 64 GB, block size is 13-bit, fragment size is 4-bit
2474.151245] amdgpu 0000:01:00.0: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
2474.151246] amdgpu 0000:01:00.0: GTT: 256M 0x0000000000000000 - 0x000000000FFFFFFF
2474.151253] [drm] Detected VRAM RAM=4096M, BAR=256M
2474.151254] [drm] RAM width 64bits GDDR5
2474.151289] [TTM] Zone kernel: Available graphics memory: 8122176 kiB
2474.151290] [TTM] Zone dma32: Available graphics memory: 2097152 kiB
2474.151290] [TTM] Initializing pool allocator
2474.151292] [TTM] Initializing DMA pool allocator
2474.151351] [drm] amdgpu: 4096M of VRAM memory ready
2474.151352] [drm] amdgpu: 4096M of GTT memory ready.
2474.151360] [drm] GART: num cpu pages 65536, num gpu pages 65536
2474.152053] [drm] PCIE GART of 256M enabled (table at 0x000000F400040000).
2474.152083] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
2474.152084] [drm] Driver supports precise vblank timestamp query.
2474.152124] amdgpu 0000:01:00.0: amdgpu: using MSI.
2474.152138] [drm] amdgpu: irq initialized.
2474.395735] amdgpu: [powerplay] amdgpu: powerplay sw initialized
2474.399108] amdgpu 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000000400080, cpu addr 0xffffaf3e82135080
2474.399342] amdgpu 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000000400100, cpu addr 0xffffaf3e82135100
2474.399572] amdgpu 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000000400180, cpu addr 0xffffaf3e82135180
2474.399778] amdgpu 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000000400200, cpu addr 0xffffaf3e82135200
2474.399933] amdgpu 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000000400280, cpu addr 0xffffaf3e82135280
2474.400119] amdgpu 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000400300, cpu addr 0xffffaf3e82135300
2474.400235] amdgpu 0000:01:00.0: fence driver on ring 6 use gpu addr 0x0000000000400380, cpu addr 0xffffaf3e82135380
2474.400327] amdgpu 0000:01:00.0: fence driver on ring 7 use gpu addr 0x0000000000400400, cpu addr 0xffffaf3e82135400
2474.400490] amdgpu 0000:01:00.0: fence driver on ring 8 use gpu addr 0x0000000000400480, cpu addr 0xffffaf3e82135480
2474.400548] amdgpu 0000:01:00.0: fence driver on ring 9 use gpu addr 0x0000000000400520, cpu addr 0xffffaf3e82135520
2474.402510] amdgpu 0000:01:00.0: fence driver on ring 10 use gpu addr 0x00000000004005a0, cpu addr 0xffffaf3e821355a0
2474.402670] amdgpu 0000:01:00.0: fence driver on ring 11 use gpu addr 0x0000000000400620, cpu addr 0xffffaf3e82135620
2474.421145] amdgpu: [powerplay] can't get the mac of 5
2474.426791] [drm] ring test on 0 succeeded in 18 usecs
2474.427673] [drm] ring test on 9 succeeded in 8 usecs
2474.427681] [drm] ring test on 1 succeeded in 2 usecs
2474.427736] [drm] ring test on 2 succeeded in 22 usecs
2474.427761] [drm] ring test on 3 succeeded in 10 usecs
2474.427790] [drm] ring test on 4 succeeded in 11 usecs
2474.427818] [drm] ring test on 5 succeeded in 11 usecs
2474.427846] [drm] ring test on 6 succeeded in 11 usecs
2474.427873] [drm] ring test on 7 succeeded in 10 usecs
2474.427901] [drm] ring test on 8 succeeded in 11 usecs
2474.427947] [drm] ring test on 10 succeeded in 7 usecs
2474.427954] [drm] ring test on 11 succeeded in 6 usecs
2474.428167] [drm] ib test on ring 0 succeeded
2474.428327] [drm] ib test on ring 1 succeeded
2474.428452] [drm] ib test on ring 2 succeeded
2474.428487] [drm] ib test on ring 3 succeeded
2474.428533] [drm] ib test on ring 4 succeeded
2474.428579] [drm] ib test on ring 5 succeeded
2474.428626] [drm] ib test on ring 6 succeeded
2474.428669] [drm] ib test on ring 7 succeeded
2474.428715] [drm] ib test on ring 8 succeeded
2474.428736] [drm] ib test on ring 9 succeeded
2474.428759] [drm] ib test on ring 10 succeeded
2474.428780] [drm] ib test on ring 11 succeeded
2474.430380] amdgpu 0000:01:00.0: kfd not supported on this ASIC
2474.430383] [drm] Initialized amdgpu 3.19.0 20150101 for 0000:01:00.0 on minor 0
2479.826611] amdgpu: [powerplay] VI should always have 2 performance levels
2479.879729] amdgpu 0000:01:00.0: GPU pci config reset
I am running Kernel 4.14.11-2.gc36893f-default
Any ideas how to make amdgpu load at boot time?
I suppose the firmware is missing from the initrd (at boot the driver is probably loaded before the root filesystem is mounted).
Try running “sudo mkinitrd” and see if it helps.
Or try disabling the boot splash with the “plymouth.enable=0” boot option; the graphics driver (amdgpu) should then be loaded later, hopefully when / is already mounted and available.
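To check the "firmware is missing from the initrd" theory directly, something like the following might help. It is only a sketch: the function names failed_firmware and check_firmware are made up for illustration, and it assumes the kernel log uses the "Direct firmware load for ... failed" wording shown above and that lsinitrd (part of dracut) is installed.

```shell
#!/bin/sh
# Print the firmware files the kernel failed to load, one per line.
# Reads kernel log text on stdin (e.g. the output of dmesg).
failed_firmware() {
    sed -n 's/.*Direct firmware load for \([^ ]*\) failed.*/\1/p' | sort -u
}

# For each firmware name on stdin, report whether it exists on disk
# and whether it appears in a saved initrd listing.
# $1: firmware directory, $2: file holding an initrd listing
check_firmware() {
    fwdir=$1
    listing=$2
    while read -r fw; do
        if [ -e "$fwdir/$fw" ]; then disk=yes; else disk=no; fi
        if grep -qF -- "$fw" "$listing"; then initrd=yes; else initrd=no; fi
        printf '%s on-disk=%s in-initrd=%s\n' "$fw" "$disk" "$initrd"
    done
}

# Example usage (as root; paths are assumptions based on this thread):
#   lsinitrd "/boot/initrd-$(uname -r)" > /tmp/initrd.lst
#   dmesg | failed_firmware | check_firmware /lib/firmware /tmp/initrd.lst
```

If a file shows on-disk=yes but in-initrd=no, the initrd is the likely culprit rather than the firmware package.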
Plymouth does not load anything; the driver is loaded by udev in response to hardware enumeration. It is possible that leaving plymouth out of the initrd will also skip adding the GPU drivers, but once the driver is in the initrd, disabling plymouth at boot is too late.
Thanks for the hint about mkinitrd. It is throwing errors:
dracut: *** Including module: drm ***
dracut: Possible missing firmware "i915/bxt_dmc_ver1_07.bin" for kernel module "i915.ko"
dracut: Possible missing firmware "i915/skl_dmc_ver1_26.bin" for kernel module "i915.ko"
dracut: Possible missing firmware "i915/kbl_dmc_ver1_01.bin" for kernel module "i915.ko"
dracut: Possible missing firmware "i915/kbl_guc_ver9_14.bin" for kernel module "i915.ko"
dracut: Possible missing firmware "i915/bxt_guc_ver8_7.bin" for kernel module "i915.ko"
dracut: Possible missing firmware "i915/skl_guc_ver6_1.bin" for kernel module "i915.ko"
dracut: Possible missing firmware "amdgpu/polaris11_smc_sk.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris11_smc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris10_smc_sk.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris10_smc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/fiji_smc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/tonga_k_smc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/tonga_smc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/topaz_k_smc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/topaz_smc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/topaz_mc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "radeon/hawaii_mc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "radeon/bonaire_mc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris10_mc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris11_mc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/tonga_mc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris10_rlc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris10_mec2.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris10_mec.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris10_me.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris10_pfp.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris10_ce.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris11_rlc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris11_mec2.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris11_mec.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris11_me.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris11_pfp.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris11_ce.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/fiji_rlc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/fiji_mec2.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/fiji_mec.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/fiji_me.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/fiji_pfp.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/fiji_ce.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/topaz_rlc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/topaz_mec.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/topaz_me.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/topaz_pfp.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/topaz_ce.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/tonga_rlc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/tonga_mec2.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/tonga_mec.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/tonga_me.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/tonga_pfp.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/tonga_ce.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/stoney_rlc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/stoney_mec.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/stoney_me.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/stoney_pfp.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/stoney_ce.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_rlc.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_mec2.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_mec.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_me.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_pfp.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_ce.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/topaz_sdma1.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/topaz_sdma.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris11_sdma1.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris11_sdma.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris10_sdma1.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris10_sdma.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/stoney_sdma.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/fiji_sdma1.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/fiji_sdma.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_sdma1.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_sdma.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/tonga_sdma1.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/tonga_sdma.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris11_uvd.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris10_uvd.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/stoney_uvd.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/fiji_uvd.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_uvd.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/tonga_uvd.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris11_vce.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/polaris10_vce.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/stoney_vce.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/fiji_vce.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/carrizo_vce.bin" for kernel module "amdgpu.ko"
dracut: Possible missing firmware "amdgpu/tonga_vce.bin" for kernel module "amdgpu.ko"
I checked /var/log/YaST2/mkinitrd.log and discovered that this error has been happening since kernel 4.14.8-1.g674981b-default.
mkinitrd creates 3 initrds (all of them complaining about missing firmware):
Creating initrd: /boot/initrd-4.14.11-1.g58fec0f-default
/boot/initrd-4.14.11-2.gc36893f-default
/boot/initrd-4.4.103-36-default
I removed the 2 older versions (via YaST), but mkinitrd still gives the same error.
The files clearly exist in /lib/firmware (e.g. the one I need):
-rw-r--r-- 1 root root 32100 Jan 4 16:06 /lib/firmware/amdgpu/topaz_mc.bin
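To see at a glance which of the files dracut complains about are actually installed, a small filter like this could be used. It is a rough sketch; the function name present_but_reported_missing is hypothetical, and it assumes the 'Possible missing firmware "..."' message format shown above.

```shell
#!/bin/sh
# Read dracut output on stdin and print each firmware file that dracut
# reported as possibly missing but that does exist under the given directory.
# $1: firmware directory (typically /lib/firmware)
present_but_reported_missing() {
    fwdir=$1
    sed -n 's/.*Possible missing firmware "\([^"]*\)".*/\1/p' | sort -u |
    while read -r fw; do
        if [ -e "$fwdir/$fw" ]; then
            printf '%s\n' "$fw"
        fi
    done
}

# Example usage (as root):
#   mkinitrd 2>&1 | present_but_reported_missing /lib/firmware
```

Any name it prints is a file that is on disk yet not picked up by dracut, which points at the dracut configuration rather than at a missing firmware package.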
I found the following files in /etc/dracut.conf.d/:
-rw-r--r-- 1 root root 100 Jan 1 22:44 amdgpu-4.14.0-rc4-1.g879f297-default.conf
-rw-r--r-- 1 root root 96 Dec 20 23:22 amdgpu-4.14.8-1.g674981b-default.conf
-rw-r--r-- 1 root root 100 Oct 13 17:57 amdgpu-pro-4.14.0-rc4-1.g879f297-default.conf
I removed them, and now mkinitrd builds without errors.
After a reboot the amdgpu driver loads at boot time:
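For anyone who would rather not delete such leftover files outright, a more cautious variant is to print them and move them aside. This is just a sketch; the function name stash_dracut_confs and the backup location are made up, and the glob has to be adapted to whatever stale files are present.

```shell
#!/bin/sh
# Print and then move dracut config snippets matching a glob into a
# backup directory instead of deleting them.
# $1: config dir, $2: backup dir, $3: glob pattern (e.g. 'amdgpu*.conf')
stash_dracut_confs() {
    confdir=$1
    backup=$2
    pattern=$3
    mkdir -p "$backup" || return 1
    # $pattern is left unquoted on purpose so the shell expands the glob
    for f in "$confdir"/$pattern; do
        [ -e "$f" ] || continue      # glob matched nothing
        echo "== $f"
        cat "$f"
        mv -- "$f" "$backup"/
    done
}

# Example usage (as root; the backup path is an arbitrary choice):
#   stash_dracut_confs /etc/dracut.conf.d /root/dracut-conf-backup 'amdgpu*.conf'
#   mkinitrd
```

Printing the contents first shows what the old snippets were telling dracut, and the backup makes it easy to put them back if the rebuilt initrd turns out worse.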
3.357611] [drm] amdgpu kernel modesetting enabled.
3.370320] AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
3.370323] AMD IOMMUv2 functionality not available on this system
3.426224] amdgpu 0000:01:00.0: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
3.426225] amdgpu 0000:01:00.0: GTT: 256M 0x0000000000000000 - 0x000000000FFFFFFF
3.430326] [drm] amdgpu: 4096M of VRAM memory ready
3.430327] [drm] amdgpu: 4096M of GTT memory ready.
3.431273] amdgpu 0000:01:00.0: amdgpu: using MSI.
3.431290] [drm] amdgpu: irq initialized.
3.675876] amdgpu: [powerplay] amdgpu: powerplay sw initialized
3.676101] amdgpu 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000000400080, cpu addr 0xffffbf40c2009080
3.676139] amdgpu 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000000400100, cpu addr 0xffffbf40c2009100
3.676207] amdgpu 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000000400180, cpu addr 0xffffbf40c2009180
3.676238] amdgpu 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000000400200, cpu addr 0xffffbf40c2009200
3.676258] amdgpu 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000000400280, cpu addr 0xffffbf40c2009280
3.676279] amdgpu 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000400300, cpu addr 0xffffbf40c2009300
3.676298] amdgpu 0000:01:00.0: fence driver on ring 6 use gpu addr 0x0000000000400380, cpu addr 0xffffbf40c2009380
3.676319] amdgpu 0000:01:00.0: fence driver on ring 7 use gpu addr 0x0000000000400400, cpu addr 0xffffbf40c2009400
3.676341] amdgpu 0000:01:00.0: fence driver on ring 8 use gpu addr 0x0000000000400480, cpu addr 0xffffbf40c2009480
3.676358] amdgpu 0000:01:00.0: fence driver on ring 9 use gpu addr 0x0000000000400520, cpu addr 0xffffbf40c2009520
3.676698] amdgpu 0000:01:00.0: fence driver on ring 10 use gpu addr 0x00000000004005a0, cpu addr 0xffffbf40c20095a0
3.676726] amdgpu 0000:01:00.0: fence driver on ring 11 use gpu addr 0x0000000000400620, cpu addr 0xffffbf40c2009620
3.688914] amdgpu: [powerplay] can't get the mac of 5
3.698751] amdgpu 0000:01:00.0: kfd not supported on this ASIC
3.698755] [drm] Initialized amdgpu 3.19.0 20150101 for 0000:01:00.0 on minor 1
10.976204] amdgpu: [powerplay] VI should always have 2 performance levels
11.021611] amdgpu 0000:01:00.0: GPU pci config reset
37.261153] amdgpu: [powerplay] can't get the mac of 5
43.815630] amdgpu: [powerplay] VI should always have 2 performance levels
43.878213] amdgpu 0000:01:00.0: GPU pci config reset