radeon segfaults on startup (R5 M230)

I have both Intel HD graphics and a discrete Radeon GPU in my laptop, but I can’t use the latter. The Xorg log ends with the following:

   448.629] (II) Loading sub module "glamoregl"
   448.629] (II) LoadModule: "glamoregl"
   448.629] (II) Loading /usr/lib64/xorg/modules/libglamoregl.so
   448.637] (II) Module glamoregl: vendor="X.Org Foundation"
   448.637]    compiled for 1.20.3, module version = 1.0.1
   448.637]    ABI class: X.Org ANSI C Emulation, version 0.4
   448.697] (EE) modeset(G0): glamor: Failed to create GL or GLES2 contexts
   448.697] (EE)  
   448.697] (EE) Backtrace:
   448.699] (EE) 0: /usr/bin/X (xorg_backtrace+0x79) [0x56284fdd9759]
   448.699] (EE) 1: /usr/bin/X (0x56284fc27000+0x1b6429) [0x56284fddd429]
   448.699] (EE) 2: /lib64/libpthread.so.0 (0x7f8097f75000+0x11f70) [0x7f8097f86f70]
   448.699] (EE) 3: /usr/lib64/dri/radeonsi_dri.so (0x7f808e126000+0x74ec86) [0x7f808e874c86]
   448.699] (EE) 4: /usr/lib64/dri/radeonsi_dri.so (0x7f808e126000+0x42da31) [0x7f808e553a31]
   448.699] (EE) 5: /usr/lib64/dri/radeonsi_dri.so (0x7f808e126000+0x42da55) [0x7f808e553a55]
   448.699] (EE) 6: /usr/lib64/dri/radeonsi_dri.so (0x7f808e126000+0x42b22f) [0x7f808e55122f]
   448.699] (EE) 7: /usr/lib64/libgbm.so.1 (0x7f8094005000+0x4750) [0x7f8094009750]
   448.699] (EE) 8: /usr/lib64/xorg/modules/libglamoregl.so (0x7f808efb0000+0x7f63) [0x7f808efb7f63]
   448.699] (EE) 9: /usr/lib64/xorg/modules/libglamoregl.so (glamor_egl_init+0x1a5) [0x7f808efb9385]
   448.699] (EE) 10: /usr/lib64/xorg/modules/drivers/modesetting_drv.so (0x7f809650a000+0xc1e2) [0x7f80965161e2]
   448.700] (EE) 11: /usr/bin/X (0x56284fc27000+0xb6079) [0x56284fcdd079]
   448.700] (EE) 12: /usr/bin/X (0x56284fc27000+0xbba81) [0x56284fce2a81]
   448.700] (EE) 13: /usr/bin/X (0x56284fc27000+0xb8309) [0x56284fcdf309]
   448.700] (EE) 14: /usr/bin/X (0x56284fc27000+0xb895b) [0x56284fcdf95b]
   448.700] (EE) 15: /usr/bin/X (config_init+0x9) [0x56284fcde369]
   448.700] (EE) 16: /usr/bin/X (InitInput+0xc7) [0x56284fcc1eb7]
   448.700] (EE) 17: /usr/bin/X (0x56284fc27000+0x5f4a3) [0x56284fc864a3]
   448.700] (EE) 18: /lib64/libc.so.6 (__libc_start_main+0xeb) [0x7f8097bd7feb]
   448.700] (EE) 19: /usr/bin/X (_start+0x2a) [0x56284fc7022a]
   448.700] (EE)  
   448.700] (EE) Segmentation fault at address 0x28
   448.700] (EE)  
Fatal server error:
   448.700] (EE) Caught signal 11 (Segmentation fault). Server aborting
   448.700] (EE)  
   448.700] (EE)  
Please consult the The X.Org Foundation support  
         at http://wiki.x.org
 for help.  
   448.700] (EE) Please also check the log file at "/var/log/Xorg.1.log" for additional information.
   448.700] (EE)  
   448.700] (II) AIGLX: Suspending AIGLX clients for VT switch
   448.765] (EE) Server terminated with error (1). Closing log file.

leaving me to boot with only integrated graphics (which is less than ideal).

cat /sys/kernel/debug/vgaswitcheroo/switch

indicates that both are powered on, but when I try to switch to the discrete one, I’m met with a blank screen after restarting X. Is this a bug on someone’s side, a simple lack of support, or an error on my side?

Hi and welcome to the forum :slight_smile:
So can you post the output from;


/sbin/lspci -nnk | egrep -A3 "VGA|Display|3D"

Also can you confirm the xf86-video-amdgpu and kernel-firmware packages are installed?


zypper se -i xf86-video-amdgpu kernel-firmware 

For your card, it’s a SI card I think (or CIK card?), therefore some kernel boot options need setting, but will confirm with above output.


amdgpu.si_support=1 radeon.si_support=0

You might also need amdgpu.dc=0 (or even 1).
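
To double-check that the options above actually reached the running kernel, something like the following sketch may help. The function name `has_param` is made up for illustration. It reads `/proc/cmdline` unless a command line string is passed as a second argument.

```shell
#!/bin/sh
# has_param PARAM [CMDLINE]
# Succeeds if PARAM (e.g. amdgpu.si_support=1) appears as a whole word
# in CMDLINE. Without a second argument, the command line of the
# running kernel is read from /proc/cmdline.
# (has_param is a hypothetical helper name, not an existing tool.)
has_param() {
    param=$1
    if [ $# -ge 2 ]; then
        cmdline=$2
    else
        cmdline=$(cat /proc/cmdline 2>/dev/null) || cmdline=
    fi
    # Pad with spaces so only complete, space-separated words match.
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `has_param amdgpu.si_support=1 && echo set` prints `set` only if that exact option is on the current kernel command line.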

I hope you don’t mind me starting from the middle. It turned out I didn’t have xf86-video-amdgpu installed (I may have removed it while trying to debug this earlier), so I installed that package. Next, I figured it was CIK because it’s listed with Rx 200 series cards on Wikipedia, which turned out to be wrong, because the radeon module still got loaded despite setting radeon.cik_support=0 amdgpu.cik_support=1. However, this time when launching glxgears with DRI_PRIME=1 I got a different error. Relevant logs below (it didn’t update Xorg.1.log):


luka@vostro:~$ /sbin/lspci -nnk | egrep -A3 "VGA|Display|3D"
00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 620 [8086:5916] (rev 02)
        DeviceName:  Onboard IGD
        Subsystem: Dell Device [1028:0794]
        Kernel driver in use: i915
--
01:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Sun LE [Radeon HD 8550M / R5 M230] [1002:666f]
        Subsystem: Dell Device [1028:0794]
        Kernel driver in use: radeon
        Kernel modules: radeon, amdgpu
luka@vostro:~$  DRI_PRIME=1 glxgears
radeon: Failed to allocate virtual address for buffer:
radeon:    size      : 65536 bytes
radeon:    alignment : 4096 bytes
radeon:    domains   : 4
radeon:    va        : 0x0000000100000000
radeon: Failed to deallocate virtual address for buffer:
radeon:    size      : 65536 bytes
radeon:    va        : 0x100000000
radeon: Failed to allocate virtual address for buffer:
radeon:    size      : 65536 bytes
radeon:    alignment : 4096 bytes
radeon:    domains   : 4
radeon:    va        : 0x0000000100000000
radeon: Failed to deallocate virtual address for buffer:
radeon:    size      : 65536 bytes
radeon:    va        : 0x100000000
radeonsi: Failed to create a context.
radeon: Failed to allocate virtual address for buffer:
radeon:    size      : 65536 bytes
radeon:    alignment : 4096 bytes
radeon:    domains   : 4
radeon:    va        : 0x0000000100000000
radeon: Failed to deallocate virtual address for buffer:
radeon:    size      : 65536 bytes
radeon:    va        : 0x100000000
radeon: Failed to allocate virtual address for buffer:
radeon:    size      : 65536 bytes
radeon:    alignment : 4096 bytes
radeon:    domains   : 4
radeon:    va        : 0x0000000100000000
radeon: Failed to deallocate virtual address for buffer:
radeon:    size      : 65536 bytes
radeon:    va        : 0x100000000
radeonsi: Failed to create a context.
X Error of failed request:  BadValue (integer parameter out of range for operation)
  Major opcode of failed request:  151 (GLX)
  Minor opcode of failed request:  3 (X_GLXCreateContext)
  Value in failed request:  0x0
  Serial number of failed request:  33
  Current serial number in output stream:  35
luka@vostro:~$ dmesg | egrep 'amdgpu|radeon'
    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.19.5-1-default root=UUID=444127e6-4233-42a4-972c-c66b7c6177cc resume=/dev/disk/by-uuid/40d2c368-5373-425e-ae9a-b5f1763efb05 splash=silent quiet showopts radeon.cik_support=0 amdgpu.cik_support=1
    0.102645] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.19.5-1-default root=UUID=444127e6-4233-42a4-972c-c66b7c6177cc resume=/dev/disk/by-uuid/40d2c368-5373-425e-ae9a-b5f1763efb05 splash=silent quiet showopts radeon.cik_support=0 amdgpu.cik_support=1
    3.354945] [drm] radeon kernel modesetting enabled.
    3.355296] radeon 0000:01:00.0: enabling device (0000 -> 0003)
    3.400074] radeon 0000:01:00.0: VRAM: 2048M 0x0000000000000000 - 0x000000007FFFFFFF (2048M used)
    3.400076] radeon 0000:01:00.0: GTT: 2048M 0x0000000080000000 - 0x00000000FFFFFFFF
    3.400221] [drm] radeon: 2048M of VRAM memory ready
    3.400222] [drm] radeon: 2048M of GTT memory ready.
    3.409905] [drm] radeon: dpm initialized
    3.417891] radeon 0000:01:00.0: WB enabled
    3.417893] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr 0x00000000b1bb0248
    3.417895] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04 and cpu addr 0x0000000089edaf71
    3.417896] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08 and cpu addr 0x000000000d2c14f2
    3.417897] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr 0x000000000dcd147a
    3.417898] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10 and cpu addr 0x0000000080ce9bf1
    3.417901] radeon 0000:01:00.0: radeon: MSI limited to 32-bit
    3.417944] radeon 0000:01:00.0: radeon: using MSI.
    3.417961] [drm] radeon: irq initialized.
    3.636147] [drm] Initialized radeon 2.50.0 20080528 for 0000:01:00.0 on minor 1
    3.699829] [drm] amdgpu kernel modesetting enabled.
   44.710324] radeon 0000:01:00.0: WB enabled
   44.710326] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr 0x00000000b1bb0248
   44.710328] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04 and cpu addr 0x0000000089edaf71
   44.710329] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08 and cpu addr 0x000000000d2c14f2
   44.710330] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr 0x000000000dcd147a
   44.710331] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10 and cpu addr 0x0000000080ce9bf1
   45.344288] [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x850C)=0xCAFEDEAD)
   45.344304] [drm:si_resume [radeon]] *ERROR* si startup failed on resume
   84.041446] [drm:atom_op_jump [radeon]] *ERROR* atombios stuck in loop for more than 5secs aborting
   84.041458] [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing 6A9E (len 254, WS 0, PS 4) @ 0x6AAC
   84.041466] [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing 6336 (len 78, WS 12, PS 8) @ 0x636F
   84.460251] radeon 0000:01:00.0: Wait for MC idle timedout !
   84.666244] radeon 0000:01:00.0: Wait for MC idle timedout !
   84.675780] radeon 0000:01:00.0: WB enabled
   84.675782] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr 0x00000000b1bb0248
   84.675783] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04 and cpu addr 0x0000000089edaf71
   84.675785] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08 and cpu addr 0x000000000d2c14f2
   84.675786] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr 0x000000000dcd147a
   84.675787] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10 and cpu addr 0x0000000080ce9bf1
   85.308452] [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x850C)=0xCAFEDEAD)
   85.308467] [drm:si_resume [radeon]] *ERROR* si startup failed on resume

When setting radeon.si_support=0 amdgpu.si_support=1, I got a segfault again, but this time with amdgpu in use. Setting amdgpu.dc to either 0 or 1 didn’t seem to make a difference. This did update the Xorg.1.log:


luka@vostro:~$ /sbin/lspci -nnk | egrep -A3 "VGA|Display|3D"
00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 620 [8086:5916] (rev 02)
        DeviceName:  Onboard IGD
        Subsystem: Dell Device [1028:0794]
        Kernel driver in use: i915
--
01:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Sun LE [Radeon HD 8550M / R5 M230] [1002:666f]
        Subsystem: Dell Device [1028:0794]
        Kernel driver in use: amdgpu
        Kernel modules: radeon, amdgpu
luka@vostro:~$ DRI_PRIME=1 glxgears
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
Segmentation fault (core dumped)
luka@vostro:~$ dmesg | egrep 'amdgpu|radeon'
    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.19.5-1-default root=UUID=444127e6-4233-42a4-972c-c66b7c6177cc resume=/dev/disk/by-uuid/40d2c368-5373-425e-ae9a-b5f1763efb05 splash=silent quiet showopts radeon.si_support=0 amdgpu.si_support=1 amdgpu.dc=0
    0.102889] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.19.5-1-default root=UUID=444127e6-4233-42a4-972c-c66b7c6177cc resume=/dev/disk/by-uuid/40d2c368-5373-425e-ae9a-b5f1763efb05 splash=silent quiet showopts radeon.si_support=0 amdgpu.si_support=1 amdgpu.dc=0
    3.779872] [drm] radeon kernel modesetting enabled.
    3.780377] radeon 0000:01:00.0: enabling device (0000 -> 0003)
    3.780601] radeon 0000:01:00.0: SI support disabled by module param
    3.875443] [drm] amdgpu kernel modesetting enabled.
    4.126438] amdgpu 0000:01:00.0: kfd not supported on this ASIC
    4.180337] amdgpu 0000:01:00.0: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
    4.180338] amdgpu 0000:01:00.0: GART: 256M 0x0000000000000000 - 0x000000000FFFFFFF
    4.180518] [drm] amdgpu: 2048M of VRAM memory ready
    4.180519] [drm] amdgpu: 2856M of GTT memory ready.
    4.181285] amdgpu 0000:01:00.0: PCIE GART of 256M enabled (table at 0x000000F400000000).
    4.181464] [drm] amdgpu: dpm initialized
    4.827827] [drm] Initialized amdgpu 3.27.0 20150101 for 0000:01:00.0 on minor 1
   52.022364] amdgpu 0000:01:00.0: PCIE GART of 256M enabled (table at 0x000000F400000000).
   52.958217] glxgears[1744]: segfault at 1a ip 00007f09104a49fd sp 00007fff9c43b260 error 4 in radeonsi_dri.so[7f09102db000+7e8000]
luka@vostro:/var/log$ tail -n 42 Xorg.1.log
   162.886] (II) LoadModule: "glamoregl"
   162.886] (II) Loading /usr/lib64/xorg/modules/libglamoregl.so
   162.893] (II) Module glamoregl: vendor="X.Org Foundation"
   162.893]    compiled for 1.20.3, module version = 1.0.1
   162.893]    ABI class: X.Org ANSI C Emulation, version 0.4
   162.950] (EE) modeset(G0): glamor: Failed to create GL or GLES2 contexts
   162.950] (EE) 
   162.950] (EE) Backtrace:
   162.954] (EE) 0: /usr/bin/X (xorg_backtrace+0x79) [0x55d5a04ed759]
   162.954] (EE) 1: /usr/bin/X (0x55d5a033b000+0x1b6429) [0x55d5a04f1429]
   162.954] (EE) 2: /lib64/libpthread.so.0 (0x7f74973d5000+0x11f70) [0x7f74973e6f70]
   162.954] (EE) 3: /usr/lib64/dri/radeonsi_dri.so (0x7f7491582000+0x74ec86) [0x7f7491cd0c86]
   162.954] (EE) 4: /usr/lib64/dri/radeonsi_dri.so (0x7f7491582000+0x42da31) [0x7f74919afa31]
   162.954] (EE) 5: /usr/lib64/dri/radeonsi_dri.so (0x7f7491582000+0x42da55) [0x7f74919afa55]
   162.954] (EE) 6: /usr/lib64/dri/radeonsi_dri.so (0x7f7491582000+0x42b22f) [0x7f74919ad22f]
   162.954] (EE) 7: /usr/lib64/libgbm.so.1 (0x7f749240c000+0x4750) [0x7f7492410750]
   162.954] (EE) 8: /usr/lib64/xorg/modules/libglamoregl.so (0x7f749241c000+0x7f63) [0x7f7492423f63]
   162.954] (EE) 9: /usr/lib64/xorg/modules/libglamoregl.so (glamor_egl_init+0x1a5) [0x7f7492425385]
   162.954] (EE) 10: /usr/lib64/xorg/modules/drivers/modesetting_drv.so (0x7f7495960000+0xc1e2) [0x7f749596c1e2]
   162.954] (EE) 11: /usr/bin/X (0x55d5a033b000+0xb6079) [0x55d5a03f1079]
   162.954] (EE) 12: /usr/bin/X (0x55d5a033b000+0xbba81) [0x55d5a03f6a81]
   162.954] (EE) 13: /usr/bin/X (0x55d5a033b000+0xb8309) [0x55d5a03f3309]
   162.954] (EE) 14: /usr/bin/X (0x55d5a033b000+0xb895b) [0x55d5a03f395b]
   162.955] (EE) 15: /usr/bin/X (config_init+0x9) [0x55d5a03f2369]
   162.955] (EE) 16: /usr/bin/X (InitInput+0xc7) [0x55d5a03d5eb7]
   162.955] (EE) 17: /usr/bin/X (0x55d5a033b000+0x5f4a3) [0x55d5a039a4a3]
   162.955] (EE) 18: /lib64/libc.so.6 (__libc_start_main+0xeb) [0x7f7497037feb]
   162.955] (EE) 19: /usr/bin/X (_start+0x2a) [0x55d5a038422a]
   162.955] (EE) 
   162.955] (EE) Segmentation fault at address 0x28
   162.955] (EE) 
Fatal server error:
   162.955] (EE) Caught signal 11 (Segmentation fault). Server aborting
   162.955] (EE) 
   162.955] (EE) 
Please consult the The X.Org Foundation support 
         at http://wiki.x.org
 for help. 
   162.955] (EE) Please also check the log file at "/var/log/Xorg.1.log" for additional information.
   162.955] (EE) 
   162.955] (II) AIGLX: Suspending AIGLX clients for VT switch
   163.012] (EE) Server terminated with error (1). Closing log file.

I can’t pinpoint what exactly is going wrong in either case.

Hi
Looks like it’s a matter of the correct combination…

So try them all, eg;


radeon.cik_support=0 radeon.si_support=1 amdgpu.cik_support=0 amdgpu.si_support=1

Can you also cat the output from switcheroo (it didn’t appear in your first post).

Your xorg log should be down in ~/.local/share/xorg/?

Apologies for late reply – stuff got in the way.

The only change with different combinations is which driver gets loaded (radeon or amdgpu), and the error message that goes with it (a segfault for amdgpu and a failure to allocate a buffer for radeon). Also, when both are enabled for SI, it seems that radeon takes precedence.
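
When cycling through several parameter combinations, it can be handy to pull just the bound driver out of the `lspci -nnk` output instead of reading the whole listing. A minimal sketch, assuming the output layout shown earlier in this thread; `driver_in_use` is a made-up helper name:

```shell
#!/bin/sh
# driver_in_use SLOT
# Reads `lspci -nnk` output on stdin and prints the value of
# "Kernel driver in use" for the device at SLOT (e.g. 01:00.0).
# (driver_in_use is a hypothetical helper name, not an existing tool.)
driver_in_use() {
    awk -v slot="$1" '
        # Device header lines start with the slot, e.g. "01:00.0 Display ..."
        /^[0-9a-f]/ { current = $1 }
        # Indented detail lines belong to the most recent header line.
        /Kernel driver in use:/ && current == slot { print $NF }
    '
}
```

Usage would be along the lines of `/sbin/lspci -nnk | driver_in_use 01:00.0`, which should print `radeon` or `amdgpu` depending on which module claimed the card.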

Here’s the switcheroo output:

**vostro:/sys/kernel/debug/vgaswitcheroo #** cat switch  
0:IGD:+:Pwr:0000:00:02.0
1:DIS: :DynOff:0000:01:00.0

After I run a command with DRI_PRIME=1, I get a brief moment of DynPwr, and then it’s back to DynOff. If radeon.runpm is set to 0, DIS always has status Pwr, but the errors persist, and it’s never the active GPU.
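
For watching these state changes, the switch file can be summarized with a short shell function. This is only a sketch: the field layout (id, name, active flag, power state, PCI address) is assumed from the two lines shown above, and `parse_switch` is a made-up name.

```shell
#!/bin/sh
# parse_switch
# Reads vgaswitcheroo "switch" lines on stdin and prints one summary
# line per GPU. Assumed field layout, taken from the output above:
#   id:name:active-flag:power-state:pci-address
# (parse_switch is a hypothetical helper name, not an existing tool.)
parse_switch() {
    # The last variable receives the rest of the line, so the colons
    # inside the PCI address are preserved.
    while IFS=: read -r id name active power pci; do
        [ -n "$id" ] || continue
        if [ "$active" = "+" ]; then
            state=active
        else
            state=inactive
        fi
        echo "$name ($pci): $state, power=$power"
    done
}
```

Run as root, `parse_switch < /sys/kernel/debug/vgaswitcheroo/switch` would turn the listing above into one readable line per GPU.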

I actually do have Xorg log in ~/.local/share/xorg, but it hasn’t been updated recently, I suspect that it’s a leftover from some older distro I had installed. Logs in /var/log are more or less current (Xorg in particular seems not to be flushing as often).

Hi
So if you set to amdgpu only with SI support (Since it’s a GCN 1.0 card);


radeon.si_support=0 amdgpu.si_support=1 exp_hw_support=1 amdgpu.dc=0 amdgpu.dpm=0

Also try without exp_hw_support and also with amdgpu.dc=1

If you also start the switcheroo service, does the output change?


systemctl start switcheroo-control.service
systemctl status switcheroo-control.service

I’ve tried without amdgpu.dc, with it set to 1 and with it set to 0, and with and without exp_hw_support to no avail. I keep hitting the same wall. Starting the switcheroo service didn’t help – all the outputs are the same.

I do have multiple distros with various (older) kernel versions lying around, so I’ll try finding out if any of them works tomorrow.

Hi
You might have to stick with radeon :frowning: and try radeon.dpm=0.

I don’t mind either radeon or amdgpu, as long as it works, which so far hasn’t been the case. None of the power management options (including .dpm) seem to have any effect. Curiously enough, glxinfo works on amdgpu (i.e. DRI_PRIME=1 glxinfo runs successfully, unlike e.g. glxgears which segfaults and rstudio which doesn’t render), but not on radeon.

Hi
So you’re at the latest kernel now, 4.19.7 (there were a few amdgpu fixes)? You’re sure it’s running Xorg and not Wayland (echo $XDG_SESSION_TYPE)?

It might be time to look at a bug report…

I’m running the latest kernel (updated last night), and echo $XDG_SESSION_TYPE prints x11, so all assumptions check out. I’ll spend some time digging around to see if I might have missed something and experimenting some more, and if nothing works I’ll file a report. Thank you anyway!