Failed VCE resume

Hi, all -

I have a Dell Latitude E5570 running OpenSuse 15.3.
It has dedicated graphics (AMD Radeon R7) as well as Intel integrated graphics.
Every time I boot, I get the following message:

[    4.604233] radeon 0000:01:00.0: failed VCE resume (-110).

I’ve looked over a bunch of forums, but I haven’t found a good solution. I tried the proprietary AMD driver, but it didn’t solve the issue. One “solution” left me with no gui and I had to go in on the CLI and edit a config file to restore functionality. Often, the issue is just left as “This is a known error.” Some people say to blacklist the radeon driver, but I suspect that this “fixes” the problem simply by getting rid of the error message. Anyway, hopefully that gives a general idea of what I’ve tried.

I’m pretty sure that all this means that my dedicated graphics are non-functional on OpenSuse…

Any ideas what I can do to get them running and get rid of this error??

Thanks much,

Philip

Hi
Switch to the amdgpu driver;


zypper in xf86-video-amdgpu

Create a blacklist file for radeon;


cat /etc/modprobe.d/50-radeon.conf 

blacklist radeon

Run mkinitrd to add the blacklist


mkinitrd

Then fire up YaST Bootloader and add the following kernel options;


amdgpu.si_support=0 amdgpu.cik_support=1

This will blacklist the radeon driver and move to the amdgpu one. (I have a mullins R3 with Leap 15.3)


 /sbin/lspci -nnk | egrep -A3 "VGA|Display|3D"
00:01.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Mullins [Radeon R3 Graphics] [1002:9850] (rev 40)
	Subsystem: Hewlett-Packard Company Device [103c:8305]
	Kernel driver in use: amdgpu
	Kernel modules: radeon, amdgpu
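
Note that the cat line above only displays the blacklist file; it does not create it. A minimal sketch of creating it, assuming a hypothetical helper name write_radeon_blacklist and the same file name as in the post:

```shell
# Hypothetical helper: writes the one-line radeon blacklist file.
# $1 = target directory (defaults to /etc/modprobe.d, which needs root).
write_radeon_blacklist() {
    dir="${1:-/etc/modprobe.d}"
    printf 'blacklist radeon\n' > "$dir/50-radeon.conf"
}

# As root, the sequence described in the post would then be roughly:
#   zypper in xf86-video-amdgpu
#   write_radeon_blacklist
#   mkinitrd
```

Taking the directory as an argument makes it possible to try the helper against a scratch directory before touching /etc/modprobe.d.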

Thanks for the super fast reply, Malcolm!

I did all these steps, and it got rid of the error message, but when I run this command to see what my active GPU is:

lspci -vnnn | perl -lne 'print if /^\d+\:.+(\[\S+\:\S+\])/' | grep VGA

From this forum:

https://unix.stackexchange.com/questions/16407/how-to-check-which-gpu-is-active-in-linux

I get this result:

00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 530 [8086:191b] (rev 06) (prog-if 00 [VGA controller])

Thoughts on this? What am I missing?

Thanks much,
Philip

Hi
Can you run the commands;


xrandr --listproviders

/sbin/lspci -nnk | egrep -A3 "VGA|Display|3D"

Duly run, here are the results.

xrandr --listproviders
Invalid MIT-MAGIC-COOKIE-1 keyProviders: number : 1
Provider 0: id: 0x49; cap: 0xf (Source Output, Sink Output, Source Offload, Sink Offload); crtcs: 3; outputs: 7; associated providers: 0; name: modesetting
    output eDP-1
    output DP-1
    output HDMI-1
    output DP-2
    output HDMI-2
    output DP-3
    output HDMI-3

lspci -nnk | egrep -A3 "VGA|Display|3D"
00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 530 [8086:191b] (rev 06)
        Subsystem: Dell Device [1028:06df]
        Kernel driver in use: i915
        Kernel modules: i915
--
01:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Mars [Radeon HD 8670A/8670M/8750M / R7 M370] [1002:6600] (rev 81)
        Subsystem: Dell Device [1028:06df]
        Kernel modules: radeon, amdgpu
02:00.0 Network controller [0280]: Intel Corporation Wireless 8260 [8086:24f3] (rev 3a)
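
In the output above, the AMD device lists kernel modules but has no "Kernel driver in use" line, so no driver is bound to it. A rough sketch for summarising that per GPU, assuming a hypothetical helper name gpu_drivers that reads lspci -nnk output on stdin:

```shell
# Hypothetical helper: prints "<slot>: <driver>" for each GPU-class device,
# or "<slot>: none" when no "Kernel driver in use" line follows it.
gpu_drivers() {
    awk '
        /VGA compatible controller|Display controller|3D controller/ {
            if (dev != "") print dev ": " drv   # flush the previous GPU
            dev = $1; drv = "none"; next
        }
        /^[0-9a-f]/ {                           # any other device header
            if (dev != "") print dev ": " drv
            dev = ""; next
        }
        dev != "" && /Kernel driver in use:/ { drv = $NF }
        END { if (dev != "") print dev ": " drv }
    '
}

# Usage on a live system:
#   /sbin/lspci -nnk | gpu_drivers
```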

Hi
Ahh, it’s GCN 1.0. AFAIK you would need the radeon driver then, so set the kernel options for radeon instead. I’m not sure whether it’s cik or si, so try both with a 1 or 0.


radeon.si_support=1 radeon.cik_support=0

You need to remove the blacklist file and run mkinitrd again to add radeon back.
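
When cycling through several option combinations, it can help to confirm after each reboot which values the running kernel actually received. A small sketch, assuming a hypothetical helper name kparam that reads a kernel command line on stdin:

```shell
# Hypothetical helper: prints the value of one key=value kernel parameter.
# $1 = parameter name, stdin = kernel command line text.
kparam() {
    tr ' ' '\n' | awk -F= -v key="$1" '$1 == key { print $2 }'
}

# Usage on a live system:
#   kparam radeon.si_support < /proc/cmdline
#   kparam radeon.cik_support < /proc/cmdline
```

An empty result means the parameter was not passed at boot, for example because the bootloader change was not saved.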

Alright, I tried all four combinations of cik and si.

This combination gave no error on bootup.

radeon.si_support=1 radeon.cik_support=0

Neither did this one.

radeon.si_support=0 radeon.cik_support=1

But this one:

radeon.si_support=0 radeon.cik_support=0

and this one did.

radeon.si_support=1 radeon.cik_support=1

So, I think that we can infer that

radeon.si_support=1

causes the error…
But, each time, the above referenced command

lspci -vnnn | perl -lne 'print if /^\d+\:.+(\[\S+\:\S+\])/' | grep VGA

shows that the integrated graphics are still active…

Correction.

These two combinations gave the “failed VCE resume” error on bootup.

radeon.si_support=1 radeon.cik_support=0
radeon.si_support=1 radeon.cik_support=1

These two did not.

radeon.si_support=0 radeon.cik_support=0
radeon.si_support=0 radeon.cik_support=1

Sorry for the confusion.

Interestingly enough, the two combinations listed above which do not cause the error on bootup DO cause the Radeon GPU to disappear from CoreCtrl!
And for both of the combinations which cause the “failed VCE resume” error, the Radeon GPU appears in CoreCtrl.

Also, another note:
Sometimes, the error is:


failed VCE resume (-110)

and sometimes it is:


failed VCE resume (-100)
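
To track which code shows up on a given boot, the number can be pulled out of the kernel log. The sketch below assumes the values are negated Linux errno numbers (which is the usual kernel convention, though the driver source would be the authority here); vce_codes and errno_name are hypothetical helper names, and only the two codes seen above are mapped:

```shell
# Hypothetical helper: stdin = kernel log text; prints each code found in
# a "failed VCE resume (N)" message, one per line.
vce_codes() {
    sed -n 's/.*failed VCE resume (\(-[0-9][0-9]*\)).*/\1/p'
}

# Hypothetical helper: maps the two codes from this thread to their names
# in the generic Linux errno numbering.
errno_name() {
    case "${1#-}" in
        110) echo ETIMEDOUT ;;
        100) echo ENETDOWN ;;
        *)   echo unknown ;;
    esac
}

# Usage on a live system (dmesg may need root):
#   dmesg | vce_codes
```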

Hi
So can you check what the lspci and xrandr commands show with the options where the GPU is present (radeon.si_support=1 radeon.cik_support=0)?

Sure thing, here are the results with

radeon.si_support=1 radeon.cik_support=0

xrandr --listproviders
Invalid MIT-MAGIC-COOKIE-1 keyProviders: number : 2
Provider 0: id: 0x6b; cap: 0xf (Source Output, Sink Output, Source Offload, Sink Offload); crtcs: 3; outputs: 7; associated providers: 1; name: modesetting
    output eDP-1
    output DP-1
    output HDMI-1
    output DP-2
    output HDMI-2
    output DP-3
    output HDMI-3
Provider 1: id: 0x41; cap: 0xf (Source Output, Sink Output, Source Offload, Sink Offload); crtcs: 2; outputs: 0; associated providers: 1; name: OLAND @ pci:0000:01:00.0

/sbin/lspci -nnk | egrep -A3 "VGA|Display|3D"
00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 530 [8086:191b] (rev 06)
        Subsystem: Dell Device [1028:06df]
        Kernel driver in use: i915
        Kernel modules: i915
--
01:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Mars [Radeon HD 8670A/8670M/8750M / R7 M370] [1002:6600] (rev 81)
        Subsystem: Dell Device [1028:06df]
        Kernel driver in use: radeon
        Kernel modules: radeon, amdgpu

Thanks much,
Philip
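
With the radeon driver bound, xrandr now reports a second provider (OLAND) next to the modesetting one, which is what render offload relies on. A short sketch for listing just the provider names, assuming a hypothetical helper name provider_names that reads xrandr --listproviders output on stdin:

```shell
# Hypothetical helper: prints the "name:" field of each "Provider N:" line.
provider_names() {
    sed -n 's/^Provider [0-9]*:.*name: *\(.*\)$/\1/p'
}

# Usage in a running X session:
#   xrandr --listproviders | provider_names
```

A single modesetting entry would mean the discrete GPU is not visible to X as an offload target.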

Hi
So you can offload to the secondary gpu ok?


DRI_PRIME=1 glxinfo | grep "OpenGL renderer"
or
DRI_PRIME=pci:0000:01:00.0 glxinfo | grep "OpenGL renderer"

I would be tempted to try the amdgpu again with the corresponding radeon settings;


amdgpu.si_support=1 amdgpu.cik_support=0

The VCE warning should disappear if you can switch to amdgpu.

I ran both commands you sent - here is the output.

> DRI_PRIME=1 glxinfo | grep "OpenGL renderer"
Invalid MIT-MAGIC-COOKIE-1 keyOpenGL renderer string: AMD OLAND (DRM 2.50.0, 5.3.18-59.27-default, LLVM 11.0.1)

> DRI_PRIME=pci:0000:01:00.0 glxinfo | grep "OpenGL renderer"
Invalid MIT-MAGIC-COOKIE-1 keyOpenGL renderer string: Mesa DRI Intel(R) HD Graphics 530 (SKL GT2)

I re-blacklisted radeon, ran mkinitrd, and added the kernel options

amdgpu.si_support=1 amdgpu.cik_support=0

That cleaned up the VCE error, but now there’s a new one.

     4.691081] kfd kfd: OLAND not supported in kfd

The Intel integrated graphics still show as active.

Here’s the output of the last two commands you sent, run under these new conditions:


>DRI_PRIME=1 glxinfo | grep "OpenGL renderer"
Invalid MIT-MAGIC-COOKIE-1 keyOpenGL renderer string: AMD Radeon (TM) R7 M370 (OLAND, DRM 3.39.0, 5.3.18-59.27-default, LLVM 11.0.1)
>DRI_PRIME=1 glxinfo | grep "OpenGL renderer"
Invalid MIT-MAGIC-COOKIE-1 keyOpenGL renderer string: AMD Radeon (TM) R7 M370 (OLAND, DRM 3.39.0, 5.3.18-59.27-default, LLVM 11.0.1)
philip@localhost:~> DRI_PRIME=pci:0000:01:00.0 glxinfo | grep "OpenGL renderer"
Invalid MIT-MAGIC-COOKIE-1 keyOpenGL renderer string: Mesa DRI Intel(R) HD Graphics 530 (SKL GT2)

Just FYI, I’ll be out of town tomorrow, so I may not have much time to work on this, but I’ll get back to you as soon as I can!
Thanks much for your help!

Philip
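
A possible explanation for the DRI_PRIME=pci:0000:01:00.0 runs above reporting the Intel renderer: as far as I know, Mesa documents the device form of DRI_PRIME using the udev ID_PATH_TAG spelling (pci-0000_01_00_0) and not a colon-separated PCI address, so a value it does not recognise would plausibly leave the default GPU selected. A minimal sketch of the conversion, where dri_prime_tag is a hypothetical helper name:

```shell
# Hypothetical helper: converts a PCI address such as 0000:01:00.0 into the
# pci-0000_01_00_0 form by replacing ':' and '.' with '_'.
dri_prime_tag() {
    printf 'pci-%s\n' "$(printf '%s' "$1" | tr ':.' '__')"
}

# Usage in a running X session:
#   DRI_PRIME="$(dri_prime_tag 0000:01:00.0)" glxinfo | grep "OpenGL renderer"
```

DRI_PRIME=1 already selects the Radeon in the output above, so this form mainly matters on machines with more than two GPUs.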

Hi
So see how it goes with other applications using the DRI_PRIME=1 option. The kfd message can be ignored AFAIK; I see it for TOPAZ on my dual-GPU HP laptop (that’s running SLED 15 SP3).

Ok, so I ran DRI_PRIME=1 as below:

>DRI_PRIME=1 glxinfo | grep "OpenGL renderer"
Invalid MIT-MAGIC-COOKIE-1 keyOpenGL renderer string: AMD Radeon (TM) R7 M370 (OLAND, DRM 3.39.0, 5.3.18-59.27-default, LLVM 11.0.1)

But when I check which gpu is active, I still get:

lspci -vnnn | perl -lne 'print if /^\d+\:.+(\[\S+\:\S+\])/' | grep VGA
00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 530 [8086:191b] (rev 06) (prog-if 00 [VGA controller])

In CoreCtrl, I can still see the Radeon GPU. Before, it was saying it was running at 300 MHz, but now it is running at 750 MHz. However, it constantly shows Power: 0W and Activity 0%.

And when I run glmark2, it begins by saying it is using the integrated graphics.

>glmark2
Invalid MIT-MAGIC-COOKIE-1 key=======================================================
    glmark2 2021.02
=======================================================
    OpenGL Information
    GL_VENDOR:     Intel Open Source Technology Center
    GL_RENDERER:   Mesa DRI Intel(R) HD Graphics 530 (SKL GT2)
    GL_VERSION:    3.0 Mesa 20.2.4
=======================================================
[build] use-vbo=false: FPS: 3552 FrameTime: 0.282 ms
[build] use-vbo=true: FPS: 3661 FrameTime: 0.273 ms
[texture] texture-filter=nearest: FPS: 3448 FrameTime: 0.290 ms
[texture] texture-filter=linear: FPS: 3393 FrameTime: 0.295 ms
[texture] texture-filter=mipmap: FPS: 3371 FrameTime: 0.297 ms
[shading] shading=gouraud: FPS: 3051 FrameTime: 0.328 ms
[shading] shading=blinn-phong-inf: FPS: 2826 FrameTime: 0.354 ms
[shading] shading=phong: FPS: 2677 FrameTime: 0.374 ms
[shading] shading=cel: FPS: 2757 FrameTime: 0.363 ms
[bump] bump-render=high-poly: FPS: 1986 FrameTime: 0.504 ms
[bump] bump-render=normals: FPS: 3693 FrameTime: 0.271 ms
[bump] bump-render=height: FPS: 3567 FrameTime: 0.280 ms
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 2011 FrameTime: 0.497 ms
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 1080 FrameTime: 0.926 ms
[pulsar] light=false:quads=5:texture=false: FPS: 3119 FrameTime: 0.321 ms
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 1091 FrameTime: 0.917 ms
[desktop] effect=shadow:windows=4: FPS: 1962 FrameTime: 0.510 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 872 FrameTime: 1.147 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 733 FrameTime: 1.364 ms
[buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 986 FrameTime: 1.014 ms
[ideas] speed=duration: FPS: 2383 FrameTime: 0.420 ms
[jellyfish] <default>: FPS: 2037 FrameTime: 0.491 ms
[terrain] <default>: FPS: 245 FrameTime: 4.082 ms
[shadow] <default>: FPS: 2151 FrameTime: 0.465 ms
[refract] <default>: FPS: 495 FrameTime: 2.020 ms
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 2817 FrameTime: 0.355 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 2866 FrameTime: 0.349 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 2818 FrameTime: 0.355 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 2824 FrameTime: 0.354 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 2893 FrameTime: 0.346 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 2828 FrameTime: 0.354 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 2813 FrameTime: 0.355 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 2864 FrameTime: 0.349 ms
=======================================================
                                  glmark2 Score: 2420 
=======================================================

Hi
You need to use the DRI_PRIME=1 part before the command :wink:

What desktop are you running?

I use GNOME as it integrates switcherooctl (albeit from the hardware repo) to use right-click to select the discrete graphics.
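
To spell out the "before the command" part: a VAR=value prefix only applies to the single command it is attached to, so running DRI_PRIME=1 glxinfo first has no effect on a glmark2 started afterwards. A minimal sketch, assuming a hypothetical wrapper name on_dgpu:

```shell
# Hypothetical wrapper: runs the given command with DRI_PRIME=1 set for
# that one process only; the surrounding shell is left unchanged.
on_dgpu() {
    DRI_PRIME=1 "$@"
}

# Usage in a running X session:
#   on_dgpu glmark2
#   on_dgpu glxinfo | grep "OpenGL renderer"
```

Using export DRI_PRIME=1 instead would affect every program started later from that shell, which may or may not be what is wanted on a laptop.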

:smiley: Yeah, I did run that command first, and it looks like it switches, but every time I run the second command, it appears that the integrated graphics are being used…

philip@localhost:~> DRI_PRIME=1 glxinfo | grep "OpenGL renderer"
OpenGL renderer string: AMD Radeon (TM) R7 M370 (OLAND, DRM 3.39.0, 5.3.18-59.34-default, LLVM 11.0.1)
philip@localhost:~> sudo lspci -vnnn | perl -lne 'print if /^\d+\:.+(\[\S+\:\S+\])/' | grep VGA
[sudo] password for root: 
00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 530 [8086:191b] (rev 06) (prog-if 00 [VGA controller])
philip@localhost:

I normally run KDE, but I use Gnome from time to time.

If I get a chance tomorrow, I’ll install switcherooctl and give it a shot!

Hi
That command will always show Intel, since only the Intel device has VGA in its lspci line; the AMD card is listed as Display… that’s why I use egrep "VGA|Display|3D"
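
An alternative to matching on the description text is to match the PCI class code that lspci -nn prints in brackets: 0300 for VGA compatible, 0302 for 3D, and 0380 for other display controllers. A small sketch, assuming a hypothetical helper name gpu_devices:

```shell
# Hypothetical helper: stdin = `lspci -nn` output; keeps only lines whose
# bracketed class code is one of the display classes 0300, 0302, or 0380.
gpu_devices() {
    grep -E '\[03(00|02|80)\]:'
}

# Usage on a live system:
#   /sbin/lspci -nn | gpu_devices
```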

Aah… So that’s a bit of good old user error on my part! :shame: :smiley:

Now, when I run glmark2 (after running the DRI_PRIME=1 command), it still shows

>glmark2
Invalid MIT-MAGIC-COOKIE-1 key=======================================================
    glmark2 2021.02
=======================================================
    OpenGL Information
    GL_VENDOR:     Intel Open Source Technology Center
    GL_RENDERER:   Mesa DRI Intel(R) HD Graphics 530 (SKL GT2)
    GL_VERSION:    3.0 Mesa 20.2.4
======================================================

And my assumption is that that means that we are still using integrated graphics.
Am I misinterpreting that as well?
I’ll get switcherooctl installed after work today. Once I figure out how to add that hardware repository.

Thanks much,
Philip