NVIDIA - lnvhw script - "X running" failure

James - I hope this will catch your attention, mentioning one of your finer works usually does ;)

I am back from a 3 month holiday and did a significant update to my 12.1/KDE 4.8 Nvidia test system, as expected.
It included a new kernel, so I tried to run “lnvhw” ver 1.45, which I have installed.

The new kernel I have is 3.1.10-1.16-desktop, which arrived in August, as I recall.
After running “upgrade all rpms with newer versions” in YaST, I rebooted to install this kernel, knowing that I would have to run lnvhw from level 3 to recompile for the new kernel.

I rebooted with systemd, then with systemV, specifying “3” in the options line, and both times lnvhw failed with an “X server running” error message.
Executing “init 3” as root did not help.
So, frustrated, as root, I blew away /tmp/.X1-lock and ran lnvhw. The new kernel modules were created and life is good again at level 5 after reboot.
The time stamp on the /tmp/.X1-lock file seemed current to the recent reboot; it was not left over from an older session.

Any thoughts on why X appears to be starting even when level 3 is specified?

So, I am not sure how X could be running when at Run Level 3. Of course, in systemd, it's not really Run Level 3 like it used to be. Also, I cannot recommend switching to systemV, and switching back and forth is very likely to get something confused. So, not an explanation, but as a help to anyone needing to use Run Level 3 to load the nVIDIA driver, I have put together this blog on the subject:

How to Start openSUSE 12.2 with Grub 2 into Run Level 3 - Blogs - openSUSE Forums

I can say that I did have issues at first with systemd in openSUSE 12.1, which seemed to get corrected later, so a new install of openSUSE 12.1 that has not yet been updated might show the problem you describe. I normally add the nomodeset option manually to get openSUSE to load with some sort of display. Do a full distro update, which may add fixes to systemd, and then go for installing the latest nVIDIA driver. This seems to have worked for me.

Thank You,

Thanks James, interesting Blog.
I’ll bookmark it for reference when 12.2 arrives next week.

What particularly bothered me was that, as root, “init 3” did not seem to remove the lock file /tmp/.X1-lock.

For other readers, removing the lock file did permit lnvhw to run and did reinstall the Nvidia driver for me (a new kernel had been downloaded).
I can’t promise that this will work in all situations.

So it's interesting to read about this. With X running, I find that I have a hidden file called /tmp/.X0-lock, which makes me wonder if you have more than one video card or monitor connected?

Thank You,

This particular PC is my test PC, here are the contents of /etc/X11/xorg.conf:

cat /etc/X11/xorg.conf
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 295.53  (buildmeister@swio-display-x86-rhel47-07.nvidia.com)  Sat May 12 00:34:20 PDT 2012


Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0" 0 0
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection


I see no reason for X to think there are two cards.

This GPU does have dual output connectors, and (I believe) the display I am using is the Analog output.
Perhaps that is why.

So it is very likely the Analog output is the reason for the number 1 being used. Strictly speaking, the configuration file /etc/X11/xorg.conf has been deprecated. It should be left blank (for compatibility) and your settings moved to the folder /etc/X11/xorg.conf.d, where separate files for monitor and keyboard exist. I found one good read here; though it is not specifically for openSUSE, the examples should be the same.

https://wiki.archlinux.org/index.php/Xorg#Monitor_settings

It is worth a look. Just backup your present file and move the settings over and see what you get.

Thank You,

This file means that screen 0 is currently used and locks this screen, so that it doesn’t get used by another X session. If you start another X session, whether in fullscreen (on a page on your desktop) or in a window (for example with Xnest or Xephyr), or if you switch to another tty and start another X session from there, you will get a /tmp/.X1-lock file, etc. It has nothing to do with video cards and monitors.
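
A quick way to see what is actually holding a given number: the lock file stores the PID of the X server that created it, so you can read it back and look the process up. This is only a minimal sketch, not part of lnvhw or lsx. The function name `x_lock_owner` and its optional directory argument are made up for illustration, and it relies on the Linux /proc filesystem.

```shell
# x_lock_owner N [DIR]
# Report which process, if any, owns the lock file for X display :N.
# DIR defaults to /tmp; the parameter exists only so the sketch can be
# tried against a scratch directory. Linux-specific: relies on /proc.
x_lock_owner() {
    n=$1
    dir=${2:-/tmp}
    lock="$dir/.X$n-lock"

    if [ ! -f "$lock" ]; then
        echo "display :$n has no lock file"
        return 1
    fi

    # The lock file holds the server PID as a space-padded decimal number.
    pid=$(tr -d ' \n' < "$lock")
    case $pid in
        ''|*[!0-9]*)
            echo "display :$n: lock file does not contain a PID"
            return 2
            ;;
    esac

    if [ -d "/proc/$pid" ]; then
        # cmdline is NUL-separated; turn the separators into spaces.
        cmd=$(tr '\0' ' ' < "/proc/$pid/cmdline")
        echo "display :$n is held by pid $pid: $cmd"
    else
        echo "display :$n has a stale lock (pid $pid is not running)"
        return 2
    fi
}
```

Running `x_lock_owner 1` before deleting anything would show whether /tmp/.X1-lock belongs to a live server (whatever kind of X server that turns out to be) or is just a leftover.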

Because the X server wasn’t reset properly (for some reason) and didn’t remove the lock file. It is safe to remove leftover .Xn-lock files when X is not running. But how come you start numbering screens at 1? Do you have a /tmp/.X0-lock? How do you start X (xdm, gdm, kdm, startx … something else?)

Why should it? The lock file doesn’t get removed when you start something (or enter a runlevel) but when you exit something (more precisely an X session in case of /tmp/.X?-lock files).
Sorry, I read the posts one by one.

It doesn’t think there are two cards. It just believes another screen is being used.
I’ll try to give you an example - it will use a script called ‘lsx’ to list X sessions.

I first start kdm from the command line. Then I switch to a tty (using CTRL-ALT-Fn) and start another X session with startx.
It gives me the following:

 # lsx

    :0  tty7    6845     0  Ss+ 14:02:54  /usr/bin/Xorg -br :0 vt7 -auth /var/lib/xdm/authdir/authfiles/A:0-QGL4gc
        =>      6850     0  S   14:02:55  -:0       

    :1  tty2    6920     0  S<s+14:06:40  /usr/bin/Xorg :1 -auth /root/.serverauth.6879
        =>      6925     0  Ss  14:06:40  sh /etc/X11/xinit/xinitrc
        =>      7019     0  S   14:06:40  icewm-session

 # find /tmp  -name ".X?-lock" -ls
    13    4 -r--r--r--   1 root     root           11 Sep  3 14:02 /tmp/.X0-lock
    18    4 -r--r--r--   1 root     root           11 Sep  3 14:06 /tmp/.X1-lock


There are 2 X sessions on tty7 and tty2 and two lock files .X0-lock and .X1-lock.

Now, after starting another X session in a window, I get this (you can see the command I used in the lsx output):


# lsx
    :0  tty7    6845     0  Ss+ 14:02:53  /usr/bin/Xorg -br :0 vt7 -auth /var/lib/xdm/authdir/authfiles/A:0-QGL4gc
        =>      6850     0  S   14:02:54  -:0       

    :1  tty2    9907     0  Ss+ 14:18:50  /usr/bin/Xorg :1 -auth /home/openSUSE/agnelo/.serverauth.9865
        =>      9912  1001  Ss  14:18:51  sh /etc/X11/xinit/xinitrc
        =>     10003  1001  S   14:18:51  icewm-session

    :2  pts/2  12084  1001  S+  14:26:42  Xephyr -ac -mouse ephyr -keybd ephyr -screen 1575x1018 -dpi 96 -br -noreset -once :2
        =>     12085  1001  S+  14:26:42  ck-launch-session sawfish

# find /tmp -name ".X?-lock" -ls
    13    4 -r--r--r--   1 root     root            11 Sep  3 14:02 /tmp/.X0-lock
    27    4 -r--r--r--   1 agnelo   tournesol       11 Sep  3 14:26 /tmp/.X2-lock
    14    4 -r--r--r--   1 root     tournesol       11 Sep  3 14:18 /tmp/.X1-lock

The new session is on pts/2 (not a tty but a window) and uses screen :2 (which is locked by /tmp/.X2-lock).

Well, let me X-query to another computer and …

# lsx

    :0  tty7    6845     0  Ss+ 14:02:53  /usr/bin/Xorg -br :0 vt7 -auth /var/lib/xdm/authdir/authfiles/A:0-QGL4gc
        =>      6850     0  S   14:02:54  -:0       

    :1  tty2    9907     0  Ss+ 14:18:50  /usr/bin/Xorg :1 -auth /home/openSUSE/agnelo/.serverauth.9865
        =>      9912  1001  Ss  14:18:51  sh /etc/X11/xinit/xinitrc
        =>     10003  1001  S   14:18:51  icewm-session

    :2  pts/2  12084  1001  S+  14:26:42  Xephyr -ac -mouse ephyr -keybd ephyr -screen 1575x1018 -dpi 96 -br -noreset -once :2
        =>     12085  1001  S+  14:26:42  ck-launch-session sawfish

    :3  pts/4  16453  1001  S+  14:36:09  Xephyr -query 192.168.101.18 -from 192.168.101.9 -screen 1575x1018 -dpi 96 -br -noreset -once :3
        =>                                ck-launch-session sawfish

# find /tmp -name ".X?-lock" -ls
    13    4 -r--r--r--   1 root     root            11 Sep  3 14:02 /tmp/.X0-lock
    27    4 -r--r--r--   1 agnelo   tournesol       11 Sep  3 14:26 /tmp/.X2-lock
    14    4 -r--r--r--   1 root     tournesol       11 Sep  3 14:18 /tmp/.X1-lock
    29    4 -r--r--r--   1 agnelo   tournesol       11 Sep  3 14:36 /tmp/.X3-lock

Now I have 4 X sessions running, 4 screens (:0, :1, :2, :3) and 4 lock files.

Do you still believe that? ;)

My bad, after reading pta’s info I realized that the X session running is most likely my VNC server, serving up on display :1.
Doh!

Also note that removing the X lock file might lead the X server to connect to an orphan socket. So if you intend to start an X session without rebooting, you must delete the dead socket.

Example:

# lsx

    :0  tty7   27307  root          Ss+ 12:43:28  /usr/bin/X -br :0 vt7 -nolisten tcp -auth /var/run/xauth/A:0-gWJJDa
        =>     27313  root          S   12:43:29  kdm       
        =>     27336  agnelo        Ss  12:43:49  /usr/local/config/kdm/Xsession icewm-session

    :1  pts/1  20138  agnelo        S+  13:44:08  Xephyr -ac -mouse ephyr -keybd ephyr -screen 1575x1018 -dpi 96 -br -noreset -once :1
        =>     20139  agnelo        S+  13:44:08  ck-launch-session fvwm2

    :2  pts/6  20686  agnelo        S+  13:44:44  Xephyr -query neelix -screen 1575x1018 -dpi 96 -br -noreset -once :2
        =>                                        ck-launch-session fvwm2

    :3  pts/7  14513  root          S+  14:31:44  Xephyr -ac -mouse ephyr -keybd ephyr -fullscreen -name XEPHYR -dpi 96 -br -noreset -once :3
        =>     14514  root          S+  14:31:44  ck-launch-session ctwm

    :4  pts/12 15401  agnelo        S+  14:32:25  Xephyr -query neelix -fullscreen -name XEPHYR -dpi 96 -br -noreset -once :4
        =>                                        ck-launch-session ctwm

    :5  tty8   17143  root          Ss+ 14:34:31  X :5 -query neelix
        =>                                        ck-launch-session ctwm
    **:6**  -      18338     -  X   -         no process

So I removed /tmp/.X6-lock and it’s gone:

# find /tmp -name ".X?-lock"
/tmp/.X0-lock
/tmp/.X2-lock
/tmp/.X4-lock
/tmp/.X1-lock
/tmp/.X3-lock
/tmp/.X5-lock

However the socket is still there:

# find /tmp/.X11-unix/ -type s    
/tmp/.X11-unix/X1
/tmp/.X11-unix/X3
/tmp/.X11-unix/X4
/tmp/.X11-unix/X2
/tmp/.X11-unix/**X6**
/tmp/.X11-unix/X0
/tmp/.X11-unix/X5

And starting the next X session will fail with this error:

_XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
_XSERVTransMakeAllCOTSServerListeners: server already running

Fatal server error:
Cannot establish any listening sockets - Make sure an X server isn't already running

Deleting /tmp/.X11-unix/X6 will solve this problem.
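
The two removals can be done together so that the socket is not forgotten. The following is a rough sketch under a few assumptions: the name `x_clean_stale` and its optional directory argument are hypothetical, it needs Linux /proc to test whether the recorded PID is alive, and it can only protect you when the lock file still exists. If the lock file has already been deleted by hand, the function cannot tell whether a server is still using the socket, so check that yourself first.

```shell
# x_clean_stale N [DIR]
# Remove the lock file and the socket of display :N, but only if the
# PID recorded in the lock file is no longer running.
# DIR defaults to /tmp. Linux-specific: relies on /proc.
# Caveat: with no lock file present there is nothing to check against,
# so the socket is removed unconditionally.
x_clean_stale() {
    n=$1
    dir=${2:-/tmp}
    lock="$dir/.X$n-lock"
    sock="$dir/.X11-unix/X$n"

    if [ -f "$lock" ]; then
        pid=$(tr -d ' \n' < "$lock")
        case $pid in
            ''|*[!0-9]*)
                ;;  # no usable PID in the file: treat the lock as stale
            *)
                if [ -d "/proc/$pid" ]; then
                    echo "display :$n is still in use by pid $pid, leaving it alone"
                    return 1
                fi
                ;;
        esac
    fi

    # Remove both leftovers; -f keeps rm quiet if one is already gone.
    rm -f "$lock" "$sock"
    echo "display :$n: removed stale lock and socket"
}
```

For the case above that would be `x_clean_stale 6`, run as root.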

Thanks for the extra follow on detail.
I suspected there could be side effects.

Since I usually reboot following lnvhw, I was not caught by them.

I also learned of a new command: lsx!

Well … This command is unfortunately not available since I haven’t released this script yet. It’s a complicated script I actually wrote to help me debug an even more complicated one. Maybe I’ll post them one day or create a package for openSUSE, but there is still a lot of work to do.
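
In the meantime, a very reduced stand-in can be built from the lock files alone. To be clear, this is not the lsx script and shows far less than it does (no tty, no child sessions). It is only a sketch that assumes Linux /proc, and the function name `list_x_displays` is invented here.

```shell
# list_x_displays [DIR]
# For every .X<N>-lock file in DIR (default /tmp), print the display
# number, the PID recorded in the file and that process's command line,
# or "no process" if the PID is not running. Linux-specific (/proc).
list_x_displays() {
    dir=${1:-/tmp}
    for lock in "$dir"/.X*-lock; do
        [ -f "$lock" ] || continue      # the glob matched nothing
        n=${lock##*/.X}                 # drop the directory and ".X"
        n=${n%-lock}                    # drop the "-lock" suffix
        pid=$(tr -d ' \n' < "$lock")
        case $pid in
            ''|*[!0-9]*) pid='?' ;;     # lock file without a usable PID
        esac
        if [ "$pid" != '?' ] && [ -d "/proc/$pid" ]; then
            cmd=$(tr '\0' ' ' < "/proc/$pid/cmdline")
            printf ':%s  %s  %s\n' "$n" "$pid" "$cmd"
        else
            printf ':%s  %s  no process\n' "$n" "$pid"
        fi
    done
}
```

Calling `list_x_displays` with no argument walks /tmp, which is where the lock files in this thread live.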