After a power outage, my server has been unable to boot back into SUSE Linux 11.1.
In normal boot mode (runlevel 5), a GDM error originally stated something along the lines of the log directory “/var/log/gdm” not existing. This error prevented the server from booting. That made some sense, as I had previously cleared out the “/var/log” directory because it had entirely filled its partition. I could not figure out how to force the server to boot without GDM (i.e. to just a command line) so that I could fix the error by creating the “/var/log/gdm” directory. Instead, I used the SUSE live CD to mount the hard drive and create the directory. Now, though, the error has changed and is something along the lines of “could not spawn child process /usr/lib/gdm/gdm-simple-slave” when starting GDM [2627]. I am lost as to how to fix this error. This is a critical server for business use, so it is imperative that I get it back up and running ASAP. I am going to try using the SUSE 11.1 DVD to repair the installation, but any help anyone can provide will be greatly appreciated.
Assuming this method works to get me into the command line interface (I do not have access to the server right now), how would I go about fixing the GDM problem?
I just tried setting the runlevel to 3 to get to the CLI at boot as you described, but it does not seem to have worked. The boot screen just stops after setting up the network interfaces, and it does not report any errors. Nor does it give an option to log in. Is there something I’m missing?
Have you tried chrooting in? TBH I think this is pointing at something else. My initial thought was a dirty flag on the fs (but I thought repair did an fsck, not to mention one on the first boot after). Another possibility is something related to the logs (better to let the system handle them via sysconfig on a reboot than aggressive pruning).
Anyway, I think with a chroot you’ll be able to investigate better, e.g. tune2fs -l to see the fs state, unmount and run fsck, zypper in, zypper verify, startx (it’ll have to be on another display), try to bring up interfaces, etc.
I’m not that experienced with Linux, so please forgive my ignorance. How do I chroot in? Would I not first require command line access to the system (which I cannot yet achieve)? If you could describe the process in a more step-by-step manner, that would be fantastic.
And they let you manage a business-critical server? Tsjk. Not your fault, welcome here.
Does the system boot in runlevel 1? Type “init 1” on the options line. The root password lets you log in. If that works, check the permissions of /var/log/gdm; they might be wrong. To set them to what they should be:
Well yes, but you can get a CLI in a variety of ways: other installed distros, live CDs, install media. You say runlevel 3 doesn’t work; you could try runlevel 1 (though I can’t remember what tools are available in 1).
Once the chroot is set up, iirc the network will be down, and resolv.conf more often than not needs tweaking. If the network is set up fine you should have CLI YaST to assist; otherwise it may involve manually bringing it up.
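For what it’s worth, here is a rough sketch of the usual chroot steps from a live CD. It only prints the commands unless DRY_RUN=0 is set, so they can be read out and checked first. The device /dev/sda2 and the mount point /mnt are placeholders and are not taken from this thread; the real root partition has to be identified first (fdisk -l lists the partitions).

```shell
#!/bin/sh
# Sketch only: print (or run) the usual steps for chrooting into an
# installed system from a live CD. Device and mount point are placeholders.

# run: echo the command in dry-run mode (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# enter_chroot ROOT_DEVICE MOUNT_POINT
enter_chroot() {
    dev=$1
    mnt=$2
    # Mount the installed root filesystem.
    run mount "$dev" "$mnt"
    # Make the live system's device, process and sysfs trees visible inside.
    for d in dev proc sys; do
        run mount --bind "/$d" "$mnt/$d"
    done
    # Start a shell whose root directory is the installed system.
    run chroot "$mnt" /bin/bash
}

# Example (dry run, nothing is changed):
#   enter_chroot /dev/sda2 /mnt
```

Running it with DRY_RUN=0 as root would actually perform the mounts. Inside the chroot, resolv.conf and the network may still need attention, as mentioned above.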
I work for a small business which can’t afford to hire full-time technical staff, but most of the time the server does what it’s supposed to without any hassle.
But anyway, I was able to get the server to boot using runlevel 1 and made sure the permissions on the GDM directory are correct. That has not changed anything, though. I also tried the
zypper in -f gdm
command that caf4926 suggested, but it returns an error saying it cannot find/access the RPM (sorry about the vague errors, I am trying to fix the server remotely through another individual via phone atm).
Also, FeatherMonkey, could you explain what you intended by these commands:
Mmm, those are just examples of investigative commands.
Normally on a hard reset the fs will be marked as dirty (iirc), meaning it needs an fsck. As for the zypper in command, I suspect the failure is related to the lack of network; rcnetwork start may bring it up.
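To make the dirty-flag check a bit more concrete, here is a small sketch. It assumes an ext2/ext3 filesystem on a placeholder device, /dev/sda2 (the actual device and filesystem type are not stated anywhere in this thread). tune2fs -l prints a “Filesystem state:” line, and the helper below just picks that line out of the output.

```shell
#!/bin/sh
# fs_state: read `tune2fs -l` output on stdin and print the value of the
# "Filesystem state:" line (for example "clean" or "not clean").
fs_state() {
    sed -n 's/^Filesystem state:[[:space:]]*//p'
}

# Typical use from a live CD or chroot, as root. The device name is a
# placeholder, and the filesystem should be unmounted before fsck:
#   tune2fs -l /dev/sda2 | fs_state
#   fsck /dev/sda2
#   rcnetwork start        # bring the network up before retrying zypper
```

Anything other than “clean” would be a reason to run fsck before trying further repairs.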
Honestly, I could go on, but this is trying to fix a server via an intermediary, which sounds like a bad way of fixing this to me. Vague messages will cause further problems.
Until you know exactly what the problem is, fixing could be breaking more than it is fixing. I’m not sure what zypper in gdm is going to fix, and tbh you’ve got around that. The last bit of useful info was the boot sequence stalling at/after networking.
When trying to use the SUSE 11.1 DVD to repair the installation, I repeatedly get the error “changing environment to target system was not successful”, and the repair process seems to have no effect on booting the system. What does this error mean?
Also, I had previously created backup configuration files of the stable system. How can I use these files to restore the configuration after a fresh install? The one catch is that these backup files are on a partition on the system (but not the main system partition). Furthermore, if I did a fresh install, would files not on the main system partition be affected?
changing environment to target system was not successful
From a quick google, it would seem to mean what it says really. iirc things like RAID or other complex installs may have an effect. I think this happens when it tries to change to the root of the installed distro.
The fact that you have missing pieces indicates the file system is corrupted. Your floundering attempts to fix things may make things worse. If repair does not run, it indicates that the problem is very serious. The only real solution I see is a new install. I assume you may have mission-critical data somewhere. You should first attempt to back up that data if you don’t already have a backup. If you tell us the types of things you are doing and the programs you run, then we may be able to tell you where to look for the data. Second, check the disk using a scan utility from the maker. If it passes, fine; if not, the disk is damaged and should be replaced.
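A minimal sketch of the “back up first” step, assuming the data lives under one directory and there is somewhere safe to write the archive (an external disk or another machine). The paths in the usage comment are made up for illustration and do not come from this thread.

```shell
#!/bin/sh
# backup_dir SOURCE_DIR ARCHIVE
# Pack SOURCE_DIR into a gzip-compressed tar archive, then list the archive
# once to confirm it can be read back.
backup_dir() {
    src=$1
    archive=$2
    # Refuse to continue if the source directory is missing.
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    # -C changes into the source first so paths in the archive are relative.
    tar -czf "$archive" -C "$src" . || return 1
    # Read the archive back; a failure here means the backup is not usable.
    tar -tzf "$archive" >/dev/null
}

# Example, run from a live CD with the old disk mounted read-only on /mnt
# and an external disk on /media/usb (both placeholders):
#   backup_dir /mnt/srv /media/usb/srv-backup.tar.gz
```

Writing the archive to a different physical disk matters here, since the old disk itself is suspect.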