
Thread: Corrupt root partition

  1. #1

    Corrupt root partition

    I'm running openSUSE 11.2 64-bit for the moment on my desktop. I've been having trouble getting 11.4 to download with good checksums. In an attempt to get that file, I downloaded the 64-bit DVD via aria2c overnight. I ran it via sudo, as the only partition large enough to hold it was /. This morning all looked good: the SHA1 checksum matched. I wanted to check the MD5 as well, just to be safe, so I downloaded the MD5 checksum file from here and saved it in /home/<user>/Downloads/. The first couple of times I tried, I clicked on the link in Firefox and did "File | Save As..." and nothing happened; I never got the dialog asking where to save it (I have it set to always ask). I then right-clicked on the link and selected "Save link as...", which let me save it.

    I switched to root in the terminal and mv'd the file to / so it would be in the same directory as the ISO. I then cd'd to / to run the MD5 check and did a quick "ls -lia". It gave me an error saying something like "/usr/bin ls not a valid file or directory" (or something very similar to that). I tried opening the KDE file manager (forget what it's called), but it didn't open. As things were acting weird at that point, I figured some process was hung, so I'd reboot. I exited the root shell and the terminal, shut down all apps and tried to shut down via the KDE menu. It started to shut down, but then dropped to the console and came up with an error. I don't remember what it was; it was too long to write down. I tried accessing the console various ways, but only got that error, so I had to do a hard reset. When it booted, I got the following:
    Code:
    doing fast boot
    Creating device nodes with udev
    Trying manual resume from /dev/disk/by-id/ata-ST3160023AS_blah_blah-part9
    Invoking userspace resume from /dev/disk/by-id/ata-<same as above>-part9
    resume: libgcrypt version 1.4.4
    Trying manual resume from /dev/disk/by-id/<same as above>-part9
    Invoking in-kernel resume from /dev/disk/by-id/<same as above>part9
    Waiting for device /dev/disk/by-id/<same as above>-part7 to appear:  ok
    fsck from util-linux-ng 2.16
    [sbin/fsck.ext4 (1) -- /] fsck.ext4 -a /dev/sda7
    /dev/sda7: clean, 8118/3474800 files, 1486910/13894209 blocks
    fsck succeeded.  Mounting root device read-write.
    Mounting root /dev/disk/by-id/<same as above>-part7
    mount -o rw,acl,user_xattr -t ext4 /dev/disk/by-id/<same as above>-part7 /root
    No init found.  Try passing init= option to the kernel.
    umount:  /dev: device is busy.
                  (In some cases useful info about processes that use
                    the device is found by lsof(8) or fuser(1))
    [        3.269491] Kernel panic - not syncing: Attempted to kill init!
    I rebooted again with the install disk and ran the repair tools. The first time, I got messages saying that sda2, sda4, and sda7 were corrupt with the option to repair. When I clicked on "repair" it didn't appear to do anything, but the message popped right back up. I hit the repair button several times on each drive (10 or so times), then hit "skip". At the end, I got two errors saying no root partition was found.

    I rebooted and ran it again. This time, it only gave me the corruption error on sda7 (root, of course). I again hit "repair" several times, again got the two "root not found" errors, and rebooted to the recovery system. I then manually ran fsck on all partitions. They all seemed fine. sda7 took a while, and it did say it was repairing. It succeeded. Reboot. Same error as above (in the code block). If I try it in "failsafe" mode, same thing (not really a surprise).

    I also tried running the partition manager from the repair utilities. It sees each of my partitions, but does not see the mount points. It will not let me specify the mount points unless I choose to format the drive. Obviously, I don't want to do that.

    So I'm pretty sure my root partition is corrupt. What are my options? Is there anything I can do to recover that partition? I was thinking about doing a re-install and just not formatting my /home and /usr partitions. However, since the install utility doesn't recognize the partition mount points, I don't think I can do that. Is there a way around that?

    I don't have a problem re-installing my entire system, but I don't want to lose what's in /home and /usr (each are separate partitions on the same disk). I've got data in /usr, and documents I don't want to lose in /home.

    Do you think it could be a hardware issue? I'm thinking about running Spinrite 6.0 on the disk, but that takes forever, so I can't try anything else while that's running.
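
    On the checksum step described at the top of this post: a minimal Python sketch, not what was actually run here, that computes MD5 and SHA1 in a single pass so a large ISO only has to be read once. The function names are made up for illustration, and the expected digests would come from the checksum files published alongside the ISO.

```python
import hashlib


def file_digests(path, algorithms=("md5", "sha1"), chunk_size=1 << 20):
    """Return {algorithm: hex digest} for the file, reading it only once."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        # Read in chunks so a multi-gigabyte ISO never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches(path, expected):
    """Compare computed digests with published ones, ignoring letter case."""
    actual = file_digests(path, algorithms=tuple(expected))
    return {
        name: actual[name] == value.strip().lower()
        for name, value in expected.items()
    }
```

    Calling matches() with the ISO path and a dict of published digests keyed by "md5" and "sha1" returns one True or False per algorithm, so a mismatch in either shows up at once.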
    Last edited by Yippee38; 30-Mar-2011 at 14:31. Reason: corrections for clarity

  2. #2
    Join Date
    Nov 2009
    Location
    West Virginia Sector 13
    Posts
    16,285

    Re: Corrupt root partition

    When a disk is repaired, the fsck program sometimes cannot piece everything back together, so anything it does not know what to do with ends up in lost+found. If those are binaries, it may be impossible to actually splice them back together. So if root has had serious corruption, about the only thing to do is reinstall. The installer does not know what the mount points are; it only gives a suggestion if it can. You need to go to expert mode to set the mount points and to choose whether each partition gets formatted and, if so, how.

    This may be an indication of a failing disk. Since you have Spinrite, I'd run it, even if it does take all night. Also, I'd boot from a live CD and back up all important stuff before doing anything else.

  3. #3

    Re: Corrupt root partition

    Quote: Originally Posted by gogalthorp
    When a disk is repaired, the fsck program sometimes cannot piece everything back together, so anything it does not know what to do with ends up in lost+found. If those are binaries, it may be impossible to actually splice them back together. So if root has had serious corruption, about the only thing to do is reinstall. The installer does not know what the mount points are; it only gives a suggestion if it can. You need to go to expert mode to set the mount points and to choose whether each partition gets formatted and, if so, how.
    How can I get to the lost+found directory? I mounted the partition in the recovery system. I can see the directories on it, and I can cd into lost+found. It appears empty, though.

  4. #4
    Join Date
    Nov 2009
    Location
    West Virginia Sector 13
    Posts
    16,285

    Re: Corrupt root partition

    Well maybe it is clean. But it does look like something was lost or you would be able to boot.

    Since you can mount it, maybe poke around a bit and see if all looks well. I'd look at /boot/grub first and check the menu.lst file. From there, I'd use the menu.lst references to see if the kernel is still there, etc.
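
    The menu.lst check can also be scripted from the rescue system. The following is a rough Python sketch under some assumptions: the root partition is mounted at a path you pass in, /boot lives on that same partition, and the kernel and initrd lines follow the usual GRUB legacy layout. The function names are hypothetical.

```python
import os
import re

# Matches GRUB legacy "kernel" and "initrd" lines, with an optional
# device prefix such as (hd0,6) in front of the path.
ENTRY = re.compile(r"^\s*(kernel|initrd)\s+(?:\([^)]*\))?(/\S+)", re.IGNORECASE)


def referenced_files(menu_lst_text):
    """Return the kernel and initrd paths named in a menu.lst, in order."""
    paths = []
    for line in menu_lst_text.splitlines():
        match = ENTRY.match(line)
        if match and match.group(2) not in paths:
            paths.append(match.group(2))
    return paths


def missing_files(mount_point, menu_lst_text):
    """List referenced paths that do not exist under the mounted root."""
    return [
        path
        for path in referenced_files(menu_lst_text)
        if not os.path.exists(os.path.join(mount_point, path.lstrip("/")))
    ]
```

    An empty list from missing_files() only says the files named in menu.lst are present; it says nothing about whether the rest of the root filesystem is intact.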

    Do you get any message when you try to boot?

  5. #5

    Re: Corrupt root partition

    Quote: Originally Posted by gogalthorp
    Well maybe it is clean. But it does look like something was lost or you would be able to boot.

    Since you can mount it, maybe poke around a bit and see if all looks well. I'd look at /boot/grub first and check the menu.lst file. From there, I'd use the menu.lst references to see if the kernel is still there, etc.

    Do you get any message when you try to boot?
    I only get the error in the code block above.

    Kernel is still there.

    I do see something odd though. When I boot into recovery system, I see the following directories when I ls in root:
    bin
    boot
    dev
    etc
    home
    lib
    lib64
    media
    mnt
    mounts
    parts
    proc
    root
    sbin
    sys
    tmp
    usr
    var.

    However, when I switch to my recently mounted root partition, I only see the following:
    boot
    dev
    home
    lost+found
    opt
    proc
    sys
    usr
    var
    windows.

    There's no /etc. Even weirder. I did a find on /etc and it shows up in my windows directory. I did a quick cat on fstab located there and it appears to be the correct file. It's almost like it got moved to another directory; as if fsck recovered it to the wrong place. In fact, if I compare what's in the "windows" directory to what's in my root directory all of the directories in the recovery system root are there except for "mounts" and "parts".

    The "windows" directory is just a directory which holds the directories where I mount my XP partitions (read-only, of course). So besides those WinXP directories, "windows" now holds:
    bin
    etc
    lib
    lib64
    lost+found
    media
    mnt
    root
    sbin
    selinux
    srv
    tmp.
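
    The comparison above was done by eye with ls and find. A small Python sketch of the same idea, with names and directory lists that are assumptions for illustration only: it reports which expected top-level directories are missing from the mounted root and whether a directory of the same name sits one level down.

```python
import os

# Top-level directories expected on the root filesystem. This list is an
# assumption taken from the listings in this post, not a complete standard.
EXPECTED = ("bin", "etc", "lib", "lib64", "root", "sbin", "tmp")

# Parents that legitimately contain bin, lib, tmp and so on (assumption).
SKIP = ("usr", "var", "proc", "sys", "dev", "home")


def locate_misplaced(mount_point, expected=EXPECTED, skip=SKIP):
    """Map each missing top-level directory to candidates one level down."""
    parents = [p for p in sorted(os.listdir(mount_point)) if p not in skip]
    found = {}
    for name in expected:
        if os.path.isdir(os.path.join(mount_point, name)):
            continue  # present where it belongs
        found[name] = [
            os.path.join(mount_point, parent, name)
            for parent in parents
            if os.path.isdir(os.path.join(mount_point, parent, name))
        ]
    return found
```

    A missing name with an empty candidate list would mean the directory is not one level down either, which would call for a deeper search.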

    BTW, that lost+found is also empty. All of them are. Is it possible that there is a permission thing that is keeping me from seeing them? It seems odd that linux loses all security as soon as I have an install disk.

  6. #6
    Join Date
    Nov 2009
    Location
    West Virginia Sector 13
    Posts
    16,285

    Re: Corrupt root partition

    Looks like fsck moved them to the wrong place. Things must have been seriously damaged. For me, I'd reinstall. You might want to preserve the /etc contents for reference, since it may contain files that affect the /usr binaries. I'd also check that drive.

  7. #7

    Re: Corrupt root partition

    Quote: Originally Posted by gogalthorp
    Looks like fsck moved them to the wrong place. Things must have been seriously damaged. For me, I'd reinstall. You might want to preserve the /etc contents for reference, since it may contain files that affect the /usr binaries. I'd also check that drive.
    I moved all of those directories back to root, ran the repair utility and now it's booting up normally. I will probably do a fresh install of 11.4 (if I can ever get a good download). But at least now I can backup my data. I looked through a bunch of the log files and I did see SMART - "prefailure" messages. However, they were on my WinXP drive. I'm in the process of replacing that drive, so that's not catastrophic.
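
    The post does not say how the directories were moved (presumably with mv from the rescue shell). As a hedged illustration only, here is a Python sketch of the same operation with a dry-run mode, so the planned moves can be reviewed before anything is touched. The function name and arguments are hypothetical.

```python
import os
import shutil


def move_back(mount_point, misplaced_parent, names, dry_run=True):
    """Move the named directories from misplaced_parent up to the mount point.

    Returns the (source, target) pairs that were, or would be, moved.
    """
    planned = []
    for name in names:
        source = os.path.join(mount_point, misplaced_parent, name)
        target = os.path.join(mount_point, name)
        if not os.path.isdir(source):
            continue  # nothing to move for this name
        if os.path.lexists(target):
            continue  # never overwrite something already at the top level
        planned.append((source, target))
        if not dry_run:
            shutil.move(source, target)
    return planned
```

    Running it once with dry_run=True and reading the returned list, then again with dry_run=False, keeps a typo in the parent name from doing any damage.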

    I am also going to run Spinrite on all of my drives, just to be sure everything's cool.

    Thanks so much for your help! You got me on the right track.

  8. #8
    Join Date
    Nov 2009
    Location
    West Virginia Sector 13
    Posts
    16,285

    Re: Corrupt root partition

    Glad to help. That was a tough problem.

  9. #9
    Join Date
    Feb 2009
    Location
    Spain
    Posts
    25,547

    Re: Corrupt root partition

    On 2011-03-30 23:36, Yippee38 wrote:

    (I know you solved the issue, just some comments for next time)

    > I'm running openSUSE 11.2 64-bit for the moment on my desktop. I've
    > been having trouble getting 11.4 to download with good checksums. In an
    > attempt to get that file, I downloaded the 64-bit DVD via aria2c
    > overnight. I did it via sudo as the only partition large enough to hold
    > it was /.


    That's a mistake. You should have created a new directory under /, given
    yourself write permission on it, and run the download there as your normal
    user. Why? Because root can fill a partition completely, whereas a user
    process is stopped before that point, leaving a small reserved margin. I
    suspect your root partition filled up, plus something else I can't identify.
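
    The margin mentioned here is the block reserve that ext2/3/4 filesystems keep for root (5% by default). A minimal Python sketch, assuming a Unix system, that shows the difference between what root and an ordinary user can still write; the function name is made up for illustration.

```python
import os


def space_report(path):
    """Return free space in bytes as seen by root and by ordinary users."""
    stats = os.statvfs(path)
    block = stats.f_frsize
    free_for_root = stats.f_bfree * block    # every free block
    free_for_users = stats.f_bavail * block  # free blocks minus the reserve
    return {
        "total": stats.f_blocks * block,
        "free_for_root": free_for_root,
        "free_for_users": free_for_users,
        "reserved_for_root": free_for_root - free_for_users,
    }
```

    Checking space_report("/") before a large download shows how much an unprivileged download could write before it fails, while a download run as root can keep going into the reserve.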

    Question: what filesystem type are you using?

    > I rebooted again with the install disk and ran the repair tools.


    I don't trust that automated repair tool. I've never been able to do
    anything good with it.

    --
    Cheers / Saludos,

    Carlos E. R.
    (from 11.2 x86_64 "Emerald" at Telcontar)

  10. #10
    Join Date
    Feb 2009
    Location
    Spain
    Posts
    25,547

    Re: Corrupt root partition

    On 2011-03-31 03:36, Yippee38 wrote:
    >
    > I looked through a bunch of the log files and I did see SMART
    > - "prefailure" messages.


    Not necessarily important. It may simply mean that a value changed. Look at
    mine:

    <3.6> 2011-03-31 11:05:10 Telcontar smartd 3240 - - Device: /dev/sdc
    [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 105
    to 106
    <3.6> 2011-03-31 11:05:10 Telcontar smartd 3240 - - Device: /dev/sdc
    [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 39 to 40


    They are trivial, happening all the time. You have to interpret what yours
    says.
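
    To go through a log full of such entries, a rough Python sketch like the following could pull out the attribute changes. It assumes the messages look like the two quoted above; the pattern and function name are illustrative, not part of smartd.

```python
import re

# Pattern for smartd "attribute changed" messages like the ones quoted above.
CHANGE = re.compile(
    r"Device:\s+(?P<device>[^\s,]+)(?:\s+\[[^\]]*\])?,\s+"
    r"SMART\s+(?P<kind>Prefailure|Usage)\s+Attribute:\s+"
    r"(?P<id>\d+)\s+(?P<name>\S+)\s+"
    r"changed\s+from\s+(?P<old>\d+)\s+to\s+(?P<new>\d+)"
)


def attribute_changes(log_text):
    """Extract (device, kind, id, name, old, new) tuples from smartd log text."""
    # Collapse whitespace so entries wrapped across lines (as in a mail or
    # forum quote) still match.
    flat = " ".join(log_text.split())
    return [
        (m["device"], m["kind"], int(m["id"]), m["name"], int(m["old"]), int(m["new"]))
        for m in CHANGE.finditer(flat)
    ]
```

    Grouping the result by attribute name makes it easier to tell a value that drifts by one now and then from one that keeps moving in the same direction.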

    --
    Cheers / Saludos,

    Carlos E. R.
    (from 11.2 x86_64 "Emerald" at Telcontar)
