fixing superblock: running e2fsck on a mounted filesystem

My question is: why does ‘e2fsck -b 32768 <device>’ not work? Here is a blow-by-blow description of my problem and what I did to try to fix it:

Without any known cause, my system started dropping into filesystem repair during startup:

error on stat() /dev/disk/by-id/scsi-200d04b341e047a1c-part3: No such file or directory
fsck.ext3: No such file or directory while trying to open /dev/disk/by-id/scsi-200d04b341e047a1c-part3
bootsplash: status on console 0 changed to on
The superblock could not be read or does not describe a correct ext2 filesystem.

If I use Control-C at the prompt

Give root password for login:

Linux starts and I encounter no obvious problems while using it.
However, I would like to fix things properly, so following the advice at ‘Disk repair alternate superblocks - openSUSE’ (http://en.opensuse.org/Disk_repair_alternate_superblocks), I logged in at that prompt and then typed

dumpe2fs /dev/sda3 | grep Backup

to discover alternate superblocks. (I used ‘sda3’ because I have a dual-boot system on a Mac in which sda1 is the EFI Bootloader, sda2 is the MacOS X, sda3 is openSUSE, and sda4 is the Linux Swap.)
The first line was
Backup superblock at 32768, Group descriptors at 32769-32775
I then typed

e2fsck -b 32768 /dev/sda3

but the system responded

/dev/sda3 is mounted
WARNING!!! Running e2fsck on a mounted filesystem may cause SEVERE filesystem damage.

So I unmounted sda3 with

umount -f /dev/sda3

and continued with ‘e2fsck’ again.
I was told

/dev/sda3 was not cleanly unmounted, check forced.

It proceeded anyway. When it was done with Pass 5, it showed multiple instances of

Free blocks count wrong for group #n

and multiple instances of

Free inodes count wrong for group #n

which I approved, and finally it was done.

I restarted the system at that point, but during startup I was put back at filesystem repair with the same complaint from fsck.

What can I do to repair the superblock on that partition? Why does the system complain when the OS starts up anyway and I can’t detect problems with the OS?
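
One way to avoid the "is mounted" warning above is to look at the kernel's mount table before starting e2fsck. What follows is only a rough sketch: the function name `is_mounted` is made up for illustration, and it assumes the device shows up in `/proc/mounts` under exactly the path you pass in. The root filesystem is sometimes listed as `/dev/root` or under a by-id name instead, which this simple comparison would miss.

```shell
#!/bin/sh
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds (exit status 0) if DEVICE appears in the first column of the
# mount table. MOUNTS_FILE defaults to /proc/mounts; the second argument
# exists only so the function can be tried against a sample file.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Hypothetical usage: only check the filesystem when it is not mounted.
# if is_mounted /dev/sda3; then
#     echo "/dev/sda3 is mounted; use a rescue system instead" >&2
# else
#     e2fsck -f /dev/sda3
# fi
```

The usage lines are commented out on purpose, so pasting the block does not touch any disk.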

On Tue, 19 Aug 2008 01:26:13 GMT
fogelfish <fogelfish@no-mx.forums.opensuse.org> wrote:

> > Give root password for login:
> Linux starts and I encounter no obvious problems while using it.
> However, I would like to fix things properly, so following advice at
> ‘Disk repair alternate superblocks - openSUSE’
> (http://en.opensuse.org/Disk_repair_alternate_superblocks) I logged in
> at that prompt and then typed
> > dumpe2fs /dev/sda3 | grep Backup
> to discover alternate superblocks. (I used ‘sda3’ because I have a
> dual-boot system on a Mac in which sda1 is the EFI Bootloader, sda2 is
> the MacOS X, sda3 is openSUSE, and sda4 is the Linux Swap.)
> The first line was
> Backup superblock at 32768, Group descriptors at 32769-32775
> I then typed
> > e2fsck -b 32768 /dev/sda3
> but the system responded
> > /dev/sda3 is mounted
> > WARNING!!! Running e2fsck on a mounted filesystem may cause SEVERE
> > filesystem damage.
> So I unmounted sda3 with
> > umount -f /dev/sda3
> and continued with ‘e2fsck’ again.
> I was told
> > /dev/sda3 was not cleanly unmounted, check forced.
> It proceeded anyway. When it was done with Pass 5, it showed multiple
> instances of
> > Free blocks count wrong for group #n
> and multiple instances of
> > Free inodes count wrong for group #n
> which I approved, and finally it was done.
>
> I restarted the system at that point, but during startup I was put back
> at filesystem repair with the same complaint from fsck.
>
> What can I do to repair the superblock on that partition? Why does the
> system complain when the OS starts up anyway and I can’t detect problems
> with the OS?
>
>

Oh my! You’re going to be a problem aren’t you? {Grin} you keep showing up
with new stuff… fun!

You mentioned that you “pressed CTRL-C” and used linux with no obvious
problems… You’re talking about using the system? logged in as you,
surfing the web and such?

And THEN you wanted to fix the filesystem?

You cannot just umount the root filesystem, not unless you’re running a
rescue system, or you’re using the ‘mini-system’ provided by the error
console you get when there’s an error on the disk (when you pressed CTRL-C)

At that prompt, you give the root password, then:

e2fsck -b 32768 /dev/sda3

Which should scan and check the partition for errors. Answer the prompts,
allowing e2fsck to fix the filesystem.

Once it finishes, I’d immediately run it again, but as:

e2fsck /dev/sda3

which would make sure the master superblock was usable.
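
If the first backup superblock turns out to be unusable too, the other locations reported by ‘dumpe2fs /dev/sda3 | grep Backup’ can be tried the same way. A small sketch for pulling just the block numbers out of that output; the helper name `backup_superblocks` is invented here, and it assumes lines of the form "Backup superblock at 32768, Group descriptors at ..." as shown in the first post:

```shell
#!/bin/sh
# backup_superblocks: read dumpe2fs output on stdin and print one backup
# superblock location per line (the number after "Backup superblock at").
backup_superblocks() {
    awk '/Backup superblock at/ { sub(/,$/, "", $4); print $4 }'
}

# Hypothetical usage from a rescue system, with the filesystem unmounted:
# dumpe2fs /dev/sda3 2>/dev/null | backup_superblocks
```

Each number printed is a candidate value for the ‘-b’ option of e2fsck.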

If you MUST unmount the root filesystem while it’s running, you’ll want to
shift to runlevel 1 (one) to stop any other programs from trying to access and
write to the hard drive. Then type:

mount -o remount,ro /

to remount the root filesystem as read only, then you need to add the
‘-f’ (force) option to the e2fsck command since the partition is still
technically mounted.

Issue the ‘reboot’ command to restart, and ignore any errors if they
are related to the fact that the drive is no longer writable at that time.

If you did use the system ‘normally’ (able to surf the web, email, etc)
before fixing the problem… I’d smack your hands if I could… BAD
fogelfish! Bad! Running the system ‘normally’ after a superblock error can
be disastrous and fatal to your data. Yes, it worked THIS TIME.

You can also issue the e2fsck -f /dev/sda3 from the rescue disk which is
recommended. Works very nicely from there, with no complications from being
on the root drive. just issue ‘reboot’ from the ‘mini-system’ prompt after
having found your rescue DVD/CD.

Loni


L R Nix
lornix@lornix.com

New stuff finds me. Really.

I was in the ‘mini-system’ provided by the error console when I unmounted /dev/sda3. Not bad. But I also did use the OS quite a bit after seeing the bad superblock message. Bad.

But starting again with your directions, I went into the rescue system on the DVD and typed ‘e2fsck -f /dev/sda3’. All the checks completed with no alerts, no warnings, no errors.

Puzzling! I thought that partition had a problem. So I restarted normally. I was sent right back to the ‘repair filesystem’ prompt after the same system complaint.

Based on what I thought you wrote, I went back into the rescue system to run the ‘e2fsck -b 32768 /dev/sda3’ command and got this response:

e2fsck: Device or resource busy while trying to open /dev/sda3
Filesystem mounted or opened exclusively by another program?

I don’t understand how that partition could be mounted or opened if I started from the DVD. That bodes another problem, doesn’t it?

At the filesystem repair prompt during startup I used “dmesg | more” and found among the output

EXT3 FS on sda3, internal journal
EXT3-fs: mounted filesystem with ordered data mode

Does this indicate that sda3 is okay?
I seem to be getting mixed messages about the condition of that partition.
Can I use

mount -o sb=n /dev/hda1 /usr

where n = 4 * filesystem block size * logic position of alternate superblock as another way to fix the supposedly corrupt superblock?
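
A note on the arithmetic, going by the mount(8) man page: the ‘sb=n’ option counts in 1 KiB units, so n is the backup superblock's block number multiplied by (block size / 1024), not by 4 times the block size. The man page's own example is logical block 32768 on a filesystem with 4k blocks, which gives sb=131072. As far as I know, this only tells mount which superblock to read for that one mount and does not repair the primary one. A tiny sketch of the calculation (the function name `sb_value` is made up for illustration):

```shell
#!/bin/sh
# sb_value BLOCK_NUMBER BLOCK_SIZE
# Convert a backup superblock location (in filesystem blocks, as dumpe2fs
# reports it) into the 1 KiB units that mount's sb= option expects.
sb_value() {
    echo $(( $1 * $2 / 1024 ))
}

# Backup superblock 32768 on a filesystem with 4096-byte blocks.
sb_value 32768 4096   # prints 131072
```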

On Tue, 19 Aug 2008 19:06:03 GMT
fogelfish <fogelfish@no-mx.forums.opensuse.org> wrote:

>
> At the filesystem repair prompt during startup I used “dmesg | more” and
> found among the output
> > EXT3 FS on sda3, internal journal
> > EXT3-fs: mounted filesystem with ordered data mode
> Does this indicate that sda3 is okay?
> I seem to be getting mixed messages about the condition of that
> partition.
> Can I use
> > mount -o sb=n /dev/hda1 /usr
> where n = 4 * filesystem block size * logic position of alternate
> superblock as another way to fix the supposedly corrupt superblock?
>
>

Hmmm, please can you post the output of

fdisk -l

for me?

e2fsck -f -v /dev/sda3

… force check, verbose… should fix what ails you, of course

e2fsck -b 32768 -f -v /dev/sda3

might be needed if it complains that the superblock is still bad.

When it ‘fixes’ the partition, do you exit/CTRL-D to finish and reboot?

Please be aware that using the rescue disk holds a small chance of confusing
you. I replaced a 160GB drive with a 320GB drive in my monster machine last
night, encountered an error, and the Rescue disk brought up my drives in
reverse order… I load SATA, then the PATA/IDE drivers in the OS, while the
Rescue CD loaded them as PATA/ide, then SATA… during which, sda became sdb
and vice versa. You may not have this problem though, depending on
equipment installed. (This system has 7 HD’s, 4 CD/DVD’s)
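
To double-check which kernel name (sda, sdb, ...) a given drive received in the rescue system, the persistent names under /dev/disk/by-id can be resolved to the device nodes they currently point at. A minimal sketch, assuming a udev-style /dev/disk/by-id directory and a `readlink` that supports ‘-f’; the function name `list_by_id` is hypothetical:

```shell
#!/bin/sh
# list_by_id [DIR]
# Print each persistent name in DIR (default /dev/disk/by-id) together with
# the device node it currently resolves to.
list_by_id() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        [ -L "$link" ] || continue      # skip anything that is not a symlink
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}

# Hypothetical usage:
# list_by_id
```

The same listing also shows at a glance whether a by-id name used elsewhere (for example in /etc/fstab) exists at all on the running system.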

I’m thinking…

Loni


L R Nix
lornix@lornix.com

Here’s the output from “fdisk -l” which was done in the DVD rescue system:

WARNING: GPT (GUID Partition Table) detected on ‘/dev/sda’! The util fdisk doesn’t support GPT. Use GNU Parted.

Disk /dev/sda: 250.0 GB, 250059350016 bytes
Disk Identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1          26      204819+  ee  EFI GPT
/dev/sda2              26       15153   121503744   af  Unknown
/dev/sda3           15153       29514   115354609   83  Linux
/dev/sda4           29514       30402     7135394+   b  W95 FAT32

I won’t do anything else until you’ve had a chance to look at this. Thanks, Loni.

PS. There’s just one internal drive and one cd-dvd drive in the machine.

On Wed, 20 Aug 2008 01:26:03 GMT
fogelfish <fogelfish@no-mx.forums.opensuse.org> wrote:

>
> Here’s the output from “fdisk -l” which was done in the DVD rescue
> system:
> > WARNING: GPT (GUID Partition Table) detected on ‘/dev/sda’! The util
> > fdisk doesn’t support GPT. Use GNU Parted.
> >
> > Disk /dev/sda: 250.0 GB, 250059350016 bytes
> > Disk Identifier: 0x00000000
> >
> > Device Boot Start End Blocks Id System
> > /dev/sda1 1 26 204819+ ee EFI GPT
> > /dev/sda2 26 15153 121503744 af Unknown
> > /dev/sda3 15153 29514 115354609 83 Linux
> > /dev/sda4 29514 30402 7135394+ b W95 FAT32
> I won’t do anything else until you’ve had a chance to look at this.
> Thanks, Loni.
>
> PS. There’s just one internal drive and one cd-dvd drive in the
> machine.
>
>

{Grin} Well, no chance of getting this drive accidentally mixed up with
another. :)

(ok, this sounds stuuupid, but make sure the power and data cables are
securely plugged into the drive (both ends) )

And you’re still getting an error when booting into linux?

Hmmm…

Ok, let’s try this then… I need you to trust me. Do you trust me?

The next commands will test that… (use google and search for the man pages
and read them yourself to verify what I’m going to have you do…)

man e2fsck
http://linux.die.net/man/8/e2fsck
man mke2fs
http://linux.die.net/man/8/mkfs.ext3

Boot the rescue dvd

We need to determine the BLOCK size of your partition:

dumpe2fs /dev/sda3 | grep -i "block size"

I get 4096 for my system, yours is likely the same. Save this number.
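
To keep the number in a shell variable rather than copying it by hand, something along these lines should work. It is only a sketch: the helper name `block_size` is invented, and it assumes the "Block size:" line format that dumpe2fs prints in its superblock summary (‘-h’ limits the output to that summary).

```shell
#!/bin/sh
# block_size: read `dumpe2fs -h` output on stdin and print the value of the
# "Block size:" line with the padding stripped.
block_size() {
    awk -F: '/^Block size:/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Hypothetical usage:
# bs=$(dumpe2fs -h /dev/sda3 2>/dev/null | block_size)
# echo "block size: $bs"
```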

=============================

Here’s the scary part… Go find a leather belt to bite on… really.

We’re going to ‘pseudo’ format the drive, to recreate the superblocks.

NO DATA WILL BE LOST

=============================
== The commands… look up the options in the man pages, VERIFY what
== you’ll be doing. If you don’t understand, ASK, email me, or instant
== message me, my screen names are listed in my profile…

== (insert block size in -b option … 4096)

mke2fs -S -b 4096 -v /dev/sda3

e2fsck -y -f -v -C 0 /dev/sda3

tune2fs -j /dev/sd3

=============================

mke2fs rebuilds the superblocks, but zaps the journal, makes it ext2

(capital S, lower b, lower v)

e2fsck rebuilds the inode trees and magic structures of filesystem.

(lower y, lower f, lower v, capital C, zero)

tune2fs puts the journal back into place, making it ext3 again.

(lower j)

=============================

Please read ALL of this and understand what you’re doing before
proceeding. I have just performed this sequence of commands THREE times on a
300+ gig drive, just to make sure it works with no data loss.

If you wish, run the ‘e2fsck /dev/sda3’ again afterwards to verify the
filesystem.

When this completes, type ‘reboot’ in the rescue DVD, allow the system to
shutdown and boot normally, it should be fixed.

Hope this helps.

Loni

(my apologies for this taking so long… I have one of those “remembers
everything” type memories, although it isn’t indexed very well. I remembered
what I wanted to do, but not exactly how, nor exactly what command… had
to grep my head and the machine to find it again. Then add on the testing to
make sure I wasn’t going to instantly become your favorite enemy…) {Smile}


L R Nix
lornix@lornix.com

Loni, I believe my system was hanging by a thread so I was overjoyed to receive your very apt aid. I read the man pages for each of the commands and looked at the purpose of the flags and options and then proceeded in full trust.

I hope this was a typo of yours:

tune2fs -j /dev/sd3

I think making me do a little work is appropriate. I typed “tune2fs -j /dev/sda3” instead.

After all that the system rebooted into the same dumb filesystem message

The superblock could not be read or does not describe a correct ext2 filesystem.

So I returned to the DVD rescue prompt and I just now typed “e2fsck /dev/sda3”:

/dev/sda3: clean, 250831/7217152 files, 2568240/28838652 blocks

Is this a case of false negative? Should I be able to use my system for HOURS without a hitch yet have corrupted superblocks? Or should I just compute under Damocles’ sword?

I’ll be willing to read more man pages.

On Wed, 20 Aug 2008 06:56:03 GMT
fogelfish <fogelfish@no-mx.forums.opensuse.org> wrote:

>
> Loni, I believe my system was hanging by a thread so I was overjoyed to
> receive your very apt aid. I read the man pages for each of the
> commands and looked at the purpose of the flags and options and then
> proceeded in full trust.
>
> I hope this was a typo of yours:
> > tune2fs -j /dev/sd3
> I think making me do a little work is appropriate. I typed “tune2fs -j
> /dev/sda3” instead.
>
> After all that the system rebooted into the same dumb filesystem
> message
> > The superblock could not be read or does not describe a correct ext2
> > filesystem.
> So I returned to the DVD rescue prompt and I just now typed “e2fsck
> /dev/sda3”:
> > /dev/sda3: clean, 250831/7217152 files, 2568240/28838652 blocks
> Is this a case of false negative? Should I be able to use my system
> for HOURS without a hitch yet have corrupted superblocks? Or should I
> just compute under Damocles’ sword?
>
> I’ll be willing to read more man pages.
>
>

Yeah! That was it, an intentional typo to keep you in the loop…

Did you typo the error message? or does it really say ‘ext2’?

The system should be fixed and good to go… could you post your
‘/etc/fstab’ contents?

That’s the only thing I can figure is affecting this.

{Grin} I take it that the commands went well.

Loni


L R Nix
lornix@lornix.com

The commands went without a hitch.

I typed the error but it wasn’t a typo. Here’s more of it.

The superblock could not be read or does not describe a correct ext2 filesystem. If the device is valid and it really contains an ext2 filesystem (and not swap or ufs or something else), then the superblock is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>

bootsplash: status on console 0 changed to on
blogd: no message logging because /var file system is not accessible
ehci-hcd ohci-hcd uhci-hcd usb-ohci usb-uhci
fsck failed for at least one filesystem (not /)
Please repair manually and reboot.
The root file system is already mounted read-write

And here the results of ‘/etc/fstab’:

/dev/disk/by-id/scsi-SATA_Hitachi_HDT7250_VFJ101R73GK09K-part4 swap swap defaults 0 0
/dev/disk/by-id/scsi-SATA_Hitachi_HDT7250_VFJ101R73GK09K-part3 / ext3 acl,user_xattr 1 1
proc /proc proc defaults 0 0
sysfs /sys sysfs noauto 0 0
debugfs /sys/kernel/debug debugfs noauto 0 0
usbfs /proc/bus/usb usbfs noauto 0 0
devpts /dev/pts devpts mode=0620,gid=5 0 0
/dev/disk/by-id/scsi-200d04b341e047a1c-part3 /local ext3 acl,user_xattr 1 2

By the way, did I thank you for working with me on this? Thank you.

Oh my!

Found it.

From your FIRST posting…

> error on stat() /dev/disk/by-id/scsi-200d04b341e047a1c-part3: No such
> file or directory
> fsck.ext3: No such file or directory while trying to open
> /dev/disk/by-id/scsi-200d04b341e047a1c-part3
> bootsplash: status on console 0 changed to on
> The superblock could not be read or does not describe a correct ext2
> filesystem.

What device is that? Looks like a hard drive that WAS hooked up…

And then we look at your fstab:

>And here the results of ‘/etc/fstab’:
> /dev/disk/by-id/scsi-SATA_Hitachi_HDT7250_VFJ101R73GK09K-part4 swap
> swap defaults 0 0
> /dev/disk/by-id/scsi-SATA_Hitachi_HDT7250_VFJ101R73GK09K-part3 /
> ext3 acl,user_xattr 1 1
> proc /proc proc defaults 0 0
> sysfs /sys sysfs noauto 0 0
> debugfs /sys/kernel/debug debugfs noauto 0 0
> usbfs /proc/bus/usb usbfs noauto 0 0
> devpts /dev/pts devpts mode=0620,gid=5 0 0
> /dev/disk/by-id/scsi-200d04b341e047a1c-part3 /local ext3
> acl,user_xattr 1 2

And WAAAAAY down at the bottom of the file… you see a line trying to mount a
drive / partition that doesn’t EXIST!

Delete that line. The one that says “scsi-200d04b…”

You won’t have a problem next boot.
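
For anyone who lands here with the same boot message, a quick way to spot this kind of stale entry is to test whether every /dev/... path named in fstab actually exists. A rough sketch; the function name `missing_fstab_devices` is made up, and it only looks at entries whose first field starts with /dev/ (so LABEL= or UUID= style entries, if any, are not checked):

```shell
#!/bin/sh
# missing_fstab_devices [FSTAB]
# Print every /dev/... entry in FSTAB (default /etc/fstab) whose device node
# does not currently exist. Comment lines and pseudo-filesystems such as proc
# are skipped because their first field does not start with /dev/.
missing_fstab_devices() {
    while read -r dev mnt rest; do
        case "$dev" in
            /dev/*) [ -e "$dev" ] || printf '%s (mount point %s)\n' "$dev" "$mnt" ;;
        esac
    done < "${1:-/etc/fstab}"
}

# Hypothetical usage:
# missing_fstab_devices
```

With the fstab posted above and the external drive unplugged, the only line I would expect this to report is the scsi-200d04b341e047a1c-part3 entry for /local.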

Loni


L R Nix
lornix@lornix.com

Another thing I noticed:

Based on your fstab you posted:

> > /dev/disk/by-id/scsi-SATA_Hitachi_HDT7250_VFJ101R73GK09K-part4 swap
> > swap defaults 0 0
> > /dev/disk/by-id/scsi-SATA_Hitachi_HDT7250_VFJ101R73GK09K-part3 /
> > ext3 acl,user_xattr 1 1

You’re using sda3 as your root/boot partition, that’s good
and using sda4 as your swap partition…

except:

from the ‘fdisk -l’ output from previous:

> Device Boot Start End Blocks Id System
> /dev/sda1 1 26 204819+ ee EFI GPT
> /dev/sda2 26 15153 121503744 af Unknown
> /dev/sda3 15153 29514 115354609 83 Linux
> /dev/sda4 29514 30402 7135394+ b W95 FAT32

You’ve got sda4 marked as W95 FAT32 type…

If that partition IS being used as swap, you should change the partition type
to type 82. the ‘t’ command from ‘fdisk /dev/sda’.

If it’s NOT supposed to be swap… something needs to change, as the data
WILL be written over if your system runs low on physical ram.

Loni


L R Nix
lornix@lornix.com

So cool! I knock myself! It was staring me in the face and I went “Nah!”. The scsi serial numbers. I could have recognized that the offending one was not the internal drive and I would have recognized the external Firewire drive I disconnected without unmounting.

(I’m assuming if I unmounted it fstab would not have an entry for it. Am I correct in that assumption?)

I don’t know why fdisk gives ‘sda4’ the “W95 Fat32” type. In YaST2 > System -> Expert Partitioner it is marked as “Linux swap”. And when I used “parted /dev/sda prin” it shows up as linux-swap. Is this another case of mistaken identity?

On Wed, 20 Aug 2008 18:46:03 GMT
fogelfish <fogelfish@no-mx.forums.opensuse.org> wrote:

>
> So cool! I knock myself! It was staring me in the face and I went
> “Nah!”. The scsi serial numbers. I could have recognized that the
> offending one was not the internal drive and I would have recognized
> the external Firewire drive I disconnected without unmounting.
>
> (I’m assuming if I unmounted it fstab would not have an entry for it.
> Am I correct in that assumption?)
>
> I don’t know why fdisk gives ‘sda4’ the “W95 Fat32” type. In YaST2 >
> System -> Expert Partitioner it is marked as “Linux swap”. And when I
> used “parted /dev/sda prin” it shows up as linux-swap. Is this another
> case of mistaken identity?
>
>

No idea, but I imagine the fact that you’ve got the funky weird GPT partition
that confuses fdisk could be a factor.

> WARNING: GPT (GUID Partition Table) detected on ‘/dev/sda’! The util
> fdisk doesn’t support GPT. Use GNU Parted.

I’d remove the fstab line for your external firewire drive.

The firewire drive will be automounted when you plug it in, and removed when
you remove it.

Anything in /etc/fstab MUST be available at EVERY boot. So if you’re not
going to keep it plugged in, remove the fstab line and let the automounter do
that for you.
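
If you would rather not delete the line outright, commenting it out has the same effect at boot and keeps the entry around for reference. A minimal sketch, assuming a plain-text fstab; the function name `comment_out_entry` is hypothetical, and it writes the edited copy to standard output instead of changing the file in place:

```shell
#!/bin/sh
# comment_out_entry PATTERN [FSTAB]
# Write FSTAB (default /etc/fstab) to stdout with every not-yet-commented
# line that contains the fixed string PATTERN prefixed by "# ".
comment_out_entry() {
    awk -v pat="$1" '
        index($0, pat) && $0 !~ /^[ \t]*#/ { print "# " $0; next }
        { print }
    ' "${2:-/etc/fstab}"
}

# Hypothetical usage: keep a backup, review the result, then install it by hand.
# cp /etc/fstab /etc/fstab.bak
# comment_out_entry scsi-200d04b341e047a1c-part3 > /tmp/fstab.new
```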

I’ve got four firewire and one usb hard drives which are always connected, so
I have fstab entries for them… my usb flash drives are just automounted as
needed since they aren’t always plugged in.

Loni


L R Nix
lornix@lornix.com

Yay! It’s working!

I think that resolves it. I’ll research GPT and fdisk and its probable mis-identification of swap partitions.

And thank you once again for your expert help!

On Wed, 20 Aug 2008 22:46:03 GMT
fogelfish <fogelfish@no-mx.forums.opensuse.org> wrote:

>
> I think that resolves it. I’ll research GPT and fdisk and its probable
> mis-identification of swap partitions.
>
> And thank you once again for your expert help!
>
>

You’re welcome.


L R Nix
lornix@lornix.com