Can't mount partitions

Hi people,

My work machine drops me into the emergency console and won’t mount any partition. It started after I had to press the reset button because of a hang.

The journal says ext4 is an unknown filesystem type. It also mentions that it was unable to load a kernel module, although I don’t know which one.
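One way to see whether the running kernel currently knows about ext4 is to look at /proc/filesystems, which lists every filesystem type the kernel has registered, whether built in or loaded as a module. The following is only a minimal sketch, assuming a POSIX shell with awk; the helper name fs_registered is made up for illustration.

```shell
# fs_registered NAME [FILE]
# Succeeds if filesystem NAME appears in a /proc/filesystems-style listing.
# FILE defaults to /proc/filesystems. The helper name is hypothetical.
fs_registered() {
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' "${2:-/proc/filesystems}"
}

# Example use on the affected machine:
#   fs_registered ext4 || echo "ext4 is not registered with the running kernel"
```

If ext4 is missing from the list, running modprobe ext4 by hand should either load it or print an error that is more specific than the journal message.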

fdisk -l says a secondary disk has a GPT PMBR size problem but the backup appears to be ok.

I’m typing this from a borrowed notebook, so I’ll post the fdisk and journal output as soon as I get back to my machine.

Any help will be appreciated.

Regards

Maybe a drive failure.

From a rescue disk, run smartctl.

man smartctl will show you the manual page.

Also try mounting the partitions from the rescue disk and report any errors.

Hi again,

Thanks for your answer. Booting from the rescue disk got me nowhere; I can’t even see the devices. Anyway, I found an old GParted live CD and used it to boot and get the reports.
Smartctl

Fdisk

Journal

lsmod


Module                  Size  Used by
btrfs                1114112  1
xor                    20480  1 btrfs
raid6_pq              106496  1 btrfs
sr_mod                 24576  0 
cdrom                  61440  1 sr_mod
sd_mod                 53248  3 
hid_generic            16384  0 
usbhid                 53248  0 
crc32c_intel           24576  1 
i915                 1318912  1 
i2c_algo_bit           16384  1 i915
ahci                   36864  2 
ehci_pci               16384  0 
libahci                36864  1 ahci
xhci_pci               16384  0 
drm_kms_helper        155648  1 i915
syscopyarea            16384  1 drm_kms_helper
ehci_hcd               81920  1 ehci_pci
sysfillrect            16384  1 drm_kms_helper
xhci_hcd              192512  1 xhci_pci
sysimgblt              16384  1 drm_kms_helper
fb_sys_fops            16384  1 drm_kms_helper
libata                270336  2 ahci,libahci
usbcore               270336  5 ehci_hcd,ehci_pci,usbhid,xhci_hcd,xhci_pci
drm                   393216  3 i915,drm_kms_helper
usb_common             16384  1 usbcore
video                  40960  1 i915
button                 16384  1 i915
sg                     40960  0 
scsi_mod              262144  4 sg,libata,sd_mod,sr_mod
autofs4                45056  2 

It seems there isn’t an ext4 module loaded, so that could be the problem, but I don’t know. The partitions seem to be OK from GParted live, although I haven’t checked the RAID yet.

If there is anything else I could provide, please feel free to ask.

Regards.

You did not mention it was an SSD. SSDs have different failure modes than HDDs.

Also a RAID. It helps to share this kind of info; no one here is clairvoyant :open_mouth:

lsmod only shows info from the running OS which I assume is the rescue disk???

Can you mount the partition from the rescue??

I’m sorry, you’re right, I should have mentioned it. I assumed it was a software problem, not a physical one.
I did not mention the RAID because my main concern was the other partitions. My bad again, I should have been more specific.

lsmod only shows info from the running OS which I assume is the rescue disk???

I’ve chrooted into /dev/sda6 (which was the old /). Shouldn’t lsmod show the modules from the machine in that case? lsmod from the live CD shows many more modules loaded.
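For context: lsmod reports the modules of the kernel that is actually running, so inside a chroot it still describes the kernel that booted, not the installed system. What can be checked from a chroot or a live CD is whether the installed system has a module tree for a given kernel release and whether that tree contains an ext4 module. The sketch below assumes the usual /lib/modules/RELEASE layout; the helper name has_ext4_module is hypothetical, and on a kernel with ext4 built in there would be no such file at all.

```shell
# has_ext4_module ROOT KVER
# Succeeds if ROOT/lib/modules/KVER exists and contains an ext4 module file
# (ext4.ko, possibly compressed, e.g. ext4.ko.xz). Hypothetical helper.
has_ext4_module() {
    dir="$1/lib/modules/$2"
    [ -d "$dir" ] || return 1
    [ -n "$(find "$dir" -name 'ext4.ko*' -print 2>/dev/null | head -n 1)" ]
}

# Example, run on the emergency console of the affected system:
#   has_ext4_module / "$(uname -r)" || echo "no ext4 module for the running kernel"
```

A running kernel whose release has no matching directory under /lib/modules cannot load any module, which would be consistent with both journal messages, but that is only one possible explanation.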

Can you mount the partition from the rescue??

Yes, I can mount the sda6 (btrfs), sda7 and sdb1 (both ext4) partitions with the live CD. I didn’t try the RAID (sdb2, sdc) because I can deal with it later, even with some loss.

The partition contents seem to be OK. I used GParted to verify those 3 and there seems to be no problem, but I don’t know how to make the kernel recognize them again so that the boot process completes.

Sorry I didn’t mention those details before. I really appreciate your help.

Regards

OK, chroot. Another good detail.

The partitions mount, that is good. It bothers me that the main GPT partition table was corrupt. Have you run fsck against them?

Check /etc/fstab and be sure that looks ok and not corrupted. If one partition/array does not mount the boot fails.

I’ve verified them with the GParted live CD, although I don’t have the logs. Running the checks manually from the live CD console:


root@debian:~# btrfs check --repair /dev/sda6
enabling repair mode
Checking filesystem on /dev/sda6
UUID: a0eb3d90-ff91-4a22-b848-5e9e25389163
cache and super generation don't match, space cache will be invalidated
found 35044495360 bytes used err is 0
total csum bytes: 29984660
total tree bytes: 1756069888
total fs tree bytes: 1648558080
total extent tree bytes: 66174976
btree space waste bytes: 291871281
file data blocks allocated: 169988112384
 referenced 98030682112

root@debian:~# e2fsck -pfv /dev/sda7
      201701 inodes used (2.72%, out of 7421952)
        1517 non-contiguous files (0.8%)
         101 non-contiguous directories (0.1%)
             # of inodes with ind/dind/tind blocks: 0/0/0
             Extent depth histogram: 201274/258
    16670595 blocks used (56.20%, out of 29665536)
           0 bad blocks
           7 large files


      170373 regular files
       31135 directories
           0 character device files
           0 block device files
           0 fifos
           0 links
         180 symbolic links (157 fast symbolic links)
           4 sockets
------------
      201692 files

root@debian:~# e2fsck -pfv /dev/sdb1
       21522 inodes used (0.07%, out of 30531584)
        2486 non-contiguous files (11.6%)
          35 non-contiguous directories (0.2%)
             # of inodes with ind/dind/tind blocks: 0/0/0
             Extent depth histogram: 21411/103
    53284393 blocks used (43.64%, out of 122094587)
           0 bad blocks
          13 large files


       20261 regular files
        1252 directories
           0 character device files
           0 block device files
           0 fifos
           0 links
           0 symbolic links (0 fast symbolic links)
           0 sockets
------------
       21513 files

The other partitions belong to the RAID setup. I could try to reassemble the array, but most probably it will fail.

Check /etc/fstab and be sure that looks ok and not corrupted. If one partition/array does not mount the boot fails.

Fstab content


UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 / btrfs defaults 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /boot/grub2/i386-pc btrfs subvol=@/boot/grub2/i386-pc 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /boot/grub2/x86_64-efi btrfs subvol=@/boot/grub2/x86_64-efi 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /opt btrfs subvol=@/opt 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /srv btrfs subvol=@/srv 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /tmp btrfs subvol=@/tmp 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /usr/local btrfs subvol=@/usr/local 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /var/crash btrfs subvol=@/var/crash 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /var/lib/libvirt/images btrfs subvol=@/var/lib/libvirt/images 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /var/lib/mailman btrfs subvol=@/var/lib/mailman 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /var/lib/mariadb btrfs subvol=@/var/lib/mariadb 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /var/lib/mysql btrfs subvol=@/var/lib/mysql 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /var/lib/named btrfs subvol=@/var/lib/named 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /var/lib/pgsql btrfs subvol=@/var/lib/pgsql 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /var/log btrfs subvol=@/var/log 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /var/opt btrfs subvol=@/var/opt 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /var/spool btrfs subvol=@/var/spool 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /var/tmp btrfs subvol=@/var/tmp 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /var/cache btrfs subvol=@/var/cache 0 0
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /.snapshots btrfs subvol=@/.snapshots 0 0
UUID=cf8f2df2-b313-45ae-b77c-b357f7c539db swap swap defaults 0 0
UUID=87f80c8c-14a5-4d5a-af5e-ead8d8a4189b /home ext4 defaults 0 0
UUID=95eb4e22-8414-4635-bf2b-99ec2b3164b2 /media/secundario ext4 defaults 1 2
UUID=0bd493db-1124-43af-90f7-13776a6a1225 /media/primario ext4 defaults 1 2
UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 /var/lib/machines btrfs subvol=@/var/lib/machines 0 0

I can’t see anything wrong here (AFAIK). I also found an /etc/.fstab.swp file, which seems to be encoded somehow and was created on February 16.

The UUID-to-device mapping is:


UUID=a0eb3d90-ff91-4a22-b848-5e9e25389163 --> /dev/sda6
UUID=87f80c8c-14a5-4d5a-af5e-ead8d8a4189b --> /dev/sda7
UUID=95eb4e22-8414-4635-bf2b-99ec2b3164b2 --> /dev/sdb1
UUID=0bd493db-1124-43af-90f7-13776a6a1225 --> /dev/md127
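A mapping like this can be rebuilt mechanically, which also shows whether any UUID in fstab fails to resolve to a device. This is a rough sketch, assuming the findfs tool from util-linux is available; the function names fstab_uuids and check_fstab_uuids are invented for this example.

```shell
# fstab_uuids [FILE]
# Prints each distinct UUID referenced in an fstab-style file, one per line.
# FILE defaults to /etc/fstab. Commented-out entries are skipped because
# their first field starts with '#', not 'UUID='.
fstab_uuids() {
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); if (!seen[$1]++) print $1 }' "${1:-/etc/fstab}"
}

# check_fstab_uuids [FILE]
# For every UUID in FILE, prints the device findfs resolves it to,
# or "NOT FOUND" when no block device carries that UUID.
check_fstab_uuids() {
    fstab_uuids "$1" | while read -r uuid; do
        dev=$(findfs "UUID=$uuid" 2>/dev/null) || dev="NOT FOUND"
        printf '%s --> %s\n' "$uuid" "$dev"
    done
}
```

Running check_fstab_uuids as root on the installed system, or against the chrooted /etc/fstab from a live CD, would flag an entry such as the RAID array if it has not been assembled yet.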

Maybe I should remove the /dev/md127 entry and reboot?

Regards

Worth a try. Any partition that fails to mount at boot time stops the boot unless you use the nofail option.
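As an illustration of the nofail suggestion, the option goes into the fourth field of the fstab entry. The sketch below is one possible way to add it for a single mount point without editing by hand; add_nofail is a hypothetical helper, it reads fstab content on stdin and writes the result to stdout, and it never touches /etc/fstab itself.

```shell
# add_nofail MOUNTPOINT
# Reads fstab content on stdin and prints it with ",nofail" appended to the
# options of the entry mounted at MOUNTPOINT, unless it is already present.
# Note: awk rebuilds the modified line with single spaces between fields.
add_nofail() {
    awk -v mp="$1" '
        $1 !~ /^#/ && NF >= 4 && $2 == mp && $4 !~ /(^|,)nofail(,|$)/ { $4 = $4 ",nofail" }
        { print }
    '
}

# Example: preview the change for the RAID mount point
#   add_nofail /media/primario < /etc/fstab
```

Previewing the output and then editing the real file by hand keeps the change reviewable; with nofail set, a missing array should no longer block the boot.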

I removed the RAID partition from fstab, rebooted, and it failed again.
After that I removed all ext4 partitions from fstab and it booted. I still got failing services in the logs, but at least the root partition is working.

Moreover, I manually mounted sda7 and sdb1 (both ext4) and it worked. I got these lines in the logs:


BadYuyu kernel: EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)
BadYuyu kernel: EXT4-fs (sda7): mounted filesystem with ordered data mode. Opts: (null)

So the question is: why isn’t the ext4 filesystem recognized at boot? It never worked on the emergency console either, so I was expecting the same failure here.
If fstab is the problem, can it be generated from the command line (the way mdadm -D --scan >> mdadm.conf generates the RAID config)?
Otherwise, what should I look for?
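On generating fstab entries from the command line: blkid can report the UUID and filesystem type of a device, and an fstab line can be assembled from those two values. What follows is a sketch under assumptions, not a standard tool: both function names are hypothetical, and the defaults of "defaults 0 2" are a choice made here, not something taken from the existing fstab.

```shell
# format_fstab_entry UUID MOUNTPOINT FSTYPE [OPTIONS] [DUMP] [PASS]
# Prints one fstab line. OPTIONS, DUMP and PASS default to "defaults 0 2"
# (an assumption of this sketch).
format_fstab_entry() {
    printf 'UUID=%s %s %s %s %s %s\n' "$1" "$2" "$3" "${4:-defaults}" "${5:-0}" "${6:-2}"
}

# fstab_entry_for DEVICE MOUNTPOINT
# Asks blkid for the UUID and type of DEVICE and prints a matching line.
# Usually needs root so that blkid can read the device.
fstab_entry_for() {
    uuid=$(blkid -s UUID -o value "$1") || return 1
    fstype=$(blkid -s TYPE -o value "$1") || return 1
    format_fstab_entry "$uuid" "$2" "$fstype"
}

# Example: fstab_entry_for /dev/sda7 /home
```

The output is meant to be reviewed and then pasted into /etc/fstab, rather than appended blindly.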

Regards

If you manually mount the possibly bad partitions, what errors do you get?

Run fsck against any possibly bad partitions.

For the moment, forget the RAID until you get the base system working.

I’m out of ideas. The bad GPT table points to possible random writes happening.

Please show dmesg and journalctl output before you manually mounted partitions. How exactly did you mount them?

There were no errors when I manually mounted the partitions. On the previous boot, automounting failed with the message “can’t recognize ext4 filesystem”, so I was expecting something similar on the manual mount; seeing it work was a surprise. fsck says everything is fine on every single partition: no errors detected, all clean.

The good news is that this morning I manually edited fstab to add /home and the other partitions, rebooted, and it worked like a charm. I have the full system working again (even the **** RAID array, although as md0 instead of md127). I just don’t understand WTH happened. I’ll get an external HDD, make a data backup and then wipe the GPT; I can’t trust this thing. Maybe later I’ll do something similar on the SSD too.

I really want to thank you for your help. You pointed me in the right direction even when my posts were not accurate. Chapeau!!

I don’t have the dmesg log from that boot. I’ve uploaded the journal to expire box, with the full boot sequence since yesterday:

Journal

  1. The first boot has the complete fstab; you’ll find the 'can’t mount /home' message there, and no other ext4 partition was mounted either.
  2. The second boot has an fstab with only the btrfs filesystem: the first successful boot in 2 days.
  3. The third boot is similar to 2; I then manually mounted the ext4 partitions with

mount /dev/sda7 /home
mount /dev/sdb1 /media/secundario

  4. The fourth boot is from today, when I manually edited fstab to add the ext4 partitions.
  5. Full working system.
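To compare boots like the ones listed above, the journal can be read one boot at a time and narrowed to lines about mounting and module loading. The filter below is only a sketch: the helper name mount_problems and its pattern list are guesses at what is relevant here, not an exhaustive set of messages.

```shell
# mount_problems
# Filters journal text on stdin down to lines that mention filesystem
# mounting or module loading. The pattern list is an assumption.
mount_problems() {
    grep -Ei 'unknown filesystem|failed to mount|dependency failed|modprobe|EXT4-fs'
}

# Example use with systemd's journal:
#   journalctl --list-boots                        # which boots are recorded
#   journalctl -b -1 --no-pager | mount_problems   # the previous boot
```

Comparing the filtered output of a failing boot with that of a working one is a quick way to see which message appears only in the failing case.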

I just want to know what happened. I’ll have to read about filesystems and the boot process in Linux; I can’t stand having only this vague view of it.

Regards