Unable to mount partition via UUID or Label after upgrade to 42.2

Yesterday I upgraded from 42.1 to 42.2. Right after the upgrade finished I rebooted, but the machine hung for 1-2 minutes and then dropped into the emergency target. I checked the journal and found that during boot the UUID of my storage disk could not be found under /dev/disk/by-uuid. I commented out that line in /etc/fstab and the boot went on normally.

Currently it looks like I can’t mount this disk via UUID or LABEL. When I mount it via the device path /dev/sda1, the mount works, also after a reboot.


dporobic@linux:~> sudo xfs_admin -u /dev/sda1 
UUID = 95de00fa-9988-4088-a7e6-7939d8a4930e


dporobic@linux:~> ls /dev/disk/by-uuid/
lrwxrwxrwx 1 root root 10 Feb  5 12:35 40ccc13a-c107-481a-bc13-280a7c397a32 -> ../../sdb2
lrwxrwxrwx 1 root root 10 Feb  5 12:35 47fcfa4c-eee0-481a-abeb-77906c24d9f9 -> ../../sdb3
lrwxrwxrwx 1 root root 10 Feb  5 12:35 78ffa8b3-afb1-41cf-96e2-fadad84a528f -> ../../sdb4
lrwxrwxrwx 1 root root 10 Feb  5 12:35 9A86E07F86E05D6F -> ../../sdc2
lrwxrwxrwx 1 root root 10 Feb  5 12:35 C829-505C -> ../../sdb1

When I manually create a symlink in /dev/disk/by-uuid and add the UUID to /etc/fstab and then mount via mount -a it works, but after rebooting my entry is gone and I’m again in emergency mode.

Any idea how to troubleshoot this further?

Anyone?
Any idea what’s creating those /dev/disk/by-uuid entries?

A UUID is a random string assigned to each partition’s filesystem when it is created, so if you recreate the partitions they will get different UUIDs.

You can see the current values in the /dev/disk/by-uuid directory. These are just links to the /dev/sdX# files, which point to the device/partition. The sdX# name of a partition is assigned by the kernel in the order the drives are detected, so it can change between boots, for example if you plug in a drive. A UUID normally does not change unless you recreate the partition.

I do know how UUIDs work, but the strange thing here is that after the upgrade one of the partitions (the single partition on that particular disk) had no entry in /dev/disk/by-uuid. The UUID was not lost, the device still has a UUID. I even generated a new UUID (with xfs_admin), but the /dev/disk/by-uuid entry was still missing. I tried to create the /dev/disk/by-uuid entry manually and it worked, but after a reboot the entry was gone. So the problem is not the UUID but the missing /dev/disk/by-uuid entry, which is where the OS looks for the mapping; if it doesn’t find it, it can’t mount the partition.

The /dev/disk/by-uuid/ symlinks are created on boot (or whenever a drive is detected) by udev.
The corresponding rule is /usr/lib/udev/rules.d/60-persistent-storage.rules, on my 13.2 system at least.

No idea why it wouldn’t work for one particular drive any more suddenly.
But apparently the rule doesn’t pick it up for some reason.
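
To see what udev actually recorded for the device, something like the following might help. It is only a sketch: it assumes udevadm from systemd is installed, and the function names show_fs_props and retrigger are made up for this example.

```shell
# Print the filesystem-related properties udev has stored for a device.
# An empty result means the blkid import found no usable filesystem
# signature, so the by-uuid and by-label rules have nothing to work with.
show_fs_props() {
    udevadm info --query=property --name="$1" | grep '^ID_FS_'
}

# Replay the "add" event for one device so the rules run again
# without a reboot (needs root). Takes the kernel name, e.g. sda1.
retrigger() {
    udevadm trigger --action=add --sysname-match="$1"
    udevadm settle
}
```

For example, show_fs_props /dev/sda1 on the broken partition can be compared with show_fs_props /dev/sdb4 on a working one. If ID_FS_UUID_ENC is missing for sda1, the problem is in the probing step rather than in the symlink rule itself.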

I tried reading the UUID via blkid, but it returns nothing for this partition; another XFS partition on a different disk shows up fine:

dporobic@linux:~> sudo blkid /dev/sda1
dporobic@linux:~> sudo blkid /dev/sdb4
/dev/sdb4: UUID="78ffa8b3-afb1-41cf-96e2-fadad84a528f" TYPE="xfs" PARTLABEL="primary" PARTUUID="79de2a16-133a-4ee9-88b2-1ed3f550d942"

The drive that is hosting this partition is marked as zfs_member, I don’t know where this comes from, and the PTUUID looks strange:

dporobic@linux:~> sudo blkid /dev/sdb
/dev/sdb: PTUUID="1a126f2f-ad34-47c6-95a9-c7967b2ffb10" PTTYPE="gpt"
dporobic@linux:~> sudo blkid /dev/sda
/dev/sda: TYPE="zfs_member" PTUUID="0009c2b4" PTTYPE="dos"
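
One possible explanation, which is only an assumption at this point: an old ZFS label left on the disk from an earlier use could make the probing of /dev/sda1 ambiguous, so blkid reports nothing and udev has no UUID to link. The sketch below just compares what blkid detects on the whole disk and on the partition. probe_types is a made-up helper name, and it should be run as root so blkid can read the devices.

```shell
# Print the TYPE that blkid detects for each device given as an argument.
# "-o value -s TYPE" makes blkid print only the value of the TYPE tag.
probe_types() {
    for dev in "$@"; do
        # blkid exits non-zero when it finds nothing; keep going anyway.
        fstype=$(blkid -o value -s TYPE "$dev") || true
        printf '%s: %s\n' "$dev" "${fstype:-<nothing detected>}"
    done
}
```

Usage would be probe_types /dev/sda /dev/sda1. Running wipefs /dev/sda with no options only lists the signatures it finds and their offsets, without erasing anything, which should show where the zfs_member marker comes from.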

Here is what the udev rule looks like:

dporobic@linux:~> cat /usr/lib/udev/rules.d/60-persistent-storage.rules 
# do not edit this file, it will be overwritten on update

# persistent storage links: /dev/disk/{by-id,by-uuid,by-label,by-path}
# scheme based on "Linux persistent device names", 2004, Hannes Reinecke <hare@suse.de>

ACTION=="remove", GOTO="persistent_storage_end"

SUBSYSTEM!="block", GOTO="persistent_storage_end"
KERNEL!="loop*|mmcblk*[0-9]|msblk*[0-9]|mspblk*[0-9]|nvme*|hd*|sd*|sr*|vd*|xvd*|bcache*|cciss*|dasd*|ubd*|scm*|pmem*", GOTO="persistent_storage_end"

# ignore partitions that span the entire disk
TEST=="whole_disk", GOTO="persistent_storage_end"

# for partitions import parent information
ENV{DEVTYPE}=="partition", IMPORT{parent}="ID_*"

# virtio-blk
KERNEL=="vd*[!0-9]", ATTRS{serial}=="?*", ENV{ID_SERIAL}="$attr{serial}", SYMLINK+="disk/by-id/virtio-$env{ID_SERIAL}"
KERNEL=="vd*[0-9]", ATTRS{serial}=="?*", ENV{ID_SERIAL}="$attr{serial}", SYMLINK+="disk/by-id/virtio-$env{ID_SERIAL}-part%n"

# ATA
KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", SUBSYSTEMS=="scsi", ATTRS{vendor}=="ATA", IMPORT{program}="ata_id --export $devnode"

# ATAPI devices (SPC-3 or later)
KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", SUBSYSTEMS=="scsi", ATTRS{type}=="5", ATTRS{scsi_level}=="[6-9]*", IMPORT{program}="ata_id --export $devnode"

# Run ata_id on non-removable USB Mass Storage (SATA/PATA disks in enclosures)
KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", ATTR{removable}=="0", SUBSYSTEMS=="usb", IMPORT{program}="ata_id --export $devnode"

# Fall back usb_id for USB devices
KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", SUBSYSTEMS=="usb", IMPORT{builtin}="usb_id"

# SCSI devices
KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", IMPORT{program}="scsi_id --export --whitelisted -d $devnode", ENV{ID_BUS}="scsi"
KERNEL=="cciss*", ENV{DEVTYPE}=="disk", ENV{ID_SERIAL}!="?*", IMPORT{program}="scsi_id --export --whitelisted -d $devnode", ENV{ID_BUS}="cciss"
KERNEL=="nvme*", ENV{DEVTYPE}=="disk", ENV{ID_SERIAL}!="?*", IMPORT{program}="scsi_id --export --whitelisted -d $tempnode", ENV{ID_BUS}="nvme"
KERNEL=="sd*|sr*|cciss*|nvme*", ENV{DEVTYPE}=="disk", ENV{ID_SERIAL}=="?*", SYMLINK+="disk/by-id/$env{ID_BUS}-$env{ID_SERIAL}"
KERNEL=="sd*|cciss*|nvme*", ENV{DEVTYPE}=="partition", ENV{ID_SERIAL}=="?*", SYMLINK+="disk/by-id/$env{ID_BUS}-$env{ID_SERIAL}-part%n"

# scsi compat links for ATA devices
KERNEL=="sd*[!0-9]", ENV{ID_BUS}=="ata", PROGRAM="scsi_id --whitelisted --replace-whitespace -p0x80 -d $devnode", RESULT=="?*", ENV{ID_SCSI_COMPAT}="$result", SYMLINK+="disk/by-id/scsi-$env{ID_SCSI_COMPAT}"
KERNEL=="sd*[0-9]", ENV{ID_SCSI_COMPAT}=="?*", SYMLINK+="disk/by-id/scsi-$env{ID_SCSI_COMPAT}-part%n"

# scsi compat links for ATA devices (for compatibility with udev < 184)
KERNEL=="sd*[!0-9]", ENV{ID_BUS}=="ata", PROGRAM="scsi_id --truncated-serial --whitelisted --replace-whitespace -p0x80 -d$tempnode", RESULT=="?*", ENV{ID_SCSI_COMPAT_TRUNCATED}="$result", SYMLINK+="disk/by-id/scsi-$env{ID_SCSI_COMPAT_TRUNCATED}"
KERNEL=="sd*[0-9]", ENV{ID_SCSI_COMPAT_TRUNCATED}=="?*", SYMLINK+="disk/by-id/scsi-$env{ID_SCSI_COMPAT_TRUNCATED}-part%n"

# FireWire
KERNEL=="sd*[!0-9]|sr*", ATTRS{ieee1394_id}=="?*", SYMLINK+="disk/by-id/ieee1394-$attr{ieee1394_id}"
KERNEL=="sd*[0-9]", ATTRS{ieee1394_id}=="?*", SYMLINK+="disk/by-id/ieee1394-$attr{ieee1394_id}-part%n"

# MMC
KERNEL=="mmcblk[0-9]", SUBSYSTEMS=="mmc", ATTRS{name}=="?*", ATTRS{serial}=="?*", \
  ENV{ID_NAME}="$attr{name}", ENV{ID_SERIAL}="$attr{serial}", SYMLINK+="disk/by-id/mmc-$env{ID_NAME}_$env{ID_SERIAL}"
KERNEL=="mmcblk[0-9]p[0-9]", ENV{ID_NAME}=="?*", ENV{ID_SERIAL}=="?*", SYMLINK+="disk/by-id/mmc-$env{ID_NAME}_$env{ID_SERIAL}-part%n"

# Memstick
KERNEL=="msblk[0-9]|mspblk[0-9]", SUBSYSTEMS=="memstick", ATTRS{name}=="?*", ATTRS{serial}=="?*", \
  ENV{ID_NAME}="$attr{name}", ENV{ID_SERIAL}="$attr{serial}", SYMLINK+="disk/by-id/memstick-$env{ID_NAME}_$env{ID_SERIAL}"
KERNEL=="msblk[0-9]p[0-9]|mspblk[0-9]p[0-9]", ENV{ID_NAME}=="?*", ENV{ID_SERIAL}=="?*", SYMLINK+="disk/by-id/memstick-$env{ID_NAME}_$env{ID_SERIAL}-part%n"

# by-path
ENV{DEVTYPE}=="disk", DEVPATH!="*/virtual/*", IMPORT{builtin}="path_id"
ENV{DEVTYPE}=="disk", ENV{ID_PATH}=="?*", SYMLINK+="disk/by-path/$env{ID_PATH}"
ENV{DEVTYPE}=="partition", ENV{ID_PATH}=="?*", SYMLINK+="disk/by-path/$env{ID_PATH}-part%n"

# by-path (parent device path, compat version, only for ATA/NVMe/SAS bus)
ENV{DEVTYPE}=="disk", ENV{ID_BUS}=="ata|nvme|scsi", DEVPATH!="*/virtual/*", IMPORT{program}="path_id_compat %p"
ENV{DEVTYPE}=="disk", ENV{ID_PATH_COMPAT}=="?*", SYMLINK+="disk/by-path/$env{ID_PATH_COMPAT}"
ENV{DEVTYPE}=="partition", ENV{ID_PATH_COMPAT}=="?*", SYMLINK+="disk/by-path/$env{ID_PATH_COMPAT}-part%n"

# probe filesystem metadata of optical drives which have a media inserted
KERNEL=="sr*", ENV{DISK_EJECT_REQUEST}!="?*", ENV{ID_CDROM_MEDIA_TRACK_COUNT_DATA}=="?*", ENV{ID_CDROM_MEDIA_SESSION_LAST_OFFSET}=="?*", \
  IMPORT{builtin}="blkid --offset=$env{ID_CDROM_MEDIA_SESSION_LAST_OFFSET}"
# single-session CDs do not have ID_CDROM_MEDIA_SESSION_LAST_OFFSET
KERNEL=="sr*", ENV{DISK_EJECT_REQUEST}!="?*", ENV{ID_CDROM_MEDIA_TRACK_COUNT_DATA}=="?*", ENV{ID_CDROM_MEDIA_SESSION_LAST_OFFSET}=="", \
  IMPORT{builtin}="blkid --noraid"

# probe filesystem metadata of disks
KERNEL!="sr*", IMPORT{builtin}="blkid"

# by-label/by-uuid links (filesystem metadata)
ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_UUID_ENC}=="?*", SYMLINK+="disk/by-uuid/$env{ID_FS_UUID_ENC}"
ENV{ID_FS_USAGE}=="filesystem|other", ENV{ID_FS_LABEL_ENC}=="?*", SYMLINK+="disk/by-label/$env{ID_FS_LABEL_ENC}"

# by-id (World Wide Name)
ENV{DEVTYPE}=="disk", ENV{ID_WWN_WITH_EXTENSION}=="?*", SYMLINK+="disk/by-id/wwn-$env{ID_WWN_WITH_EXTENSION}"
ENV{DEVTYPE}=="partition", ENV{ID_WWN_WITH_EXTENSION}=="?*", SYMLINK+="disk/by-id/wwn-$env{ID_WWN_WITH_EXTENSION}-part%n"

# by-partlabel/by-partuuid links (partition metadata)
ENV{ID_PART_ENTRY_SCHEME}=="gpt", ENV{ID_PART_ENTRY_UUID}=="?*", SYMLINK+="disk/by-partuuid/$env{ID_PART_ENTRY_UUID}"
ENV{ID_PART_ENTRY_SCHEME}=="gpt", ENV{ID_PART_ENTRY_NAME}=="?*", SYMLINK+="disk/by-partlabel/$env{ID_PART_ENTRY_NAME}"

# add symlink to GPT root disk
ENV{ID_PART_ENTRY_SCHEME}=="gpt", ENV{ID_PART_GPT_AUTO_ROOT}=="1", SYMLINK+="gpt-auto-root"

LABEL="persistent_storage_end"
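
The two rules that matter for this problem are the blkid import and the by-uuid line near the end of the file. As a rough model of that by-uuid condition, the check can be written out as below. would_get_uuid_link is a made-up name, and the input is assumed to be the KEY=value list that udevadm info --query=property prints.

```shell
# Read KEY=value lines on stdin and exit 0 if they satisfy the by-uuid
# rule: ID_FS_USAGE is filesystem, other or crypto, and ID_FS_UUID_ENC
# is non-empty. Exit 1 otherwise.
would_get_uuid_link() {
    awk -F= '
        $1 == "ID_FS_USAGE"    { usage = $2 }
        $1 == "ID_FS_UUID_ENC" { uuid = $2 }
        END {
            ok = (usage == "filesystem" || usage == "other" || usage == "crypto") && uuid != ""
            exit(ok ? 0 : 1)
        }'
}
```

A possible use, as root: udevadm info --query=property --name=/dev/sda1 | would_get_uuid_link && echo "link expected". If the blkid import yields no ID_FS_* properties at all, the check fails and no by-uuid link is created, which would match the behaviour described in this thread.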

First, check the drive with smartctl.

Second, run fsck against that partition.

I agree, I have no idea why a single partition’s UUID is not being seen, unless the block it resides on is corrupted. The UUID would live in the first block of the partition.

smartctl shows some “Pre-fail” attributes; any idea what those mean? I ran xfs_repair instead of fsck since it’s an XFS partition, and it looks OK, I guess.

dporobic@linux:~> sudo smartctl -H /dev/sda   
smartctl 6.2 2013-11-07 r3856 [x86_64-linux-4.4.36-8-default] (SUSE RPM)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

dporobic@linux:~> 
dporobic@linux:~> sudo smartctl --attributes --log=selftest /dev/sda
smartctl 6.2 2013-11-07 r3856 [x86_64-linux-4.4.36-8-default] (SUSE RPM)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   111   099   006    Pre-fail  Always       -       40813183
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   096   096   020    Old_age   Always       -       4143
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   083   060   030    Pre-fail  Always       -       208655727
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       8760
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   097   097   020    Old_age   Always       -       3807
183 Runtime_Bad_Block       0x0032   008   008   000    Old_age   Always       -       92
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       1
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   073   055   045    Old_age   Always       -       27 (Min/Max 22/28)
194 Temperature_Celsius     0x0022   027   045   000    Old_age   Always       -       27 (0 11 0 0 0)
195 Hardware_ECC_Recovered  0x001a   025   020   000    Old_age   Always       -       40813183
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       257405980002648
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       2842830018
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       1269687149

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      8760         -
# 2  Extended offline    Completed without error       00%      1522         -
# 3  Short offline       Completed without error       00%         0         -


dporobic@linux:~> sudo xfs_repair /dev/sda1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
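
On the “Pre-fail” question: in the smartctl table, Pre-fail in the TYPE column is the category of the attribute, not its current state. An attribute only counts as failing when its normalized VALUE drops to or below THRESH, and the WHEN_FAILED column then stops showing "-". A small filter for that check could look like this (failing_attrs is a made-up name; it reads the output of smartctl --attributes on stdin):

```shell
# Print every SMART attribute whose normalized VALUE is at or below THRESH.
# Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
# Attributes with THRESH 000 are skipped because they can never trip.
failing_attrs() {
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^0x/ && $6 + 0 > 0 && $4 + 0 <= $6 + 0 {
        print $2, "value=" $4, "thresh=" $6, "(" $7 ")"
    }'
}
```

Usage: sudo smartctl --attributes /dev/sda | failing_attrs. On the table posted above it prints nothing, since every VALUE is above its THRESH. The low normalized value of Runtime_Bad_Block (008, raw 92) may still be worth watching, although its meaning is vendor specific.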