ESP partition ignored after clone SATA M.2 to NVME M.2

# blkid /dev/nvme0n1p1 ; blkid /dev/nvme0n1p7
/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="PI3P01ESP" LABEL="PI3P01ESP" UUID="20A0-1003" TYPE="vfat" PARTLABEL="MuP P01 EFI System  (ESP)" PARTUUID="5e15361e-57dc-4df5-a1ad-9e392a4c1de4"
/dev/nvme0n1p7: LABEL="k25p07stw" UUID="d3996ada-30fb-42d2-a8a2-4d5a32e2aef4" TYPE="ext4" PARTLABEL="MuP P07 openSUSE Tumbleweed" PARTUUID="5e153684-7fa2-4df5-a78c-ed4f2e7a2e96"
# efibootmgr
BootCurrent: 0006
Timeout: 1 seconds
BootOrder: 0005,0003,0004,0006,0000
Boot0000* opensusetw
Boot0003* Hard Drive
Boot0004* CD/DVD Drive
Boot0005* opensuse
Boot0006* UEFI: Generic Flash Disk 8.07
# efibootmgr -v
BootCurrent: 0006
Timeout: 1 seconds
BootOrder: 0005,0003,0004,0006,0000
Boot0000* opensusetw	VenHw(99e275e7-75a0-4b37-a2e6-c5385e6c00cb)
Boot0003* Hard Drive	BBS(HD,,0x0)..GO..NO.........G.e.n.e.r.i.c.-.S.D./.M.M.C. .1...0.0....................A..........................Gd-.;.A..MQ..L.0.5.8.F.6.3.6.4.6.4.7.6........BO..NO........`.G.e.n.e.r.i.c.-.C.o.m.p.a.c.t. .F.l.a.s.h. .1...0.1....................A...............................Gd-.;.A..MQ..L.0.5.8.F.6.3.6.4.6.4.7.6........BO..NO........`.G.e.n.e.r.i.c.-.S.M./.x.D.-.P.i.c.t.u.r.e. .1...0.2....................A...............................Gd-.;.A..MQ..L.0.5.8.F.6.3.6.4.6.4.7.6........BO..NO........`.G.e.n.e.r.i.c.-.M.S./.M.S.-.P.r.o. .1...0.3....................A...............................Gd-.;.A..MQ..L.0.5.8.F.6.3.6.4.6.4.7.6........BO..NO........s.M.K.N.S.S.D.P.L.1.2.0.G.B.-.D.8....................A.......................................6..Gd-.;.A..MQ..L.M.K.N.S.S.D.P.L.1.2.0.G.B.-.D.8........BO..NO........o.S.T.1.0.0.0.D.M.0.0.3.-.1.C.H.1.6.2....................A...........................>..Gd-.;.A..MQ..L. . . . . . . . . . . . .1.S.F.D.2.A.D.T........BO..NO........o.S.T.1.0.0.0.D.M.0.0.3.-.1.C.H.1.6.2....................A...........................>..Gd-.;.A..MQ..L. . . . . . . . . . . . .1.Z.C.D.Z.J.N.8........BO..NO........S.G.e.n.e.r.i.c. .F.l.a.s.h. .D.i.s.k. .8...0.7....................A.......................&..Gd-.;.A..MQ..L.4.E.F.0.4.1.9.8........BO
Boot0004* CD/DVD Drive	BBS(CDROM,,0x0)..GO..NO........o.P.L.E.X.T.O.R. .P.X.-.8.9.1.S.A.F....................A...........................>..Gd-.;.A..MQ..L.5.3.4.2.7.7. .2.M.2.7.8.9.2.0.5.0.1.3.5........BO
Boot0005* opensuse	HD(1,GPT,5e15361e-57dc-4df5-a1ad-9e392a4c1de4,0x800,0xa0000)/File(\EFI\OPENSUSE\GRUBX64.EFI)..BO
Boot0006* UEFI: Generic Flash Disk 8.07	PciRoot(0x0)/Pci(0x14,0x0)/USB(7,0)/CDROM(1,0xac8,0x7a10)..BO

Above is from the problem Gigabyte Kaby Lake PC. In order to migrate the boot device from a GPT 256G M.2 SATA stick to a GPT 120G M.2 NVMe stick, I did a three-way swap, as I wanted to preserve the content of both M.2 devices. Only about the first 40% of the 256G was in use, so I cloned the 256G to a 160G SATA HD, then cloned from the 120G to the 256G, then from the 160G to the 120G.

Having done this, the secondary partition tables on the smaller devices disappeared, while that of the larger became invalid. Consequently, fdisk, parted and apparently the kernel don’t see any partitions. Grub does, and so does my partitioner (DFSee), and so, somehow, do the disk drivers. I was able to boot into Grub, select TW on nvme0n1p7, and start to boot, but init failed when it came time to mount nvme0n1p7 on /. I had forgotten a crucial prerequisite to the changeover: rebuilding the initrds with NVMe support.

So, I wiped the 120G, recreated the partition structure on it, then cloned partition by partition from the 160G. I’m guessing the resulting PARTUUID changes are what keeps the UEFI firmware from operating as required. The ESP on the 120G is no longer recognized by the BIOS. I tried changing the boot order to start with opensusetw using efibootmgr -o, but it doesn’t stick, as you can see from Boot0005 being listed first.

The following are the UUIDs from the 160G, carryovers from the 256G to 160G cloning:

Partition Label  : K25P01ESP     Uuid : {5b2ae91e-e722-4df5-8b9f-93e611fc27ba} # old sdc1
Partition Label  : k25p07stw     Uuid : {da6a99d3-fb30-d242-a8a2-4d5a32e2aef4} # old sdc7

Below is from my Asus Kaby Lake PC, which is configured very little differently from the Gigabyte, exactly the same layout through the first 12 partitions. Not too many weeks ago I successfully migrated it from standard SATA SSD to NVME, but used no off-sized intermediary steps.

# blkid /dev/nvme0n1p1 ; blkid /dev/nvme0n1p7
/dev/nvme0n1p1: SEC_TYPE="msdos" LABEL_FATBOOT="SX6P01ESP" LABEL="SX6P01ESP" UUID="20A0-2A08" TYPE="vfat" PARTLABEL="SX6P01 EFI System (ESP)" PARTUUID="5b331d7f-9488-4df5-9eed-c7250696b833"
/dev/nvme0n1p7: LABEL="sx6p07stw" UUID="d99a8eac-df16-4cc3-8aa2-a57e5dd3008a" TYPE="ext4" PARTLABEL="SX6P07 openSUSE Tumbleweed" PARTUUID="5b331e4b-e409-4df5-a409-456d2fca2f1f"
# efibootmgr
BootCurrent: 0000
Timeout: 1 seconds
BootOrder: 0000,0004,0005,0006
Boot0000* opensusetw
Boot0004* UEFI OS
Boot0005* Hard Drive
Boot0006* CD/DVD Drive
# efibootmgr -v
BootCurrent: 0000
Timeout: 1 seconds
BootOrder: 0000,0004,0005,0006
Boot0000* opensusetw	HD(1,GPT,5b331d7f-9488-4df5-9eed-c7250696b833,0x800,0xa0000)/File(\EFI\OPENSUSETW\GRUBX64.EFI)
Boot0004* UEFI OS	HD(1,GPT,5b331d7f-9488-4df5-9eed-c7250696b833,0x800,0xa0000)/File(\EFI\BOOT\BOOTX64.EFI)..BO
Boot0005* Hard Drive	BBS(HD,,0x0)..GO..NO........k.A.D.A.T.A. .S.X.6.0.0.0.L.N.P....................A..........................................Gd-.;.A..MQ..L.2.J.4.5.2.0.1.7.7.5.8.1........BO
Boot0006* CD/DVD Drive	BBS(CDROM,,0x0)..GO..NO........u.O.p.t.i.a.r.c. .D.V.D. .R.W. .A.D.-.7.2.0.0.S....................A...........................D..Gd-.;.A..MQ..L.O.p.t.i.a.r.c. .D.V.D. .R.W. .A.D.-.7.2.0.0.S........BO

I see quite a bit of difference between the Boot0000s on the two PCs. It seems obvious that at least part of the trouble is that the Boot0000 entry on the Gigabyte is now invalid. I’m hoping that is the entirety of the problem, and that the next step forward would be to delete Boot0000 and create a new one.
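One way to check that assessment before deleting anything is to compare the ESP's PARTUUID from blkid with the GUID inside each HD(...) node that efibootmgr -v prints. What follows is a minimal POSIX shell sketch, not a definitive tool: the function name entry_guid_status is hypothetical, and the sample lines are copied from the Gigabyte listing above with the long BBS entries left out. In that listing Boot0005 carries the current PARTUUID, while Boot0000 is a VenHw(...) path with no HD node at all, which may be why the firmware does not treat it like the Asus entry.

```sh
#!/bin/sh
# Classify each BootXXXX line of `efibootmgr -v` output read from stdin.
# $1 is the PARTUUID to look for. entry_guid_status is a hypothetical name.
entry_guid_status() {
    want=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    while IFS= read -r line; do
        case $line in
            Boot[0-9A-F][0-9A-F][0-9A-F][0-9A-F]*) ;;   # BootXXXX entries only
            *) continue ;;
        esac
        num=$(printf '%s\n' "$line" | cut -c1-8)
        # Pull the partition GUID out of an HD(n,GPT,<guid>,start,size) node.
        guid=$(printf '%s\n' "$line" |
            sed -n 's/.*HD([0-9]*,GPT,\([0-9A-Fa-f-]*\),.*/\1/p' |
            tr 'A-F' 'a-f')
        if [ -z "$guid" ]; then
            echo "$num no-HD-node"
        elif [ "$guid" = "$want" ]; then
            echo "$num match"
        else
            echo "$num mismatch"
        fi
    done
}

# Sample data from the Gigabyte listing. On a live system, as root:
#   efibootmgr -v | entry_guid_status "$(blkid -s PARTUUID -o value /dev/nvme0n1p1)"
entry_guid_status 5e15361e-57dc-4df5-a1ad-9e392a4c1de4 <<'EOF'
BootCurrent: 0006
Boot0000* opensusetw VenHw(99e275e7-75a0-4b37-a2e6-c5385e6c00cb)
Boot0005* opensuse HD(1,GPT,5e15361e-57dc-4df5-a1ad-9e392a4c1de4,0x800,0xa0000)/File(\EFI\OPENSUSE\GRUBX64.EFI)..BO
EOF
```

An entry reported as mismatch or no-HD-node is a candidate for deleting and recreating; an entry reported as match at least points at the right partition.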

Any thoughts on my assessment? Anything I’ve missed that I should have or could have done, or something else I should do first? Is there some likelihood that the UEFI BIOS is reacting badly because the ESP was partitioned as FAT16, so I should recreate the ESP as FAT32? I’m asking these few questions out of fear of making the situation worse instead of better, and blowing away a lot more time.

Also I’m trying to do this without any help from YaST, so that I can better understand how UEFI works, and how to fix it if it breaks without YaST being available…

Hi
That would be my assumption too, clear out the nvram and redo the efi entries.

But can you test whether your BIOS allows booting from an EFI file? Or you should be able to add an EFI entry from the BIOS by browsing and pointing at the relevant files.

I’m not sure I’ve seen any way to delete any boot selection(s) from within BIOS. It seems to remember every bootable disk and bootable stick I’ve ever had attached when I use F12 to select a boot device.

“But, you can test if your BIOS allows booting from an efi file?”
I don’t recall hearing about that possibility before either.

“Or you should be able to add a efi entry from the BIOS by browsing and pointing at the relevant files.”
Browse how?

Hi
In the BIOS, you can select an EFI file. You can on HP systems, and, from memory, on an ASUS you could select to add an entry and it opens a file explorer to navigate to the file you want to use on the EFI partition. It may even have a built-in EFI shell. If you’re running with secure boot disabled, look at adding the EDK EFI shell, and you can get into the bowels of the NVRAM. Some systems allow customizing boot screens etc. (HP ProBook, for example).

When using UEFI, you have a “secure boot” option.
If that is enabled, I’d expect that works only on the original hardware and can’t be migrated/cloned to another system board.

Otherwise,
if the disk and partition UUIDs can be preserved, I’d expect that the cloned disk should “just work” when installed in new hardware.

(Untested),
TSU

This is probably why my previous migration from SSD to NVME succeeded and I didn’t remember much about it when starting this one. At least one of the installed systems did have NVME support in its kernel, so I booted that and followed up with chroots to rebuild initrds on those that would not boot. Also, I wasn’t migrating to a target half the size of the original.
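The chroot step mentioned above can be sketched as a dry run that only prints the commands, so they can be reviewed before anything is mounted. This is a sketch under assumptions: initrd_rebuild_plan is a hypothetical helper name, /mnt is an assumed mount point, and mkinitrd is simply the command this thread used on Tumbleweed.

```sh
#!/bin/sh
# Print (do not run) the commands for rebuilding initrds inside a chroot.
# $1 = root partition of the system that will not boot, $2 = mount point.
# initrd_rebuild_plan is a hypothetical helper name.
initrd_rebuild_plan() {
    root_dev=$1
    mnt=${2:-/mnt}
    echo "mount $root_dev $mnt"
    # The chroot needs the live kernel's device, process and sysfs views.
    for fs in dev proc sys; do
        echo "mount --bind /$fs $mnt/$fs"
    done
    echo "chroot $mnt mkinitrd"
    echo "umount -R $mnt"
}

# Review the plan for the Tumbleweed root from this thread.
initrd_rebuild_plan /dev/nvme0n1p7
```

If the rebuilt initrd still lacks the NVMe driver, dracut's --add-drivers option is one way to request it explicitly; whether that is needed depends on the installation.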

If the clone went to a disk of a different size, run gdisk, write, and quit to fix the tail header location. You should see a visible warning from gdisk about the tail header location. Sadly, GPT disks have become less resilient (with respect to cloning) than MBR despite having 2 redundant headers. I’ve brought this up with the parted owner, and it was seemingly vetoed (who needs the best shot at data recovery, who cares, right?!). gdisk works, Windows tools work. Maybe the fdisk owner will fix fdisk, but I haven’t filed a bug yet.
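The reason for the warning is simple arithmetic: GPT keeps its backup header in the last sector of the disk, and the primary header at LBA 1 records that location in its AlternateLBA field (byte offset 32 of the header). After a sector-wise clone between disks of different sizes the recorded value still describes the source disk. A minimal sketch of the check, assuming 512-byte logical sectors; backup_header_status is a hypothetical name and the sector counts are illustrative, not taken from this thread.

```sh
#!/bin/sh
# $1 = total sectors of the target disk
# $2 = AlternateLBA recorded in the primary GPT header
# backup_header_status is a hypothetical helper name.
backup_header_status() {
    expected=$(( $1 - 1 ))          # the backup header belongs in the last LBA
    if [ "$2" -eq "$expected" ]; then
        echo "ok"
    elif [ "$2" -gt "$expected" ]; then
        echo "beyond-end"           # table was cloned from a larger disk
    else
        echo "not-at-end"           # table was cloned from a smaller disk
    fi
}

# On a live system (root; 512-byte sectors and a little-endian host assumed):
#   total=$(blockdev --getsz /dev/nvme0n1)
#   alt=$(dd if=/dev/nvme0n1 bs=1 skip=544 count=8 2>/dev/null | od -An -tu8 | tr -d ' ')
#   backup_header_status "$total" "$alt"

# Illustrative numbers only: a 160G-class table on a 120G-class target.
backup_header_status 234441648 312581807
```

Anything other than ok means the backup structures need rewriting, which is what the gdisk write step does.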

This is why if you clone a 160G disk to 120G then run the openSUSE installer, it will overwrite your precious data.

# efibootmgr -b 0 -B
# efibootmgr -b 0 -c -d /dev/nvme0n1 -L opensusetw -l /efi/opensusetw/grubx64.efi -p1 -v

output:

BootCurrent: 0006
Timeout: 1 seconds
BootOrder: 0000,0005,0003,0004,0006
Boot0003* Hard Drive    BBS(HD,,0x0)blahblahblah...;.A..MQ..L.4.E.F.0.4.1.9.8........BO
Boot0004* CD/DVD Drive BBS(CDROM,,0x0)..GO..NO...o.P.L.E.X.T.O.R. .P.X.-.8.9.1.S.A.F....................A...........................>..Gd-.;.A..MQ..L.5.3.4.2.7.7. .2.M.2.7.8.9.2.0.5.0.1.3.5........BO
Boot0005* opensuse      HD(1,GPT,5e15361e-57dc-4df5-a1ad-9e392a4c1de4,0x800,0xa0000)/File(\EFI\OPENSUSE\GRUBX64.EFI)..BO
Boot0006* UEFI: Generic Flash Disk 8.07 PciRoot(0x0)/Pci(0x14,0x0)/USB(7,0)/CDROM(1,0xac8,0x7a10)..BO
Boot0000* opensusetw    HD(1,GPT,5e15361e-57dc-4df5-a1ad-9e392a4c1de4,0x800,0xa0000)/File(\efi\opensusetw\grubx64.efi)

Grub works. TW again hung at switchroot. Buster booted normally. Chroot from Buster to TW to run mkinitrd fixed TW. BIOS boot order matches efibootmgr output. :smiley:
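For reference, in the two commands above -b 0 -B deletes entry 0000, and -c with -d, -p, -L and -l creates a new entry for the given disk, partition number, label and loader path. The HD(...) node the firmware stores can be cross-checked against the partition itself: it holds the partition number, the PARTUUID, and the start and length in sectors as hex. A small sketch, with hd_node as a hypothetical helper name and 512-byte logical sectors assumed:

```sh
#!/bin/sh
# Build the HD() device-path node expected for a GPT partition.
# $1 = partition number, $2 = PARTUUID, $3 = start sector, $4 = sector count
# hd_node is a hypothetical helper name.
hd_node() {
    printf 'HD(%d,GPT,%s,0x%x,0x%x)\n' "$1" "$2" "$3" "$4"
}

# On a live system the inputs can come from sysfs and blkid:
#   p=nvme0n1p1
#   hd_node "$(cat /sys/class/block/$p/partition)" \
#           "$(blkid -s PARTUUID -o value /dev/$p)" \
#           "$(cat /sys/class/block/$p/start)" \
#           "$(cat /sys/class/block/$p/size)"

# Values for the ESP in this thread: start 2048 (0x800), 655360 sectors (0xa0000).
hd_node 1 5e15361e-57dc-4df5-a1ad-9e392a4c1de4 2048 655360

# The same length expressed in MiB: 0xa0000 sectors * 512 bytes.
echo "ESP size: $(( 0xa0000 * 512 / 1048576 )) MiB"
```

If the printed node differs from what efibootmgr -v shows for an entry, that entry points at a partition that no longer exists in that form.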

Thanks for the feedback all!

The message was

Error: Can't have a partition table outside the disk!

after running partprobe, but before rebooting. After reboot, results are as expected. :slight_smile:

Okay, that’s a different (and additional) problem. The first is that after cloning a bigger disk to a smaller disk, the tail GPT header is misplaced (which Windows or gdisk will fix, but not fdisk/parted/cfdisk/gparted). But when you clone, you also need to make sure the partition data isn’t lost or trailing off the end of the disk. Nothing is going to recover missing partition data, or a partition that extends past the end of a disk.
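That second condition can be checked before cloning by comparing each source partition's end sector with the last usable sector of the target. A minimal sketch, assuming 512-byte logical sectors and the standard 128-entry GPT (the backup table and header take the last 33 sectors, so the last usable LBA is the total minus 34); parts_past_end is a hypothetical name and the sample numbers are illustrative.

```sh
#!/bin/sh
# Read "number start end" lines (sectors) on stdin and print the numbers of
# partitions that would not fit on a target disk of $1 total sectors.
# parts_past_end is a hypothetical helper name.
parts_past_end() {
    last_usable=$(( $1 - 34 ))      # room for backup entries plus backup header
    while read -r num start end; do
        [ -n "$num" ] || continue   # skip blank lines
        if [ "$end" -gt "$last_usable" ]; then
            echo "$num"
        fi
    done
}

# On a live system, feed it the SOURCE disk's table and the TARGET disk's size:
#   partx -g -o NR,START,END /dev/sdc | parts_past_end "$(blockdev --getsz /dev/nvme0n1)"

# Illustrative table checked against a 120G-class target.
parts_past_end 234441648 <<'EOF'
1 2048 657407
7 100000000 250000000
EOF
```

Any partition number it prints has to be shrunk or moved on the source before a sector-wise clone can preserve it.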

For instance, in my case, my target was a 500G disk. But, I did my testing on a 2TB device, where I had only allocated 128GB of partitions. After testing, I cloned the result to the 500G with the intent to expand the partitions. But surprise, fdisk and parted and TW installer see a blank disk, despite sector-wise mirroring. This used to work fine with MBR, but no longer with GPT. You have to manually rewrite the GPT headers with gdisk before continuing, otherwise the TW installer will overwrite the contents of the disk.