Problem with disks order after snapshot 20230921

Yes, I used the scsi-3600… ID. I hope there won’t be any more changes.

Regards

I’m pretty sure the scsi-3600 ID won’t go away in this century. Which one were you using before?

Before, I used ‘scsi-0HP_LOGICAL_VOLUME_00000000-part1’ because I have an HP ProLiant with a hardware RAID unit, so it seemed like a good choice.

I see. But imagine you had several such controllers in your system. I reckon it’d be possible that two of them use HP_LOGICAL_VOLUME_00000000 as the “vendor specific” ID of their first RAID array. At least we, as a Linux distro, can hardly be sure that this can’t happen. The 3600- ID is required to be unique though.
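
To find out which scsi-3… link belongs to a given disk, one can walk the by-id directory and compare link targets. This is only a sketch: the helper name wwid_links is made up for this example, and it assumes the usual layout in which each link under /dev/disk/by-id points at ../../sdX.

```shell
# Print the "scsi-3..." by-id link names that point at a kernel device name.
# Usage: wwid_links sda [by-id-dir]   (the directory argument exists for testing)
wwid_links() {
    dev=$1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/scsi-3*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -L "$link" ] || continue
        target=$(readlink "$link")
        # Links look like ../../sda; compare the last path component only.
        if [ "${target##*/}" = "$dev" ]; then
            printf '%s\n' "${link##*/}"
        fi
    done
}
```

Running “wwid_links sda” on the affected machine should print the name to put in /etc/fstab, prefixed with /dev/disk/by-id/.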

Have you tried the boot parameter “udev.scsi_symlink_src=V”? It should give you back the lost ID.
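
After rebooting with such a parameter, it is worth confirming that it really reached the kernel. A minimal sketch that looks a parameter up in /proc/cmdline; the helper name cmdline_value is an assumption made for this example, not an existing tool.

```shell
# Print the value of a NAME=VALUE kernel command-line parameter.
# Exits non-zero if the parameter is absent.
# Usage: cmdline_value NAME [cmdline-file]
cmdline_value() (
    set -f                      # no globbing while splitting the command line
    name=$1
    file=${2:-/proc/cmdline}
    for word in $(cat "$file"); do
        case $word in
            "$name"=*) printf '%s\n' "${word#*=}"; exit 0 ;;
        esac
    done
    exit 1
)
```

For the parameter above, “cmdline_value udev.scsi_symlink_src” should print V once the system has been booted with it.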

What is the model name of both the computer and the raid?

It is an HP ProLiant ML350p Gen8 with a Smart Array P420i (RAID 5).

@All, an experimental suse-module-tools package with the workaround above is available in my home repo on OBS. Installing this should have the same effect as steps 1.-3., 6. in bsc#1216070, comment 26 (so back up your initrd before installing it!). Currently available for TW only.

I would be glad if people who were affected by the 6.5.4 kernel change (making scsi_mod and sd_mod loadable modules) could give this package a try.

The expected effect is that the ordering of sd devices stabilizes again. Testers welcome (please undo any manual workarounds that you may have applied so far). On systems with only NVMe hardware (e.g. laptops with M.2 SSDs) where multipath-tools is not installed, an additional expected effect is that the system comes up without any SCSI drivers loaded (check /proc/modules). If you aren’t using dm-multipath, you can simply uninstall multipath-tools with “zypper rm multipath-tools”. If a USB stick or some other SCSI device is added to such a system, SCSI modules should be loaded on demand, and the stick should be mounted as usual.
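
The /proc/modules check can be scripted. A small sketch, assuming only the three module names mentioned in this thread (scsi_mod, sd_mod, sg) are of interest; the helper name loaded_scsi_modules is made up for this example.

```shell
# Print which of scsi_mod, sd_mod and sg are currently loaded.
# Usage: loaded_scsi_modules [modules-file]   (defaults to /proc/modules)
loaded_scsi_modules() {
    file=${1:-/proc/modules}
    # The first field of each /proc/modules line is the module name.
    while read -r name _rest; do
        case $name in
            scsi_mod|sd_mod|sg) printf '%s\n' "$name" ;;
        esac
    done < "$file"
}
```

On an NVMe-only system without multipath-tools, empty output would match the expected effect described above; after plugging in a USB stick the modules should show up.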

Thanks. It works for me. I’ve added comment 35 describing what I did to test it.

[quote=“mwilck, post:74, topic:169324, full:true”]
@All, an experimental suse-module-tools package with the workaround above is available in my home repo on OBS. Installing this should have the same effect as steps 1.-3., 6. in bsc#1216070, comment 26 (so back up your initrd before installing it!). Currently available for TW only. [/quote]

I installed the two RPMs and I see that the short names of the disks (sda, sdb, sdc) remain constant across multiple reboots.
Regards

The fix for bug 1216070 has been released in snapshot 20231031:

==== suse-module-tools ====
Version update (16.0.37 -> 16.0.38)
Subpackages: suse-module-tools-scriptlets

- Update to version 16.0.38:
  * modprobe.d: use softdep to load sd_mod and sg (boo#1216070)

This fix reflects well on the distro. After the initial won’t-fix, feedback was listened to. Some thought was applied. It turns out there is a way to mitigate the problem that has additional benefits. We can have our cake and eat it too. Thank you @mwilck for running with the ball on this one.

Thanks :)

Giving back the compliment, this wouldn’t have been possible without your and @phil524’s testing and feedback efforts.

Any chance this fix finds its way to SLES15 SP5? We’ve been fighting this issue for over a year, ever since kernel 5.2 I believe. We build installations on 3 logical drives using an AutoYaST unattended install. The 3 logical drives are all RAID arrays of different sizes. They absolutely must be discovered in order for the installation to be correct. Additionally, the order must remain the same on every subsequent boot. Whether we mount by disk ID, UUID, or device name makes no difference, although I do use device names and can force the order to be correct every time.

This has been accomplished with a kernel boot parameter:

scsi_mod.disable_async_probing=[drivername]
where drivername is mptsas for a VMware VM, smartpqi for an HP Smart Array controller, and megaraid_sas for HP MR (Broadcom) RAID controllers.
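
The parameter also appears later in this thread with a comma-separated list (smartpqi,megaraid_sas), so one string can cover several controllers. A sketch of building it from driver names; the helper name async_probe_param is made up for this example, and whether listing drivers that are not present on a machine is harmless is an assumption that has not been verified here.

```shell
# Build the scsi_mod.disable_async_probing=... boot parameter from driver names.
# Usage: async_probe_param DRIVER [DRIVER...]
async_probe_param() {
    [ "$#" -gt 0 ] || return 1
    list=
    for drv in "$@"; do
        # Join with commas, without a leading comma on the first entry.
        list=${list:+$list,}$drv
    done
    printf 'scsi_mod.disable_async_probing=%s\n' "$list"
}
```

For example, “async_probe_param mptsas smartpqi megaraid_sas” prints a single parameter naming all three drivers mentioned above.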

The OS must first be installed just to identify the driver with “lsscsi -H”, then reinstalled, this time manually entering the kernel boot parameter, including the driver name, and the path to the AutoYaST file. The unattended-install XML files contain the appropriate kernel boot parameter, so it is installed and subsequent reboots always work.
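
The driver lookup can be reduced to a one-liner. A sketch, assuming “lsscsi -H” prints one line per host with the host number in brackets followed by the driver name (for example “[0]    smartpqi”); the helper name scsi_host_drivers is made up for this example.

```shell
# Read "lsscsi -H" output on stdin and print the distinct driver names.
# Usage: lsscsi -H | scsi_host_drivers
scsi_host_drivers() {
    # The second whitespace-separated field is taken to be the driver name.
    awk 'NF >= 2 { print $2 }' | sort -u
}
```

The names it prints are the candidates for the drivername part of the boot parameter.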

I’ve been looking for a common kernel boot parameter that will always work for any driver:
scsi_mod.scan=sync doesn’t work
scsi_mod.async_probe=0 doesn’t work

So for now, I have different XML files for VM installs, HP SR controllers, and HP MR controllers.
But we are unable to support installs using our XML files on arbitrary hardware because we can’t know which driver will be used.

Were the suse-module-tools RPMs mentioned in this thread to be installed on SLES15 SP5, this might resolve the issue for subsequent reboots. But what about initial installations? How do you install any distribution on 3 logical drives and be sure that they will be discovered in the order in which you created them?

This is also an issue for Disaster Recovery software as they create their own boot media and MUST discover the 3 logical drives in the same order or recovery does not work.

I realize this is not a SLES forum so ignore my questions if necessary.

In this case the BIOS or VM controls the order in which drives are discovered; the kernel has nothing to do with it. Partitions are identified by the OS via UUIDs or labels.

@sdrake You could also try posting here: SUSE and Rancher Forum or might be time to raise a Support Request via SUSE Customer Center…

Unfortunately, that is not true with the HPE ProLiant Gen11 (ML350, DL385). For months I have been testing 1-, 2-, and 3-logical-drive configurations of this hardware: an e208 RAID controller to an MSA2060, an MR408i-o to internal SAS drives, a Broadcom 9580-89i8e to internal SAS drives, and an HPE SR932i-p to internal SAS drives. In all cases, the assignment of sda, sdb, and sdc by the SLES15 SP4 and SP5 installers is random UNLESS I specify scsi_mod.disable_async_probing=smartpqi,megaraid_sas when I boot the SLES media. The SLES installer does not let the user change the sda, sdb, and sdc assignments, so you must reboot and use the boot parameter above. When you get it right, the installer will assign logical drive 1 to sda, logical drive 2 to sdb, and logical drive 3 to sdc. You can mount by anything you want (UUID, device name, by-id, etc.); it doesn’t matter. With the MR controllers, the logical drive numbers start at 239 and decrement, so additional hacks are required to make this work. Eventually some major SLES customer will run into this and demand it be fixed. Meanwhile I just post everywhere and wait.

This was not a problem with SLES11, SLES12, and SLES15 SP4 when it was first released. It started happening with SLES12 SP5 (on HP Gen10 hardware) and SLES15 SP4 (on Gen11 hardware) after some patch update to these service packs. It continues with SLES15 SP5.

@sdrake So why not raise a Support Request, be ‘that’ customer?

I don’t have a SLES support agreement. Also, through kernel boot parameters, I have a workaround. I’m just looking to make it universal so our AutoYaST XML files will work on any hardware. If @gogalthorp is right, and the SLES installer is not responsible for logical drive/SCSI device assignment, then this becomes an HP UEFI/BIOS problem, which I could never get past HPE level 1 support with.

@sdrake Then you probably need to wait until the next Update ISO appears and check again…