Booting problems

Using XFCE. After today’s update I get multiple repeated errors on booting, and my /dev/sda1 does not mount. Smartctl indicated both drives sda and sdb are fine.

Jan 26 17:18:17 localhost.localdomain kernel: I/O error, dev sda, sector 1237321728 op 0x0:(READ) flag>
Jan 26 17:18:17 localhost.localdomain kernel: sd 4:0:0:0: [sda] tag#6 FAILED Result: hostbyte=DID_OK d>
Jan 26 17:18:17 localhost.localdomain kernel: sd 4:0:0:0: [sda] tag#6 Sense Key : Illegal Request [cur>
Jan 26 17:18:17 localhost.localdomain kernel: sd 4:0:0:0: [sda] tag#6 Add. Sense: Unaligned write comm>
Jan 26 17:18:17 localhost.localdomain kernel: sd 4:0:0:0: [sda] tag#6 CDB: Read(10) 28 00 32 80 08 00 
Jan 26 17:18:17 localhost.localdomain kernel: ata5.00: exception Emask 0x50 SAct 0xffffffff SErr 0x308>
Jan 26 17:18:17 localhost.localdomain kernel: ata5.00: irq_stat 0x00400000, PHY RDY changed
Jan 26 17:18:17 localhost.localdomain kernel: ata5: SError: { RecovComm HostInt PHYRdyChg PHYInt }
Jan 26 17:18:17 localhost.localdomain kernel: ata5.00: failed command: READ FPDMA QUEUED
Jan 26 17:18:17 localhost.localdomain kernel: ata5.00: cmd 60/08:00:00:08:00/00:00:5e:00:00/40 tag 0 n>
                                                       res 40/00:68:00:08:40/00:00:61:00:00/40 Emask 0>
Jan 26 17:18:17 localhost.localdomain kernel: ata5.00: status: { DRDY }

Jan 26 17:19:00 localhost.localdomain kernel: ata5: SATA link down (SStatus 0 SControl 300)
Jan 26 17:19:03 localhost.localdomain su[2066]: The gnome keyring socket is not owned with the same cr>
Jan 26 17:19:03 localhost.localdomain su[2066]: gkr-pam: couldn't unlock the login keyring.

What is my next move?

You can try booting from a rollback snapshot selected in the GRUB boot menu. Test it to see how everything works, as it is only a read-only snapshot. If the system runs well from that snapshot, you can make it the one the machine boots from by running, as root: snapper rollback -d "Your-description-of-rollback"

That might work if I were using snapshots, which I am not.

If this install is more than two months old, you should have 1 or 2 prior kernels to try booting with. How do they do?

Post complete, not truncated, log lines.
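
The trailing > on the lines above is most likely the journalctl pager cutting each line at the terminal width. A minimal sketch for getting complete lines, assuming a systemd journal: pass --no-pager and filter for the affected device. The function name filter_disk_errors is made up for illustration, and the patterns ata5 and sda are assumptions taken from the excerpt above; adjust them to the port and disk in question.

```shell
#!/bin/sh
# Keep only the kernel messages that mention the ATA port or the disk.
# The patterns are assumptions based on the excerpt above (ata5, sda).
filter_disk_errors() {
    grep -E 'ata5|\[sda\]|dev sda'
}

# Typical use (kernel messages of the current boot, no pager, full lines):
#   journalctl -k -b --no-pager | filter_disk_errors > disk-errors.txt
```

The resulting file holds untruncated lines and is small enough to paste directly.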

As instructed…Jan 27 08:56:11 localhost kernel: Linux version 5.14.21-150500.55.44-default (ge - Pastebin.com

I tried with the previous kernel. Results are the same.

No, I asked you to post complete, non-truncated lines, not the full output of dmesg (although it does now include your previous output as well).

You should very well be aware that this forum prefers https://paste.opensuse.org/

Anyway, this sounds like a hardware issue (the funniest part is the sense data "Unaligned write command" for a READ command). My only idea would be to disable command queueing: check /sys/block/sda/device/queue_depth and write 1 there if the value is anything else.
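
The queue depth check can be sketched as a small shell function. This is only an illustration: the function name queue_depth and the SYSFS_ROOT override are hypothetical, added so the function can be tried against a scratch directory. On a real system the file lives under /sys/block, writing to it needs root, and the value is reset on reboot.

```shell
#!/bin/sh
# Show, and optionally change, the command queue depth of a disk.
# SYSFS_ROOT is a hypothetical override for testing; default is /sys/block.
queue_depth() {
    dev="$1"    # e.g. sda
    new="$2"    # optional new depth; 1 effectively disables queueing
    file="${SYSFS_ROOT:-/sys/block}/$dev/device/queue_depth"
    [ -r "$file" ] || { echo "no queue_depth for $dev" >&2; return 1; }
    echo "current: $(cat "$file")"
    if [ -n "$new" ]; then
        # Needs root on a real system; not persistent across reboots.
        echo "$new" > "$file" || return 1
        echo "now: $(cat "$file")"
    fi
}
```

Running queue_depth sda only prints the current value; queue_depth sda 1 as root also sets it to 1.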

It sounds even more like a hardware issue, and the update may just be a coincidence.

Thanks for the advice. /sys/block/sda/device/queue_depth had 32 as a value. When I changed it to 1 I still got the same error but the drive did mount. Sorry I used pastebin and I will try to remember to use paste.opensuse.org. As for it being a hardware issue, I get no errors booting Debian or MX Linux and Spinrite tells me the drives are fine and the motherboard is hardly a year old. I am going to re-install using another brand new drive. I really don’t like to re-install without understanding what’s causing the problem.

@ionmich It could also be the SATA cable… are your SATA cables the locking type?

Thanks for the suggestion. I’ll change all my SATA cables.

My fstab contains the line
UUID=079a2a57-305c-4b38-a687-689cba6e94cc /home/ion/DATA ext4 defaults 0 3
and that results in the errors. But if I remove it, then boot, and then
mount UUID=079a2a57-305c-4b38-a687-689cba6e94cc /home/ion/DATA manually I get no errors. I find that really strange.

Check man fstab for what that last number means. The parser might have a problem with it that the error handling then botches.

Try replacing the final 3 in the /etc/fstab entry with 0 to disable fsck at boot. Does that change anything on reboot?

Changed cables. No change.

I’ll try that. But it’s been 3 for multiple boots before the problem started. Why should it change?

Did that. Won’t boot. I need a break from all this. Bye for now.

@ionmich when you come back, perhaps running a fsck on the disk may glean more info?

Try changing it to
UUID=079a2a57-305c-4b38-a687-689cba6e94cc /home/ion/DATA ext4 nofail 0 2
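
For context on that suggestion: the sixth fstab field is the fsck pass number, and fstab(5) describes 0 (no check), 1 (root filesystem) and 2 (other filesystems). The nofail option lets the boot continue if the mount fails. A rough sketch for listing these fields, assuming whitespace-separated fstab entries; the function name check_fstab is made up for illustration, and "unusual" only means outside the values the man page describes, not necessarily invalid.

```shell
#!/bin/sh
# Print mount point, options and fsck pass number for each fstab entry,
# and flag pass numbers outside the 0/1/2 values described in fstab(5).
check_fstab() {
    awk '
        /^[ \t]*#/ || /^[ \t]*$/ { next }   # skip comments and blank lines
        {
            pass = (NF >= 6) ? $6 : 0       # a missing sixth field counts as 0
            note = (pass ~ /^[012]$/) ? "" : "  <- unusual pass number"
            printf "%s opts=%s pass=%s%s\n", $2, $4, pass, note
        }
    ' "${1:-/etc/fstab}"
}
```

Called without an argument it reads /etc/fstab; a different file can be given as the first argument.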

I took the coward’s route. I was able to get all my data off the spinning drive, and I reinstalled on a new SSD. Subsequent tests with Spinrite showed errors on the spinning drive, but I don’t think that was the only problem. I had bought several (discounted) power supplies that came with IDE power connectors, and I used adapter cables to convert them to SATA. I never liked IDE power connectors, and I like adapters even less. I re-seated all the adapters. My next move is to remove all the IDE connectors and solder on SATA connectors. But for the last 48 hours everything has been working. Thanks to all for the help.
