btrfs Enters Read-Only mode and crashes every few hours

(imgur post of screen showing Errors: https://imgur.com/g3yArlw)

I have a new drive (Western Digital 1TB) that I suspect is experiencing hardware issues. Unless someone here knows why btrfs might be dropping into read-only mode every couple of hours?

The system freezes up, programs die off one by one (media playback stops, Firefox freezes, then plasma-desktop dies), and it remains wholly unresponsive to anything but a hard power-off.

Hopefully someone can help me verify that this is a hardware failure, and not due to my own negligence or a mistake!

Thank you for the time.

Hi orbvs,

To check the drive for hardware issues you can use the SMART tools. Assuming that drive is the only one, it is probably “sda”. So try, as root:

smartctl -a /dev/sda

You may have to check the ID of your drive if it isn’t sda. There is more information to gather:

man smartctl

Just in case:
https://btrfs.wiki.kernel.org/index.php/Main_Page
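
If you need to check which device name your drive has, here is one way to find it (a sketch; `lsblk` is part of util-linux, and `--scan` lists every device smartctl recognizes):

```shell
lsblk -d -o NAME,MODEL,SIZE   # list block devices with model names
smartctl --scan               # list devices smartctl can query, with their types
```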

kasi042,

The drive is fully encrypted with LUKS, and shows up as /dev/nvme0n1 (with p1 and p2 partitions).

The output of smartctl and the BIOS memtest both say the hardware is fine, which is worrying.

The SMART output says the drive has PASSED all tests, but reports one entry in the Error Information Log.
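
In case it helps anyone else: that error-log entry can be inspected directly. A minimal sketch, assuming the controller shows up as /dev/nvme0 and that smartmontools and nvme-cli are installed (run as root):

```shell
smartctl -l error /dev/nvme0   # SMART view of the NVMe error log entries
nvme error-log /dev/nvme0      # the same data via nvme-cli
nvme smart-log /dev/nvme0      # temperature, wear level, media errors
```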

It might be that something is broken in my setup, hopefully nothing too deep. This is the first report of this type of error that I can find online, and I am not familiar enough with btrfs to attempt btrfsck with the --rescue flag (I would rather reinstall than risk breaking the drive with that flag in any case).

I was able to mount the drive from a rescue USB and read/transfer files by following, among others, the IBM btrfs recovery guide.


# Recovering btrfs subvolumes
~$ lvdisplay                              # see the correct name for the subvolumes
~$ btrfs subvolume list /<DISK>           # find the @home subvolume (usually id 262)
~$ mkdir /mnt/<HOME>
~$ mount -t btrfs -o subvolid=262 /dev/<DISK> /mnt/<HOME>
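
For completeness, since the drive is LUKS-encrypted, the container has to be opened before any of the above works from the rescue USB. A rough sketch, assuming the btrfs volume lives in /dev/nvme0n1p2 (“cryptroot” is just an arbitrary mapper name):

```shell
cryptsetup open /dev/nvme0n1p2 cryptroot      # unlock the LUKS container
mkdir -p /mnt/rescue
mount -t btrfs -o subvolid=262,ro /dev/mapper/cryptroot /mnt/rescue
# copy files off, then clean up:
umount /mnt/rescue
cryptsetup close cryptroot
```

Mounting read-only (`ro`) is a precaution while the drive is suspect.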

I will back up everything while the drive is still working, just in case. If the problem persists, I'll try reinstalling and see whether it goes away. There are still 27 days left on the return warranty, so that's good.

LUKS meaning it's encrypted?
Maybe there's a clue there. But sorry, I don't have any! I have never tried that myself. :expressionless:
I think I'd just try without encryption, if only to rule out a potential source of failure.
However, good luck!
:wink:

Hi
Add the following GRUB boot option via YaST -> Bootloader:


nvme_core.default_ps_max_latency_us=0

Should sort out the issue :wink:
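
If you prefer editing by hand instead of YaST, the same option goes on the kernel command line in /etc/default/grub (a sketch; the existing contents of GRUB_CMDLINE_LINUX_DEFAULT will differ per system):

```shell
# /etc/default/grub -- append the option to the existing kernel command line
GRUB_CMDLINE_LINUX_DEFAULT="splash=silent quiet nvme_core.default_ps_max_latency_us=0"
```

Then regenerate the config (openSUSE path):

```shell
grub2-mkconfig -o /boot/grub2/grub.cfg
```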

Is that related to the issue covered in this blog post about NVMe drives?

Thank you for the solution! From looking up that parameter, it seems this was indeed the issue, at least judging by how others report their SSDs behaving.

I use Full Disk Encryption (FDE) on all laptops; it's both required and for my own peace of mind.

It looks like the problem has been solved, thank you!

Hi
When I first got my NVMe drive (same as yours, but only 250GB), I came across a kernel bug report because my system would not even boot (the NVMe is not the primary bootable drive on my system), long before that blog post :wink: I think you will probably find some threads in this forum with users reporting the same fix…