Earlier this week my Tumbleweed, with its root partition on btrfs from oS 12.3 through 13.1, finally fell over after the update “kernel-desktop-3.17.1-52.1.g5c4d099-x86_64” did its worst. Unfortunately, this first major Tw or btrfs failure I’ve had comes close to the end of its current life cycle.
Previously, with kernel-desktop-3.17.1-51.1, it ran trouble-free for several hours, with no relevant error messages in /var/log/messages. After rebooting into the updated kernel (3.17.1-52.1), it seemed to proceed normally right through into KDE4’s desktop, but GUI applications such as YaST, Dolphin, internet browsers, etc. were all unusable: desktop error messages reported system files as non-writeable. In other words, the root file system (including /home) had become read-only. Command-line access through Konsole or a tty was possible but limited to query or display commands, e.g. rpm -q, zypper search or listing repos, whereas zypper remove failed outright and zypper refresh failed repo by repo. The system was effectively rendered useless and unmaintainable!
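For anyone wanting to confirm the same symptom, here is a minimal Python sketch that checks whether a mount point is currently mounted read-only. It assumes the usual /proc/mounts layout (device, mount point, fs type, comma-separated options); the function names are my own, purely for illustration.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list of the last entry for mount_point, or None."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            found = fields[3].split(",")  # a later entry overrides an earlier one
    return found


def is_read_only(mounts_text, mount_point="/"):
    """True if mount_point is listed and carries the 'ro' option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options


def root_is_read_only(path="/proc/mounts"):
    """Check the live system (assumes a Linux /proc file system)."""
    with open(path) as handle:
        return is_read_only(handle.read(), "/")
```

Calling root_is_read_only() on the affected system should return True if root has been flipped to read-only, as happened here.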
Mounting the btrfs partition from a standard oS 13.1 system made investigation easier with its KDE4, but superuser Dolphin, etc. still failed to provide any write access to the partition. Direct chroot access just confirmed the read-only status.
/var/log/messages contained many entries like this one after the initial “BTRFS info” message:
...kernel: 22.259911] BTRFS info (device sda8): disk space caching is enabled
...kernel: 23.318003] parent transid verify failed on 949858304 wanted 186937 found 186939
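To get a feel for how many of these there are, a small Python sketch like the following can pull the numbers out of the log. It only assumes the message wording shown above; I take the first number to be the address of the affected metadata block and the other two to be the expected and actual transaction IDs, and the function name is just illustrative.

```python
import re

# Matches the kernel message format quoted above.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def find_transid_failures(log_text):
    """Return one dict per 'parent transid verify failed' message."""
    failures = []
    for match in TRANSID_RE.finditer(log_text):
        block, wanted, found = (int(group) for group in match.groups())
        failures.append({"block": block, "wanted": wanted, "found": found})
    return failures


def summarize(log_text):
    """Return (total message count, number of distinct block addresses)."""
    failures = find_transid_failures(log_text)
    return len(failures), len({entry["block"] for entry in failures})
```

Feeding it the contents of /var/log/messages gives the total number of failures and how many distinct blocks are involved.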
Running “btrfsck /dev/sda8” (from 13.1) on the unmounted partition reported many errors. However, 13.1 doesn’t have the latest version of btrfsprogs. Since my Tumbleweed partition includes no important user data, I took the last resort and ran btrfsck --repair. It eventually aborted with this:
Extent back ref already exists for 998006784 parent 24822198272 root 0
Well this shouldn't happen, extent record overlaps but is metadata? [998006784, 4096]
Aborted
I have since found a relevant bug report at http://bugzilla.opensuse.org/show_bug.cgi?id=897774, and a thread, somewhat strangely posted in our Applications forum, at https://forums.opensuse.org/showthread.php/501741-btrfs-amp-3-17-kernel-Failsystem-turns-to-readonly.
Apparently this was all a known issue with kernel 3.17 and read-only snapshots. It’s certainly catastrophic for users of btrfs and snapper. Rebooting with a previous kernel, e.g. 3.16.3, doesn’t solve it.
I still have the corrupted Tumbleweed installed, in case anyone can suggest a repair. Otherwise I will have to reinstall it, probably on top of the 13.2 release.
“The Tumbleweed is dead. Long live the Tumbleweed (to be regenerated on 4 November)”!