XFS corruption and xfs_repair problems

So, yesterday I went to lunch, and when I got back and tried to unlock my PC, the X server greeted me with a black screen and an “x” cursor. I had to press CTRL+ALT+BACKSPACE twice to kill it and try to log in again (thankfully everything was saved before I left). I was surprised to see the same result… :open_mouth: I decided to reboot the system, which ended up in the rescue console because of a corrupted /home (/dev/sda7) partition.

It is an up-to-date Leap installation with vanilla settings: btrfs for the mission-critical parts of the system and XFS for data partitions like /home. I was a bit puzzled about what was going on (and I still am). I tried to run xfs_repair, as kindly suggested by the rescue console, but it didn’t work. xfs_repair just hangs: no output at all, no progress, nothing… and I couldn’t even CTRL+C out of it. I had to force-reboot the machine every time I tried to run it. Thankfully I was able to mount another XFS data partition that lives on a separate 2TB drive, and then I used **dd** to dump /dev/sda7 to a file safely located on that drive.

dd if=/dev/sda7 of=sda7.img bs=512k

I was surprised to find that xfs_repair happily repaired the 140GB image (with -L because of the dirty log) without any glitch or hesitation, in a total of 10 seconds. I had a few second thoughts, but in the end I wrote the image back to the drive (with dd, of course). Reboot. Yay. Everything worked as if it had never failed. Nothing seems to be missing; everything is fine and dandy…
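For anyone who wants to follow the same route, the three steps can be sketched roughly as below. This is only a sketch under assumptions: the helper names `run` and `repair_via_image` and the `DRY_RUN` switch are made up for illustration, the image path is hypothetical, and the `-f` flag (tell xfs_repair the target is a regular file) is an addition not mentioned in the post. Keep in mind that `-L` throws away the log, so recent metadata changes can be lost, and that the last step overwrites the whole partition.

```shell
#!/bin/sh
# Sketch: repair an XFS partition by working on an image copy of it.
# With DRY_RUN=1 (the default here) the commands are only printed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

repair_via_image() {
    dev=$1    # partition to repair, e.g. /dev/sda7 (must not be mounted)
    img=$2    # image file on ANOTHER drive with enough free space

    # 1. Copy the raw partition into a file.
    run dd if="$dev" of="$img" bs=512k || return 1

    # 2. Repair the image. -f: target is a regular file,
    #    -L: zero the dirty log (can lose recent metadata changes).
    run xfs_repair -f -L "$img" || return 1

    # 3. Write the repaired image back over the partition.
    run dd if="$img" of="$dev" bs=512k || return 1
}
```

Calling `DRY_RUN=1 repair_via_image /dev/sda7 /mnt/spare/sda7.img` prints the three commands so they can be checked first; setting `DRY_RUN=0` actually runs them (as root, from a rescue or live system).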

The reason I am writing this post here today is, first of all, to let others know a safe way to work around this problem, and secondly, because it happened again this morning, this time on my home laptop (also an up-to-date Leap installation). Unfortunately xfs_repair behaves the same way there, and even worse, I have no spare space to dump the partition to a file and fix it. Sooo yeah, now I really need to find out why xfs_repair hangs and how to make it work on the physical drive directly. I tried to lie to it and give it the -f option, so that it treats the drive as a file, and that at least returned some output: an error stating “Device or resource is busy”. Hm… if this isn’t handled well without the -f option, it could explain the hanging… However, I have no idea what could be keeping it “busy” in a way that blocks xfs_repair from doing its trick but doesn’t bother dd at all.
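One way to look for whatever is holding the device is to check the mount table and the processes that have the device node open. The sketch below is only a starting point, not a diagnosis: the function names `device_in_mounts` and `busy_report` are invented for illustration, `fuser` may not be installed in a rescue shell, and a device can be listed in /proc/mounts under a different name (a UUID or mapper path), in which case the first check will miss it.

```shell
#!/bin/sh
# Sketch: look for obvious reasons why a block device is reported busy.

# Succeeds if the device appears as a mount source in a mounts table.
# The second argument defaults to /proc/mounts.
device_in_mounts() {
    dev=$1
    table=${2:-/proc/mounts}
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$table"
}

busy_report() {
    dev=$1

    if device_in_mounts "$dev"; then
        echo "$dev is mounted:"
        awk -v d="$dev" '$1 == d' /proc/mounts
    else
        echo "$dev is not listed in /proc/mounts"
    fi

    # List processes that have the device node open, if fuser exists.
    if command -v fuser >/dev/null 2>&1; then
        fuser -v "$dev" || echo "fuser found no process using $dev"
    fi
}
```

Running `busy_report /dev/sda7` as root prints what it finds. An empty result does not prove the device is free, since the kernel itself (device mapper, swap, RAID) can also hold it.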

More info:
My office machine is pretty much a brand-new monster, and the problematic partition is on a 250GB Samsung 850 EVO SSD. In contrast, the poor laptop at home is an 8-year-old Dell Studio 1535 with a 320GB WD mechanical drive. Same OS on both, updated on a daily basis.

Any ideas? :slight_smile:

To repair the partition, it must be unmounted, so this is best done from a repair disk.

XFS is usually pretty reliable so you need to figure out why it is being corrupted. 2 machines in so short a time is truly bad luck.

I did check whether the partition was mounted, although if it were, there shouldn’t be a problem anyway. umount reported that the partition is indeed not mounted. :expressionless:

I am afraid that this is not a coincidence, but rather a bug that probably arrived with some of the latest updates… But the hanging of **xfs_repair** is a whole other story that I truly would like to resolve.

So far no one else has reported it here that I know of. If you think it is a bug, and it may be, report it on Bugzilla. Use the same user name and password as here.

https://en.opensuse.org/Bugzilla

Yes, I too have the same problem with xfs_repair. I opened a thread yesterday (13/10/16), “failed to recover EFIs”. I don’t understand why I can’t boot into a root session just because there is a problem on the /home partition or device.

By default, all filesystems listed in /etc/fstab must be present and working in order to boot. You can tell the system to ignore a problem by adding nofail to the options. If home is not mounted, though, a new home directory will be started on the root filesystem for any user that logs in; it is hidden (not erased) if you later mount the home partition. root’s home is of course /root and normally on the root partition.
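As a rough illustration of the nofail suggestion, an /etc/fstab entry for /home might look like the line below. The device name /dev/sda7 and the XFS type are taken from the first post; the remaining fields are assumptions, and a real entry will often use UUID= instead of a device path.

```
# <device>   <mount point>  <type>  <options>        <dump>  <pass>
/dev/sda7    /home          xfs     defaults,nofail  0       0
```

With nofail in the options column, a failure to mount /home should no longer stop the boot, which leaves a working root session from which the partition can be examined.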

Thanks, I have solved it without going to the dd option. When I commented out the /home entry in fstab it booted into root and the xfs_repair command worked.

It is best to do serious repairs from a live repair disk/USB rather than try to work on a running system. Do you change your tires while driving down the road??? :stuck_out_tongue:

Yes, I accept that point, but I had always previously repaired XFS problems without booting from the DVD, as I eventually did this time. Back in the days of SuSe 7.1 it had a repair system option on the installation CD; that too has disappeared.
Why does SuSe keep changing the default fs? It started with Reiserfs, then ext 2, 3, 4, then Btrfs, then XFS, and maybe I’ve missed some. It’s a pain for us users trying to keep up. I cannot remember any of the reasons for the changes. I do remember that years ago I used to keep a FAT32 partition for moving files between Windoze and Linux, but that had its 4GB file size limit, if I remember correctly. Anyway, sorry for the rant, but today not even YaST will start!
Oh, PS: I’m still on 13.2, which I think is too messed up for me to update to Leap :(