Can't boot after software update

Yesterday, after a normal software update, my system crashed, and I have not been able to boot the machine since. Something seems to have trashed my root partition. Some folders appear to be empty (I guess that explains why I can't boot anymore):


du -h --max-depth=1 /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/
23M     /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/etc
0       /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/.snapshots
53M     /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/boot
0       /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/home
0       /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/opt
0       /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/srv
0       /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/tmp
6,8G    /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/usr
509M    /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/var
0       /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/dev
0       /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/proc
0       /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/sys
0       /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/run
5,1M    /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/bin
520M    /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/lib
16M     /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/lib64
11M     /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/root
11M     /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/sbin
0       /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/mnt
0       /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/selinux
7,9G    /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/

However, df reports that most of the harddisk is full - it is a 60 GB SSD-drive:


df -h /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       54G   47G     0  100% /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0

I tried running

btrfs check

on the drive using a rescue CD, but that did not seem to detect any problems. I also tried to boot from an older snapshot without any luck.

I have installed Leap 42.1 on a new, external HD, which works fine. However I would really like to restore my old /home directory. How can I do that?

Not really.
Those empty folders are supposed to be empty (on a default installation at least).

Impossible to say if something critical is missing, but the numbers don’t look obviously wrong.

What exactly happens when you try to boot?

However, df reports that most of the harddisk is full - it is a 60 GB SSD-drive:

df -h /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0/
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 54G 47G 0 100% /run/media/claus/ec5e3307-c876-481d-ae0f-2fcbdf10a8e0

Yes, and that’s likely the problem.
Though judging from the du numbers, it shouldn’t really be full.

I tried running

btrfs check

on the drive using a rescue CD, but that did not seem to detect any problems. I also tried to boot from an older snapshot without any luck.

btrfs check will not free up space.

Try “btrfs balance” instead, see also Marc's Blog: btrfs - Fixing Btrfs Filesystem Full Problems
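On a filesystem that is already full, a balance with a high usage filter can itself run out of space, so a commonly suggested approach is to start with a low -dusage value and raise it step by step. A minimal sketch of that idea; the function name and the list of thresholds are illustrative and not taken from this thread:

```shell
# Run a data balance in increasing steps. -dusage=N only rewrites data
# chunks that are at most N percent full, so the low values are cheap
# and need little free space to get started.
# balance_stepwise and the threshold list are illustrative choices.
balance_stepwise() {
    mnt="$1"
    for pct in 5 10 20 40 60; do
        echo "balancing data chunks up to ${pct}% full on ${mnt}"
        btrfs balance start -dusage="$pct" "$mnt" || return 1
    done
}

# Usage (as root): balance_stepwise /
```

If one step fails, the function stops there so you can look at the error before going further.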

I have installed Leap 42.1 on a new, external HD, which works fine. However I would really like to restore my old /home directory. How can I do that?

Mount it; /home is by default on a separate partition.
The /etc/fstab from your old system partition should tell you exactly which one.
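Checking the old fstab could look roughly like this. It is only a sketch: the function name is made up, and /mnt/oldroot is a hypothetical place where the old root partition is mounted.

```shell
# Print the fstab entries whose mount point (2nd field) is /home,
# skipping comment lines. old_home_entry is an illustrative name.
old_home_entry() {
    awk '$1 !~ /^#/ && $2 == "/home"' "$1"
}

# Usage, with the old root mounted under a hypothetical /mnt/oldroot:
#   old_home_entry /mnt/oldroot/etc/fstab
```

If this prints nothing, the old system had no separate /home entry and the data lived on the root filesystem itself.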

Not the /home folder, though. I had my home folder in the same partition as the system dirs. I kept all my regular files on an external HD, so they are safe, but all the config files and other app-specific files have been lost. This unfortunately includes all e-mails.

I know it is a good idea to keep /home on a separate partition. I did this originally, but ran into a similar problem a couple of months back. Back then I could copy all data from my home directory, reinstall Leap, and then copy the home directory back. I chose to keep everything in the same partition that time, because it appeared that the problem was caused by insufficient space on the root partition and I had a lot of free space on the /home partition.

I get to the boot selector, choose a boot option (either the normal boot into Leap or a snapshot), after that I am immediately dropped into a terminal screen, getting just a few lines of output, the last of which is


systemd[1]: Failed to start RealtimeKit Policy Service

The process stops here: no normal login, no desktop, but I can log in to a terminal session.

I was puzzled by the discrepancy: du says I have used less than 8 GB, while df says I have less than 8 GB free, leaving more than 40 GB unaccounted for.

I ran btrfs check to see if the filesystem had been trashed, but if I understood the output correctly it was ok.

Thanks for the response. I shall try out your suggestions later today and report back.

Whatever you do, DO NOT TRY TO DELETE SNAPSHOTS #0 and #1 if you intend to recover any data from a btrfs filesystem. – Personal bitter experience.

Tools like du and df ignore the space occupied by /.snapshots. I do not know how to get a good approximation of the space usage of a btrfs partition with snapshots.
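For what it is worth, btrfs-progs ships its own reporting commands that account for space at the chunk level, which includes data held only by snapshots. A small sketch; the wrapper name is made up here, and `btrfs filesystem usage` needs a reasonably recent btrfs-progs, so its availability on a given rescue system is an assumption:

```shell
# Show how btrfs itself accounts for space on a mounted filesystem.
# report_btrfs_space is an illustrative wrapper name.
report_btrfs_space() {
    mnt="$1"
    # Allocated vs. used space per chunk type (Data, Metadata, System)
    btrfs filesystem df "$mnt"
    # More detailed overview, including unallocated space on the device
    btrfs filesystem usage "$mnt"
}

# Usage (as root): report_btrfs_space /
```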

That’s bad.
As mentioned, /home is normally on a separate partition; in that case it would be normal for /home on the root partition to be empty (it is just the mount point).

If /home actually is (was) on your / partition, it looks like your data is gone unfortunately.

I suppose /home is empty too if you try to get a directory listing?
Do you maybe get any errors in “dmesg” if you try to access /home?

I know it is a good idea to keep /home on a separate partition. I did this originally, but ran into a similar problem a couple of months back. Back then I could copy all data from my home directory, reinstall Leap, and then copy the home directory back. I chose to keep everything in the same partition that time, because it appeared that the problem was caused by insufficient space on the root partition and I had a lot of free space on the /home partition.

Well, 40GiB for / is about the lower (recommended) limit when using snapshots. If you also put /home onto it, it’s likely too small again, and 60GiB may be too small as well in that case, especially if you snapshot /home too.

I get to the boot selector, choose a boot option (either the normal boot into Leap or a snapshot), after that I am immediately dropped into a terminal screen, getting just a few lines of output, the last of which is

systemd[1]: Failed to start RealtimeKit Policy Service

The process stops here: no normal login, no desktop, but I can log in to a terminal session.

RealtimeKit is used by PulseAudio, so that shouldn’t be critical (you probably lose sound, but it should not prevent the system from booting to a GUI).

To fix this one, you’d need to look at why it fails to start:

systemctl status rtkit-daemon

I was puzzled by the discrepancy: du says I have used less than 8 GB, while df says I have less than 8 GB free, leaving more than 40 GB unaccounted for.

Hm, the discrepancy might be the missing files (from /home in particular).

If your snapshots do include /home, the easiest fix would likely be to revert to a previous snapshot…
You can inspect the snapshots (and their files) with YaST->System->Snapper (“yast snapper”), so maybe have a look if you find your user files in there.

And just to avoid a misunderstanding: “btrfs balance” will only try to make sure that free space is really available for use and shown as free, it will not try to recover your user files.

Please show the output of “btrfs subvolume list -a /mount/point/for/btrfs/partition”

Sorry for the long absence.

My understanding of the problem was mistaken in several respects. First, while my system no longer boots into a graphical environment, I can log into a terminal, and when I do that I find that my home directory is intact. I have tried to back up the home directory to a USB stick but have run into a problem: some of the contents of the mail folders can’t be copied, apparently because of the file names.

I am using KMail and as I understand it, mails are stored in the directories under

/home/claus/.local/share/akonadi_maildir_resource_0

Some of the files contained in these directories have names ending in ‘:2,S’ and similar, and when I try to copy those, cp reports an error and nothing is copied. Enclosing the file names in single or double quotes does not help.

I should really like to secure a copy of my mail before I try to solve the problem.

Try to compress (.zip, .tar, …) the directories containing those files and move the compressed files to your backup medium.

If there is enough room on your disk you probably can simply compress your complete /home.
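A minimal sketch of that approach, assuming GNU tar. The function name and the target path in the usage line are placeholders. The likely reason cp fails is that the USB stick uses a FAT filesystem, which does not allow ':' in file names (an assumption, since the stick's filesystem is not stated here); inside a tar archive the names are stored as data, so they survive the trip.

```shell
# Pack a directory into a gzip-compressed tar archive.
# $1 = directory to back up, $2 = archive file to create.
# backup_dir is an illustrative name.
backup_dir() {
    src="$1"
    archive="$2"
    # -C switches to the parent directory so the archive holds relative paths
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
}

# Usage (placeholder target path):
#   backup_dir /home/claus/.local/share/akonadi_maildir_resource_0 /path/to/usbstick/mail.tar.gz
```

Listing the archive afterwards with tar -tzf is a cheap way to confirm that the mail files made it in.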

Regards

susejunky

Thanks a lot. I now have a full backup of my home directory.

Having secured a copy of my data, I deleted some directories that didn’t seem too important and on the next login everything was back to normal!

Now I am ready to have a go at the btrfs problem.

Following the directions in the linked page, I did:


btrfs fi show /dev/sda2
Label: none  uuid: ec5e3307-c876-481d-ae0f-2fcbdf10a8e0
        Total devices 1 FS bytes used 30.61GiB
        devid    1 size 53.89GiB used 44.06GiB path /dev/sda2

then


btrfs balance start -dusage=60 / &

and finally


btrfs fi show /dev/sda2
Label: none  uuid: ec5e3307-c876-481d-ae0f-2fcbdf10a8e0
        Total devices 1 FS bytes used 30.64GiB
        devid    1 size 53.89GiB used 38.06GiB path /dev/sda2

Running “btrfs balance” clearly helped. Thanks for your help.
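To put numbers on that: as I understand it, the "used" value on the devid line is the space btrfs has allocated to chunks, while "FS bytes used" is what the data actually occupies. The difference is allocated but unfilled space, which is what balance hands back. A small sketch that computes it from the output above; the function name is made up, and it assumes a single device with all sizes printed in GiB:

```shell
# Read `btrfs fi show` output on stdin and print the space that is
# allocated to chunks but not filled with data, in GiB.
# Assumes one device and GiB units, as in the output in this thread.
btrfs_slack() {
    LC_ALL=C awk '
        /FS bytes used/ { v = $NF; sub(/GiB/, "", v); fs_used = v }
        /devid/ {
            for (i = 1; i < NF; i++)
                if ($i == "used") { v = $(i + 1); sub(/GiB/, "", v); alloc = v }
        }
        END { printf "%.2f\n", alloc - fs_used }
    '
}

# Usage (as root): btrfs fi show /dev/sda2 | btrfs_slack
```

With the two outputs above this gives 13.45 before the balance and 7.42 after it, so about 6 GiB of allocated space was returned.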