Problematic growth of root partition with snapper

Hello,

I have been using snapper on Tumbleweed as a safety net (it’s great) in case
of instabilities after updates (which happened once). However, the root partition is
installed on a small SSD (32GB), and I quickly get to a point where it is completely full,
although it should not be.

I am only keeping a few snapshots (4 max) and cleaning up logs regularly, so that’s
not the problem. In fact, the whole distribution (outside /home, which is on a different
hard disk) is about 12GB in size:
$ sudo du -s /* |sort -n
1636 /run
2208 /bin
11372 /lib64
12392 /sbin
14300 /root
22092 /etc
90404 /opt
123876 /boot
196260 /tmp
1144312 /var
1269384 /lib
10752060 /usr

but df gives:
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdb1 29157376 24766948 3936428 87% /
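
To put a number on the mismatch, here is a small sketch that adds up the du figures and compares them with the "Used" column from df. The values are copied from the two outputs above (both in KiB); the variable names and the to_gib helper are only illustrative.

```shell
# Sizes in KiB, copied from the du and df output quoted above.
du_sizes="1636 2208 11372 12392 14300 22092 90404 123876 196260 1144312 1269384 10752060"
df_used=24766948

# Sum the per-directory du figures.
du_total=0
for s in $du_sizes; do
    du_total=$((du_total + s))
done

# Space that df reports as used but du does not see.
gap=$((df_used - du_total))

# Convert KiB to GiB with one decimal place.
to_gib() { awk -v k="$1" 'BEGIN { printf "%.1f", k / 1048576 }'; }

echo "du total: $du_total KiB ($(to_gib "$du_total") GiB)"
echo "df used:  $df_used KiB ($(to_gib "$df_used") GiB)"
echo "gap:      $gap KiB ($(to_gib "$gap") GiB)"
```

The gap is the space that is in use on the partition without being visible in the directories that du walked.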

Do you have any clue why the root partition uses twice as much space as it should?
It seems to have built up gradually over several months of using snapper.

Thanks in any case!

S.

Instead of df or du, on btrfs one should use


btrfs fi usage /


since du does not understand extents shared with snapshots, and df only estimates free space on btrfs. The extra usage is indeed due to snapper.
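
One likely contributor to the low du figure in the first post, offered as a side note rather than something stated in this thread: a glob such as /* does not match names that start with a dot, so a hidden directory like /.snapshots (where snapper keeps its snapshots on a default openSUSE layout, as far as I know) would be skipped by "du -s /*". The sketch below only demonstrates the glob behaviour in a throwaway directory; the directory names are illustrative.

```shell
# Build a throwaway directory that mimics the layout: one ordinary
# directory and one hidden ".snapshots" directory (names are illustrative).
tmp=$(mktemp -d)
mkdir "$tmp/usr" "$tmp/.snapshots"

# What a "du -s /*"-style glob expands to: hidden entries are left out.
visible=$(cd "$tmp" && echo *)

# What is really there, hidden entries included.
all=$(ls -A "$tmp" | tr '\n' ' ')

echo "glob sees: $visible"
echo "actual:    $all"

# Clean up the throwaway directory.
rm -rf "$tmp"
```

If that applies to your system, the snapshot data never shows up in the per-directory totals, while df still counts it as used.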

With

sudo snapper list

you can see the snapshots, and with

sudo snapper remove FROM-TO

(where FROM and TO are snapshot numbers) you can remove one or more snapshots.

Thanks for the tip (I did not know about it), but df gives a similar estimate to btrfs fi usage /.
Of course, I am cleaning up older snapshots regularly, and the extra disk usage due to a few
extra snapshots should be around 10% more, not 100% more.

I just found an interesting discussion of this problem here:
https://www.reddit.com/r/openSUSE/comments/7cqvno/suse_should_really_have_a_warning_for_when_there/
quoting:
“There is a snapshot in snapper that is not removable, because it corresponds to the default of subvolume “root”. As far as I can tell, this snapshot is never removed as long as you don’t rollback (then the default is set to the rollbacked snapshot). So what happens on the long term? All snapshots are progressively removed but that one which serves, in a way, as a “basal reference”. Over time, even with small updates, many parts of the system are changing and the discrepancy between this snapshot and the new state of the system is getting large. In the extreme, all components are changed and you end up needing a partition twice as large as without snapshots”

That corresponds to the problem I am experiencing. The proposed solution is to do a single rollback to the
current filesystem, which I did. Now snapper list shows only the current snapshot, but the disk usage has
not changed at all :frowning: This is really annoying.

It seems this problem is widespread, but most people don’t care as they mount / on a large hard disk.
Still, the snapper documentation says explicitly that snapper should run fine on 16GB disks for most
distributions (my SSD is 30GB and the distribution 12GB). I am surprised there is no simple workaround
for this problem.

I’m not using Tumbleweed here, so my remarks might be off-topic, but I think that your problem was highlighted by two factors.

  1. Tumbleweed was recently rebuilt almost in its entirety; please see https://lists.opensuse.org/opensuse-factory/2019-02/msg00064.html
    As a consequence any user using snapshots is likely to see a / root partition twice the normal size or more even if only a few snapshots are still on their disks.

  2. 32 GB for / root is less than the recommended minimum of 40 GB for those using snapshots; while you may be able to manage that, likely you are among the first few users noticing space problems when a large update shows up.
    Personally I would use a quieter distro (Leap?), possibly on an ext4 partition, with such a small disk (a tablet or convertible?), but it is your system and your choice as long as you can manage it.

That said, if rollback to a snapshot younger than 20190201 doesn’t fix your problem, maybe there is something else to understand and fix.

Thanks Bruno for your reply. This is in fact a laptop; the 32GB SSD is only for
the operating system, while /home lies on a different 750GB hard disk. That
configuration should be fine (and has been until recently). I bought this computer
at a time when SSDs were expensive, and it was a good compromise back then.

I was not aware of the complete rebuild of Tumbleweed, and still don’t understand
why it would generate 2 copies of the OS. In fact, this growth has built
up progressively since I switched to Tumbleweed one year ago, so rolling back
will not help. Of course, I could do a fresh install, but my move from Leap to
Tumbleweed was precisely to avoid doing this too frequently…

In case that helps others, here’s the solution that I found.

First, I enabled btrfs quota:
https://btrfs.wiki.kernel.org/index.php/Quota_support#Enabling_quota
which is great to assess how much space each snapshot really takes.
The true used space is now displayed by the usual “snapper list” command.
Then, I noticed that there was one particularly big snapshot before the last
“important=yes” snapshot. This snapshot was about 1GB in size, but after
deleting it, I miraculously recovered the missing 12GB :slight_smile:
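
For anyone who wants to repeat these steps, here is a rough sketch of the sequence, not a tested recipe. It assumes the root filesystem is btrfs mounted at /. The run helper, the DRY_RUN switch and the snapshot number 42 are hypothetical additions for illustration: by default the script only prints the commands so they can be reviewed first, and the real snapshot number has to be taken from the "snapper list" output.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set,
# so the sequence can be reviewed before touching the filesystem.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical snapshot number; take the real one from "snapper list".
SNAPSHOT="${SNAPSHOT:-42}"

find_and_drop_big_snapshot() {
    run sudo btrfs quota enable /        # turn on qgroup accounting
    run sudo btrfs qgroup show /         # per-subvolume referenced/exclusive sizes
    run sudo snapper list                # per the post above, now shows used space
    run sudo snapper delete "$SNAPSHOT"  # remove the oversized snapshot
    run sudo btrfs filesystem usage /    # check whether the space came back
}

find_and_drop_big_snapshot
```

Running it with DRY_RUN=0 and SNAPSHOT set to the real number executes the commands for real. Quota accounting may need some time to settle on a filesystem with existing snapshots, so the sizes can look incomplete right after enabling it.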

Basically when all packages are rebuilt you have the entire OS frozen in the “Before” snapshot (or the few siblings leading to that snapshot) and a whole new copy of all packages as the “Current” version of the OS.
Likely, with your last operation you deleted all the remains of the snapshots leading to the recent rebuild (but a TW expert might have a better explanation).
Anyway, nice to read you are swimming in a larger pool again :wink:

Yes, it was indeed a “pre” snapshot that was causing trouble. Thanks
for the clarification, and for the sympathy :slight_smile: