Has Btrfs "wasted" some space?

I’m using Btrfs as my root file system on Tumbleweed, and I’m trying to work out whether it has lost/“wasted” some space. The partition is 25GB and I’m using ~12GB, but df and Btrfs say that 24GB is used, despite there being only a few snapshots totalling less than 1GB of exclusive space!

I’ve even restored from a snapshot recently so that I shouldn’t have too many “local” changes on the disk, in case that was the problem, but it doesn’t seem to have helped.

I previously had problems because something was wrong with my Snapper/Btrfs config, but I seem to have fixed that by restoring a snapshot and manually tidying up subvolume 5.

$ df -h /
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/main-root   25G   24G  978M  97% /

$ sudo btrfs fi usage /
Overall:
    Device size:          25.00GiB
    Device allocated:          24.97GiB
    Device unallocated:          33.00MiB
    Device missing:             0.00B
    Used:              23.64GiB
    Free (estimated):         978.80MiB    (min: 978.80MiB)
    Data ratio:                  1.00
    Metadata ratio:              1.00
    Global reserve:          71.73MiB    (used: 0.00B)

Data,single: Size:23.69GiB, Used:22.76GiB
   /dev/mapper/main-root      23.69GiB

Metadata,single: Size:1.25GiB, Used:893.80MiB
   /dev/mapper/main-root       1.25GiB

System,single: Size:32.00MiB, Used:16.00KiB
   /dev/mapper/main-root      32.00MiB

Unallocated:
   /dev/mapper/main-root      33.00MiB

$ sudo btrfs qgroup show --sort=excl /
qgroupid         rfer         excl 
--------         ----         ---- 
0/5          16.00KiB     16.00KiB 
0/257        16.00KiB     16.00KiB 
0/259        16.00KiB     16.00KiB 
0/260        16.00KiB     16.00KiB 
0/1696       10.95GiB     16.00KiB 
0/1697       10.95GiB     16.00KiB 
0/1708       10.98GiB    736.00KiB 
0/1709       10.98GiB      1.05MiB 
0/1684       11.04GiB      1.54MiB 
0/261         3.43MiB      3.43MiB 
0/272         4.61MiB      4.61MiB 
0/1685       10.97GiB      5.28MiB 
0/1711       10.98GiB      6.52MiB 
0/1694       10.95GiB     14.38MiB 
0/1687       10.95GiB     14.89MiB 
0/1710       10.98GiB     28.62MiB 
0/1678       10.73GiB     46.80MiB 
0/1698       11.03GiB     57.37MiB 
0/1675       10.98GiB     68.23MiB 
0/258       184.82MiB    184.82MiB 
0/1683       11.59GiB    232.03MiB 
0/1641       10.65GiB    332.07MiB 
1/0          23.28GiB     12.39GiB

$ sudo snapper list
Type   | #    | Pre # | Date                         | User | Cleanup  | Description        | Userdata     
-------+------+-------+------------------------------+------+----------+--------------------+--------------
single | 0    |       |                              | root |          | current            |              
single | 1258 |       | Mon 18 Jun 2018 19:00:00 BST | root | timeline | timeline           |              
single | 1291 |       | Sun 24 Jun 2018 14:26:40 BST | root |          |                    |              
pre    | 1294 |       | Sun 24 Jun 2018 15:59:55 BST | root | number   | zypp(zypper)       | important=yes
post   | 1299 | 1294  | Sun 24 Jun 2018 19:06:29 BST | root | number   |                    | important=yes
pre    | 1300 |       | Sun 24 Jun 2018 19:27:14 BST | root | number   | zypp(ruby.ruby2.5) | important=no 
post   | 1301 | 1300  | Sun 24 Jun 2018 19:27:20 BST | root | number   |                    | important=no 
single | 1303 |       | Mon 25 Jun 2018 19:00:00 BST | root | timeline | timeline           |              
single | 1310 |       | Fri 29 Jun 2018 20:00:30 BST | root | timeline | timeline           |              
pre    | 1312 |       | Sat 30 Jun 2018 08:56:24 BST | root | number   | zypp(zypper)       | important=yes
single | 1313 |       | Sat 30 Jun 2018 09:00:00 BST | root | timeline | timeline           |              
post   | 1314 | 1312  | Sat 30 Jun 2018 09:09:08 BST | root | number   |                    | important=yes
single | 1324 |       | Sat 30 Jun 2018 19:00:33 BST | root | timeline | timeline           |              
single | 1325 |       | Sat 30 Jun 2018 20:00:33 BST | root | timeline | timeline           |              
single | 1326 |       | Sun 01 Jul 2018 19:00:18 BST | root | timeline | timeline           |              
single | 1327 |       | Mon 02 Jul 2018 20:00:49 BST | root | timeline | timeline           |              

$ mount | grep subvol
/dev/mapper/main-root on / type btrfs (rw,relatime,ssd,space_cache,subvolid=1675,subvol=/@/.snapshots/1291/snapshot)
/dev/mapper/main-root on /.snapshots type btrfs (rw,relatime,ssd,space_cache,subvolid=272,subvol=/@/.snapshots)
/dev/mapper/main-root on /srv type btrfs (rw,relatime,ssd,space_cache,subvolid=259,subvol=/@/srv)
/dev/mapper/main-root on /boot/grub2/i386-pc type btrfs (rw,relatime,ssd,space_cache,subvolid=260,subvol=/@/boot/grub2/i386-pc)
/dev/mapper/main-root on /opt type btrfs (rw,relatime,ssd,space_cache,subvolid=258,subvol=/@/opt)
/dev/mapper/main-root on /boot/grub2/x86_64-efi type btrfs (rw,relatime,ssd,space_cache,subvolid=261,subvol=/@/boot/grub2/x86_64-efi)

$ sudo du -hxs / 
12G    /

So, how can I have a 12GB root, barely anything in the other subvolumes, a total of about 1GB of excl usage from snapshots, and yet have 23GB of 24GB of the data allocation used? And 12GB exclusive usage in qgroup 1/0? Am I misunderstanding something, or has Btrfs managed to “lose” some disk space that it thinks is used when it isn’t?

Thanks.

25GB is not enough to support Snapper. 40GB is the smallest recommended size, and for TW bigger may be better. TW has lots and lots of updates, so you have lots of space used in snapshots. Remove some snapshots and then stop Snapper.

I know the recommended size for enabling snapshots, but I only have a 256GB SSD and I’ve not got the space available to dedicate 40GB to a ~12GB root partition. Instead, I just set an aggressive pruning policy and occasionally manually delete snapshots with “snapper delete”. It’s not ideal for a “normal” desktop user, but it works for me.
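For anyone else in the same position: “snapper delete” accepts single snapshot numbers and ranges, so the manual tidy-up is just something like this (the snapshot numbers here are only illustrative):

$ sudo snapper delete 1258
$ sudo snapper delete 1294-1301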

I can’t see how snapshots are my issue right now, though. If you look at the qgroup output then most of the snapshots are currently using a few tens of MBs of “excl” space (which is what the high churn rate of Tumbleweed updates would increase). The biggest qgroup snapshot uses about 330MB of exclusive space.

The recent restore to the latest snapshot set my default subvolume to 1675, which is snapshot 1291.

$ sudo btrfs subvolume list -a /
ID 257 gen 39505 top level 5 path <FS_TREE>/@
ID 258 gen 39429 top level 257 path <FS_TREE>/@/opt
ID 259 gen 38820 top level 257 path <FS_TREE>/@/srv
ID 260 gen 34852 top level 257 path <FS_TREE>/@/boot/grub2/i386-pc
ID 261 gen 39222 top level 257 path <FS_TREE>/@/boot/grub2/x86_64-efi
ID 272 gen 39492 top level 257 path <FS_TREE>/@/.snapshots
ID 1641 gen 38909 top level 272 path <FS_TREE>/@/.snapshots/1258/snapshot
ID 1675 gen 39509 top level 272 path <FS_TREE>/@/.snapshots/1291/snapshot
ID 1678 gen 38909 top level 272 path <FS_TREE>/@/.snapshots/1294/snapshot
ID 1683 gen 38910 top level 272 path <FS_TREE>/@/.snapshots/1299/snapshot
ID 1684 gen 38909 top level 272 path <FS_TREE>/@/.snapshots/1300/snapshot
ID 1685 gen 38910 top level 272 path <FS_TREE>/@/.snapshots/1301/snapshot
ID 1687 gen 38910 top level 272 path <FS_TREE>/@/.snapshots/1303/snapshot
ID 1694 gen 39153 top level 272 path <FS_TREE>/@/.snapshots/1310/snapshot
ID 1696 gen 39206 top level 272 path <FS_TREE>/@/.snapshots/1312/snapshot
ID 1697 gen 39208 top level 272 path <FS_TREE>/@/.snapshots/1313/snapshot
ID 1698 gen 39221 top level 272 path <FS_TREE>/@/.snapshots/1314/snapshot
ID 1708 gen 39300 top level 272 path <FS_TREE>/@/.snapshots/1324/snapshot
ID 1709 gen 39313 top level 272 path <FS_TREE>/@/.snapshots/1325/snapshot
ID 1710 gen 39419 top level 272 path <FS_TREE>/@/.snapshots/1326/snapshot
ID 1711 gen 39491 top level 272 path <FS_TREE>/@/.snapshots/1327/snapshot

That subvolume is ~11GB and has a mere 70MB of exclusive disk usage (i.e. space not used in any other subvolume). Given that so little has changed between then and now, I don’t see how I’ve got so much disk space used!

If space is the issue, simply don’t use Btrfs, but ext4 or XFS instead. 25GB for Btrfs incl. snapshotting is too much to ask. The defaults aren’t there for no reason.

Hi
AFAIK, if you revert then you may need to manually delete old snapshots (with caution!!).

Else have you manually run the btrfs-balance.service?
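On openSUSE it’s part of the btrfsmaintenance package, so triggering and checking it by hand looks (roughly) like;

systemctl start btrfs-balance.service
systemctl status btrfs-balance.service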

Okay, people seem to be getting hung up on the Snapper requirements. Please can we just put that to one side for now and I’ll rephrase the question.

Say I have a Btrfs filesystem of 25GB. Say it has a number of sub-volumes. Say each of those sub-volumes uses ~11GB that their respective qgroup shows as “rfer” but due to content duplication then only tens to a few hundred MB of that space is “excl”. Say the qgroup for each of those sub-volumes is in a parent qgroup (1/0).

Given that situation, why is Btrfs showing 23GB used, with 12GB “excl” in qgroup 1/0? Why is its disk usage significantly more than the smallest (or even the largest) “rfer” plus the sum of the “excl” values?

My expectation would be that 1/0 has an “excl” that’s the sum of the child “excl” values, because the qgroup just groups together lots of child groups. By that maths the filesystem should be using roughly the largest “rfer” (~11.6GiB) plus the ~1GiB sum of the child “excl” values - call it ~12.5GiB - yet there’s roughly twice as much data in the qgroup as that implies.

Thanks for the reply.

Yep, I’ve been doing that - see the short list of snapshots in post #1 from a machine that was installed in October. Due to the space constraints then I’ve been deleting old snapshots when I ran low on space as well. Also, Snapper is set to only keep a limited timeline and three Important updates*.

AFAIK then Snapper should still be able to delete old snapshots after you restore (because it just sees another snapshot and doesn’t “know” that you jumped back in time). I didn’t think that I had to be too careful, though, because Snapper should stop you deleting the important ones (e.g. the one that’s your base subvolume)?

I’ve not run btrfs-balance.service manually, but I think it ran once in the past month. My machine was certainly a bit sluggish at one point for a while, and I think I checked what was running and saw the btrfs process. I’ll give it a go and see if it recalculates anything.

  • because it’s a desktop, not a production server, and if I update and boot and it works then it’s all likely to be fine. Either that or it broke and I revert there and then.

[Edit] Nope, the btrfs-balance service doesn’t seem to help. Ends almost immediately. I checked systemctl status btrfs-balance and it says “Done, had to relocate 1 out of 37 chunks” and still shows basically the same disk usage. Thanks for the idea, though.

Hi
I’m not running snapshots on my Tumbleweed setup, but I do have 40G allocated for /. I’m only using 9.72GiB, with 11.29GiB allocated…

You might have to run it manually and be aggressive with the level;
https://www.suse.com/documentation/sles11/stor_admin/data/trbl_btrfs_volfull.html
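Something along these lines (the usage filter values are just examples - start low and repeat with higher values);

btrfs balance start -dusage=5 /
btrfs balance start -dusage=25 /
btrfs balance start -dusage=50 /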

I would also look if there are lots of coredumps and journal logs…


coredumpctl list
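and for the journal (the 200M cap below is just an example);

journalctl --disk-usage
journalctl --vacuum-size=200M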

Please show output of “btrfs qgroup show -p”.

A manual rebalance didn’t help. It just said “0 chunks reallocated” and came back so quickly that I assumed the previous run of the service had already done a pretty good job.

I then tried being even more ruthless with snapshots and deleted all of the timeline ones (because hardly anything changes on the root of this system except upgrades) and all of the “post” ones (because they pretty much match the next “pre” snapshot).

The idea was that it minimised the problem space, and I think it helped me work out where the disk space went!

$ sudo snapper list
Type   | #    | Pre # | Date                         | User | Cleanup | Description        | Userdata     
-------+------+-------+------------------------------+------+---------+--------------------+--------------
single | 0    |       |                              | root |         | current            |              
single | 1291 |       | Sun 24 Jun 2018 14:26:40 BST | root |         |                    |              
pre    | 1294 |       | Sun 24 Jun 2018 15:59:55 BST | root | number  | zypp(zypper)       | important=yes
pre    | 1300 |       | Sun 24 Jun 2018 19:27:14 BST | root | number  | zypp(ruby.ruby2.5) | important=no 
pre    | 1312 |       | Sat 30 Jun 2018 08:56:24 BST | root | number  | zypp(zypper)       | important=yes

$ sudo btrfs fi df /
Data, single: total=23.69GiB, used=22.22GiB
System, single: total=32.00MiB, used=16.00KiB
Metadata, single: total=1.25GiB, used=716.22MiB
GlobalReserve, single: total=68.81MiB, used=0.00B

$ sudo btrfs qgroup show -p /
qgroupid         rfer         excl parent  
--------         ----         ---- ------  
0/5          16.00KiB     16.00KiB ---     
0/257        16.00KiB     16.00KiB ---     
0/258       184.82MiB    184.82MiB ---     
0/259        16.00KiB     16.00KiB ---     
0/260        16.00KiB     16.00KiB ---     
0/261         3.43MiB      3.43MiB ---     
0/272        16.00KiB     16.00KiB ---     
0/1675       10.98GiB      2.62GiB ---     
0/1678       10.73GiB      8.77GiB 1/0     
0/1684       11.04GiB    314.80MiB 1/0     
0/1696       10.95GiB    234.43MiB 1/0     
1/0          20.04GiB     11.68GiB ---   

As I pasted that into the post then I noticed that subvolume 1678 now has 8.7GB “excl”! It wasn’t anywhere near that before. And then I realised why.

If I’ve got timeline snapshots every hour, then roughly one hour after an upgrade I get a timeline snapshot that has 99%+ of its data in common with the “post” snapshot. A few things might change, but nothing much. That results in very little “excl” data being reported for either snapshot, because it isn’t exclusive - it’s now common to both snapshots. At the next upgrade lots of files change and I’ve still got that duplication from the last upgrade, so neither of the pair ever shows much “excl” usage despite the large amount of data involved.

After deleting all the timeline snapshots and snapshot 1294 (subvolume 1678), I’ve now got 11GB available!

What that means for my system is that I need to prune timeline snapshots more frequently (because I don’t edit my system config frequently, so one hour back or one day back won’t make much difference). By doing that then the overlap between a timeline snapshot and an update (“number”) snapshot won’t hide the “excl” usage, which will make it clearer where the space is used in snapshots.
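For reference, the knobs for that live in /etc/snapper/configs/root. These are standard Snapper variables, but the exact limits below are just what I mean by “aggressive”, not a recommendation:

TIMELINE_CREATE="yes"
TIMELINE_LIMIT_HOURLY="2"
TIMELINE_LIMIT_DAILY="1"
TIMELINE_LIMIT_WEEKLY="0"
TIMELINE_LIMIT_MONTHLY="0"
TIMELINE_LIMIT_YEARLY="0"
NUMBER_LIMIT="4"
NUMBER_LIMIT_IMPORTANT="3"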

TL;DR: Btrfs hadn’t “wasted” or lost space, it was just that timeline snapshots were hiding how much I was using by making the “excl” numbers initially unintuitive.

Thanks to arvidjaar and malcolmlewis for talking me through steps that identified it and helped me understand it.

“Exclusive” means “how much space is not shared with subvolumes outside of this qgroup”, while “Refer” means the total space referred to by this qgroup. So the Exclusive number for the 1/0 qgroup shows how much space your snapshots consume in total, which actually answers “where has my space gone”, while “exclusive” for each individual subvolume can be interpreted as “how much space I would gain after deleting this snapshot”. Due to the shared nature of the data, deleting one snapshot may suddenly make a lot of space in adjoining snapshots exclusive.

I knew those definitions and thought that I understood it, but it was this that I wasn’t thinking about properly:

Because Btrfs snapshots are copy-on-write, I was thinking that “excl” was effectively a difference from the previous snapshot, and so I thought I should be able to add all of the 0/x “excl” values to a base value and find out how much disk space I was using.

As you say, though, snapshots can overlap (and often will - a “post” will often be very similar to the following “pre” from the next update, and the same for timeline snapshots) and so you can’t just add the numbers up that way.

I had thought that it would be odd if Btrfs had “lost” some space somewhere, but it wasn’t until people talked through the diagnosis that it made sense.

Thanks for explaining things.

This is true only as long as there is a single snapshot (and there is no other data sharing). And even then you need a clear definition of what “difference” means exactly.

As soon as space is captured in two or more snapshots, it will always be accounted as shared. Here is a trivial example:

leap15:/home/bor # mkfs -t btrfs -f /dev/sdb1
btrfs-progs v4.15
See http://btrfs.wiki.kernel.org for more information.


Performing full device TRIM /dev/sdb1 (500.00GiB) ...
Label:              (null)
UUID:               b1935d4b-8c09-40da-bb06-c56cca1025c5
Node size:          16384
Sector size:        4096
Filesystem size:    500.00GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         DUP               1.00GiB
  System:           DUP               8.00MiB
SSD detected:       no
Incompat features:  extref, skinny-metadata
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1   500.00GiB  /dev/sdb1


leap15:/home/bor # mount /dev/sdb1 /mnt
leap15:/home/bor # dd if=/dev/urandom of=/mnt/bigfile bs=1K count=201400
201400+0 records in
201400+0 records out
206233600 bytes (206 MB, 197 MiB) copied, 1.8202 s, 113 MB/s
leap15:/home/bor # fsync /mnt
leap15:/home/bor # btrfs quota enable /mnt
leap15:/home/bor # btrfs quota rescan -w /mnt
quota rescan started
leap15:/home/bor # btrfs qgroup show -p /mnt
qgroupid         rfer         excl parent  
--------         ----         ---- ------  
0/5         196.71MiB    196.71MiB ---     
leap15:/home/bor # btrfs su sn -r /mnt /mnt/snap1
leap15:/home/bor # btrfs su sn -r /mnt /mnt/snap2
leap15:/home/bor # btrfs qgroup show -p /mnt
qgroupid         rfer         excl parent  
--------         ----         ---- ------  
0/5         196.71MiB     16.00KiB ---     
0/258       196.71MiB     16.00KiB ---     
0/259       196.71MiB     16.00KiB ---     

So at this point we have two snapshots with zero exclusive space. So far this is correct - we have not changed anything on the active filesystem since the snapshots were created. Let’s make some large-scale changes on the active filesystem.

leap15:/home/bor # dd if=/dev/urandom of=/mnt/bigfile bs=1K count=201400
201400+0 records in
201400+0 records out
206233600 bytes (206 MB, 197 MiB) copied, 1.80381 s, 114 MB/s
leap15:/home/bor # fsync /mnt
leap15:/home/bor # btrfs qgroup show --sync -p /mnt
qgroupid         rfer         excl parent  
--------         ----         ---- ------  
0/5         196.71MiB    196.71MiB ---     
0/258       196.71MiB     16.00KiB ---     
0/259       196.71MiB     16.00KiB ---     

Oops. Both snapshots still have zero exclusive space. There is no way to use these metrics to determine how much space is consumed in total:

leap15:/home/bor # df -h /mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       500G  412M  498G   1% /mnt

Let’s create some more snapshots and do some more changes.

leap15:/home/bor # btrfs su sn -r /mnt /mnt/snap3
Create a readonly snapshot of '/mnt' in '/mnt/snap3'
leap15:/home/bor # btrfs su sn -r /mnt /mnt/snap4
Create a readonly snapshot of '/mnt' in '/mnt/snap4'
leap15:/home/bor # rm /mnt/bigfile
leap15:/home/bor # btrfs qgroup show --sync -p /mnt
qgroupid         rfer         excl parent  
--------         ----         ---- ------  
0/5          16.00KiB     16.00KiB ---     
0/258       196.71MiB     16.00KiB ---     
0/259       196.71MiB     16.00KiB ---     
0/260       196.71MiB     16.00KiB ---     
0/261       196.71MiB     16.00KiB ---     
leap15:/home/bor # df -h /mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       500G  412M  498G   1% /mnt

Still the same situation. All snapshots look entirely identical and there is zero data in the active filesystem, yet total space consumption did not change.

What is possible is computing a running total across consecutive snapshots and using the delta as an indication of the change rate. Like this:

leap15:/home/bor # btrfs qgroup create 1/258 /mnt
leap15:/home/bor # btrfs qgroup create 1/259 /mnt
leap15:/home/bor # btrfs qgroup create 1/260 /mnt
leap15:/home/bor # btrfs qgroup create 1/261 /mnt
leap15:/home/bor # btrfs qgroup assign 0/258 1/258 /mnt
leap15:/home/bor # btrfs qgroup assign 0/258 1/259 /mnt
leap15:/home/bor # btrfs qgroup assign 0/259 1/259 /mnt
leap15:/home/bor # btrfs qgroup assign 0/258 1/260 /mnt
leap15:/home/bor # btrfs qgroup assign 0/259 1/260 /mnt
leap15:/home/bor # btrfs qgroup assign 0/260 1/260 /mnt
leap15:/home/bor # btrfs qgroup assign 0/258 1/261 /mnt
leap15:/home/bor # btrfs qgroup assign 0/259 1/261 /mnt
leap15:/home/bor # btrfs qgroup assign 0/260 1/261 /mnt
leap15:/home/bor # btrfs qgroup assign --rescan 0/261 1/261 /mnt
leap15:/home/bor # btrfs quota rescan -s /mnt 
no rescan operation in progress
leap15:/home/bor # btrfs qgroup show --sync -p /mnt
qgroupid         rfer         excl parent                  
--------         ----         ---- ------                  
0/5          16.00KiB     16.00KiB ---                     
0/258       196.71MiB     16.00KiB ---                     
0/259       196.71MiB     16.00KiB ---                     
0/260       196.71MiB     16.00KiB ---                     
0/261       196.71MiB     16.00KiB ---                     
1/258       196.71MiB     16.00KiB 0/258                   
1/259       196.73MiB    196.73MiB 0/258,0/259             
1/260       393.45MiB    196.75MiB 0/258,0/259,0/260       
1/261       393.46MiB    393.46MiB 0/258,0/259,0/260,0/261 
leap15:/home/bor # 

Now we can actually see something. To estimate how much data was changed or deleted since a snapshot, look at the delta between the excl of this snapshot and the previous one. To estimate how much data was added since a snapshot, look at the delta between the rfer of the next snapshot and this one. So we can see that between 0/258 and 0/259 a lot of data was changed, because the amount of new data is approximately the same as the amount of deleted data. We can also see that after 0/261 much data was deleted or changed; to actually find out which, we can create one more snapshot and compute the delta. Skipping the commands:

leap15:/home/bor # btrfs qgroup show --sync -p /mnt
qgroupid         rfer         excl parent                        
--------         ----         ---- ------                        
0/5          16.00KiB     16.00KiB ---                           
0/258       196.71MiB     16.00KiB ---                           
0/259       196.71MiB     16.00KiB ---                           
0/260       196.71MiB     16.00KiB ---                           
0/261       196.71MiB     16.00KiB ---                           
0/262        16.00KiB     16.00KiB ---                           
1/258       196.71MiB     16.00KiB 0/258                         
1/259       196.73MiB    196.73MiB 0/258,0/259                   
1/260       393.45MiB    196.75MiB 0/258,0/259,0/260             
1/261       393.46MiB    393.46MiB 0/258,0/259,0/260,0/261       
1/262       393.48MiB    393.48MiB 0/258,0/259,0/260,0/261,0/262 
leap15:/home/bor # 

So there is no new data between 0/261 and 0/262, which implies that much was deleted but nothing was overwritten or added.

Of course this is not precise. Removing one file and adding another cannot be distinguished from overwriting the same file using these metrics. But that does not matter much, as for the purpose of snapshot space consumption both cases are the same - changing a file can be considered as removing the old one (implicitly preserving it in the snapshot) and adding a new one with the same name.

Unfortunately, as can be seen, management is really cumbersome. It would be good if more people played with it to get a better understanding of the requirements. Then we could enhance snapper to manage and display this information. But more real-life confirmation is needed.
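For anyone who wants to play with it, the qgroup bookkeeping above can be scripted. A rough sketch only (it assumes quota is already enabled and that the snapshot subvolume IDs are supplied oldest first):

MNT=/mnt
seen=""
for id in 258 259 260 261; do          # snapshot subvolume IDs, oldest first
    btrfs qgroup create "1/$id" "$MNT"
    seen="$seen $id"
    # each cumulative qgroup 1/N contains every snapshot up to and including 0/N
    for s in $seen; do
        btrfs qgroup assign "0/$s" "1/$id" "$MNT"
    done
done
btrfs quota rescan -w "$MNT"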

Thanks for all of those worked examples. It looks like some of it is possible to calculate, but the counts available from Btrfs don’t make it easy.

Given that Snapper doesn’t list space usage by snapshots, and you’ve got to do your own matching of results across three commands (snapper, btrfs qgroup and btrfs subvolume), it would initially be nice if it could at least print the values and give clear explanations. I’m not sure how easy it is to word, though: “Total space” and “Space freed up after deletion (but could be more if you delete another snapshot first)” doesn’t work too well!

I might put a feature request in for Snapper for an alternate cleanup that would indirectly make this clearer: deleting “Post” snapshots (but not Pre) once there is a subsequent snapshot. The way I look at snapshots, you need the Pre in case your update breaks something, and you need the Timeline in case you edit a config and break it, but all that Post does is capture the state after an upgrade (which subsequent timeline snapshots will do anyway) and overlap with the next Pre snapshot (so you don’t obviously see the disk usage).

The only drawback would be if you upgrade, break a config, get a timeline snapshot, delete the Post, and then find you have to roll back to the Pre and do the upgrade again - but that’s a corner case, and it’s why you wouldn’t make it a default-enabled setting.
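In the meantime, a rough script could approximate that cleanup. This sketch assumes the “|”-separated table layout shown in post #1 (Type in the first column, the snapshot number in the second) and is fragile by design, so treat it as an illustration only:

# rough sketch: delete every "post" snapshot, parsing the default snapper table
for n in $(snapper list | awk -F'|' '$1 ~ /post/ { gsub(/ /, "", $2); print $2 }'); do
    snapper delete "$n"
done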

I am just facing the same problem: my ~100G root partition ran full. Deleting all snapshots and obsolete files reclaimed ~13G. However, “du” only shows 20G of usage, so the difference is still 67G. I then stumbled across this thread; however, I still have no idea how to fix the issue.

Current status:


HAL:/ # df /
Filesystem               1K-blocks     Used Available Use% Mounted on 
/dev/mapper/VG_root-root 104701952 90128532  13879340  87% /

HAL:/ # du -xhs /           
20G     /

HAL:/ # snapper list 
    # | Type   | Pre # | Date                     | User | Used Space | Cleanup | Description            | Userdata 
------+--------+-------+--------------------------+------+------------+---------+------------------------+--------- 
   0  | single |       |                          | root |            |         | current                |          
1843* | single |       | Wed Jul 14 20:28:39 2021 | root |  18.63 GiB |         | writable copy of #1833 |  

HAL:/ # btrfs fi df /      
Data, single: total=85.00GiB, used=84.59GiB 
System, single: total=32.00MiB, used=16.00KiB 
Metadata, single: total=2.00GiB, used=1.21GiB 
GlobalReserve, single: total=168.98MiB, used=0.00B

HAL:/ # btrfs qgroup show -p /      
qgroupid         rfer         excl parent   
--------         ----         ---- ------   
0/5          16.00KiB     16.00KiB ---      
0/257        16.00KiB     16.00KiB ---      
0/258        16.00KiB     16.00KiB ---      
0/260         2.38MiB      2.38MiB ---      
0/261         3.80MiB      3.80MiB ---      
0/262        49.88GiB     49.88GiB ---      
0/263        36.73MiB     36.73MiB ---      
0/264       489.46MiB    489.46MiB ---      
0/265        16.00KiB     16.00KiB ---      
0/266        16.00KiB     16.00KiB ---      
0/267        16.00KiB     16.00KiB ---      
0/268        16.00KiB     16.00KiB ---      
0/269        16.00KiB     16.00KiB ---      
0/270        16.00KiB     16.00KiB ---      
0/271       123.62MiB    123.62MiB ---      
0/272        16.00KiB     16.00KiB ---      
0/273        16.00KiB     16.00KiB ---      
0/274         5.26GiB      5.26GiB ---      
0/275        16.00KiB     16.00KiB ---      
0/276       101.92MiB    101.92MiB ---      
0/277       307.10MiB    307.10MiB ---      
0/1238       10.22GiB      9.86GiB ---      
0/1739      571.47MiB    571.47MiB ---      
0/4112       19.00GiB     18.63GiB ---      
1/0             0.00B        0.00B ---     

As can be seen, there are some qgroupids with “excl” of ~50GiB, ~19GiB and ~10GiB (I have to admit that I am not (yet?) an expert in the btrfs quota/qgroup stuff; in that respect I did not change the installation setup).
IBBoard reported that after deleting all “timeline” snapshots plus one final snapshot he got the disk space reclaimed, but this did not work in my case.

Thanks for any help!

Show output of

btrfs subvolume get-default /
btrfs subvolume list /
grep -w btrfs /proc/mounts
cat /etc/fstab

Show usage:

erlangen:~ # btrfs filesystem usage -T /
Overall: 
    Device size:                   1.82TiB 
    Device allocated:            343.04GiB 
    Device unallocated:            1.48TiB 
    Device missing:                  0.00B 
    Used:                        332.40GiB 
    Free (estimated):              1.49TiB      (min: 1.49TiB) 
    Free (statfs, df):             1.49TiB 
    Data ratio:                       1.00 
    Metadata ratio:                   1.00 
    Global reserve:              494.58MiB      (used: 0.00B) 
    Multiple profiles:                  no 

                  Data      Metadata System               
Id Path           single    DUP      DUP      Unallocated 
-- -------------- --------- -------- -------- ----------- 
 1 /dev/nvme0n1p2 340.01GiB  3.00GiB 32.00MiB     1.48TiB 
-- -------------- --------- -------- -------- ----------- 
   Total          340.01GiB  3.00GiB 32.00MiB     1.48TiB 
   Used           330.62GiB  1.78GiB 64.00KiB             
erlangen:~ #

There is used space, free space, allocated space and unallocated space but there is no wasted space. See also https://forums.opensuse.org/showthread.php/562626-Snapper-Funktionsweise


HAL:/ #  btrfs subvolume get-default / 
ID 4112 gen 375320 top level 258 path @/.snapshots/1843/snapshot 

HAL:/ # btrfs subvolume list / 
ID 257 gen 374779 top level 5 path @ 
ID 258 gen 375222 top level 257 path @/.snapshots 
ID 260 gen 375158 top level 257 path @/boot/grub2/i386-pc 
ID 261 gen 375158 top level 257 path @/boot/grub2/x86_64-efi 
ID 262 gen 375298 top level 257 path @/opt 
ID 263 gen 375316 top level 257 path @/tmp 
ID 264 gen 375300 top level 257 path @/usr/local 
ID 265 gen 374779 top level 257 path @/var/cache.old 
ID 266 gen 375158 top level 257 path @/var/crash 
ID 267 gen 375158 top level 257 path @/var/lib/libvirt/images 
ID 268 gen 375158 top level 257 path @/var/lib/machines 
ID 269 gen 375158 top level 257 path @/var/lib/mailman 
ID 270 gen 375158 top level 257 path @/var/lib/mariadb 
ID 271 gen 375311 top level 257 path @/var/lib/mysql 
ID 272 gen 375158 top level 257 path @/var/lib/named 
ID 273 gen 375158 top level 257 path @/var/lib/pgsql 
ID 274 gen 375320 top level 257 path @/var/log 
ID 275 gen 375158 top level 257 path @/var/opt 
ID 276 gen 375320 top level 257 path @/var/spool 
ID 277 gen 375320 top level 257 path @/var/tmp 
ID 1238 gen 375158 top level 258 path @/.snapshots/787/snapshot 
ID 1739 gen 375311 top level 257 path @/var/cache 
ID 4112 gen 375320 top level 258 path @/.snapshots/1843/snapshot 

HAL:/ # grep -w btrfs /proc/mounts 
/dev/mapper/VG_root-root / btrfs rw,relatime,ssd,space_cache,subvolid=4112,subvol=/@/.snapshots/1843/snapshot 0 0 
/dev/mapper/VG_root-root /.snapshots btrfs rw,relatime,ssd,space_cache,subvolid=258,subvol=/@/.snapshots 0 0 
/dev/mapper/VG_root-root /boot/grub2/i386-pc btrfs rw,relatime,ssd,space_cache,subvolid=260,subvol=/@/boot/grub2/i386-pc 0 0 
/dev/mapper/VG_root-root /boot/grub2/x86_64-efi btrfs rw,relatime,ssd,space_cache,subvolid=261,subvol=/@/boot/grub2/x86_64-efi 0 0 
/dev/mapper/VG_root-home /home btrfs rw,relatime,ssd,space_cache,subvolid=257,subvol=/@ 0 0 
/dev/mapper/VG_root-root /opt btrfs rw,relatime,ssd,space_cache,subvolid=262,subvol=/@/opt 0 0 
/dev/mapper/VG_root-root /var/lib/libvirt/images btrfs rw,relatime,ssd,space_cache,subvolid=267,subvol=/@/var/lib/libvirt/images 0 0 
/dev/mapper/VG_root-root /var/lib/machines btrfs rw,relatime,ssd,space_cache,subvolid=268,subvol=/@/var/lib/machines 0 0 
/dev/mapper/VG_root-root /var/crash btrfs rw,relatime,ssd,space_cache,subvolid=266,subvol=/@/var/crash 0 0 
/dev/mapper/VG_root-root /var/lib/mailman btrfs rw,relatime,ssd,space_cache,subvolid=269,subvol=/@/var/lib/mailman 0 0 
/dev/mapper/VG_root-root /var/opt btrfs rw,relatime,ssd,space_cache,subvolid=275,subvol=/@/var/opt 0 0 
/dev/mapper/VG_root-root /tmp btrfs rw,relatime,ssd,space_cache,subvolid=263,subvol=/@/tmp 0 0 
/dev/mapper/VG_root-root /var/lib/pgsql btrfs rw,relatime,ssd,space_cache,subvolid=273,subvol=/@/var/lib/pgsql 0 0 
/dev/mapper/VG_root-root /var/lib/named btrfs rw,relatime,ssd,space_cache,subvolid=272,subvol=/@/var/lib/named 0 0 
/dev/mapper/VG_root-root /var/lib/mariadb btrfs rw,relatime,ssd,space_cache,subvolid=270,subvol=/@/var/lib/mariadb 0 0 
/dev/mapper/VG_root-root /var/cache btrfs rw,relatime,ssd,space_cache,subvolid=1739,subvol=/@/var/cache 0 0 
/dev/mapper/VG_data-srv /srv btrfs rw,relatime,ssd,space_cache,subvolid=257,subvol=/@ 0 0 
/dev/mapper/VG_root-root /var/lib/mysql btrfs rw,relatime,ssd,space_cache,subvolid=271,subvol=/@/var/lib/mysql 0 0 
/dev/mapper/VG_root-root /var/spool btrfs rw,relatime,ssd,space_cache,subvolid=276,subvol=/@/var/spool 0 0 
/dev/mapper/VG_root-root /usr/local btrfs rw,relatime,ssd,space_cache,subvolid=264,subvol=/@/usr/local 0 0 
/dev/mapper/VG_root-root /var/log btrfs rw,relatime,ssd,space_cache,subvolid=274,subvol=/@/var/log 0 0 
/dev/mapper/VG_root-root /var/tmp btrfs rw,relatime,ssd,space_cache,subvolid=277,subvol=/@/var/tmp 0 0 

HAL:/ # cat /etc/fstab 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /                        btrfs  defaults                        0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /boot/grub2/i386-pc      btrfs  subvol=@/boot/grub2/i386-pc     0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /boot/grub2/x86_64-efi   btrfs  subvol=@/boot/grub2/x86_64-efi  0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /opt                     btrfs  subvol=@/opt                    0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /tmp                     btrfs  subvol=@/tmp                    0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /usr/local               btrfs  subvol=@/usr/local              0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /var/cache               btrfs  subvol=@/var/cache              0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /var/crash               btrfs  subvol=@/var/crash              0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /var/lib/libvirt/images  btrfs  subvol=@/var/lib/libvirt/images  0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /var/lib/machines        btrfs  subvol=@/var/lib/machines       0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /var/lib/mailman         btrfs  subvol=@/var/lib/mailman        0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /var/lib/mariadb         btrfs  subvol=@/var/lib/mariadb        0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /var/lib/mysql           btrfs  subvol=@/var/lib/mysql          0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /var/lib/named           btrfs  subvol=@/var/lib/named          0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /var/lib/pgsql           btrfs  subvol=@/var/lib/pgsql          0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /var/log                 btrfs  subvol=@/var/log                0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /var/opt                 btrfs  subvol=@/var/opt                0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /var/spool               btrfs  subvol=@/var/spool              0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /var/tmp                 btrfs  subvol=@/var/tmp                0  0 
UUID=05d90657-5a28-405e-a70b-97ffedae614d  /.snapshots              btrfs  subvol=@/.snapshots             0  0 
UUID=1e8c71fd-f207-4914-a6c9-138e43614602  /home                    btrfs  defaults                        0  0 
UUID=5f13a37d-eef5-434c-9bb1-0679eba71024  /srv                     btrfs  defaults                        0  0 
UUID=5079-40EC                             /boot/efi                vfat   defaults                        0  0 
UUID=ae7bdd69-f80f-4387-8397-7b0e6409b538  swap                     swap   defaults                        0  0 


HAL:/ #  btrfs filesystem usage -T /   
Overall: 
    Device size:                  99.85GiB 
    Device allocated:             87.03GiB 
    Device unallocated:           12.82GiB 
    Device missing:                  0.00B 
    Used:                         85.76GiB 
    Free (estimated):             13.27GiB      (min: 13.27GiB) 
    Data ratio:                       1.00 
    Metadata ratio:                   1.00 
    Global reserve:              169.11MiB      (used: 0.00B) 

                            Data     Metadata System               
Id Path                     single   single   single   Unallocated 
-- ------------------------ -------- -------- -------- ----------- 
 1 /dev/mapper/VG_root-root 85.00GiB  2.00GiB 32.00MiB    12.82GiB 
-- ------------------------ -------- -------- -------- ----------- 
   Total                    85.00GiB  2.00GiB 32.00MiB    12.82GiB 
   Used                     84.55GiB  1.21GiB 16.00KiB            

I just checked the link to the thread, and it confirmed my understanding that snapshots contain references to identical files, so that these files do not allocate disk space multiple times. Of course, this means that disk space is only freed after all references to a specific file have been deleted. This was the reason why I deleted all snapshots except two (which cannot be deleted) in the attempt to reclaim disk space.

The two snapshots which cannot be deleted are

  • #0: “current”, i.e. the running system
  • #1843: the currently mounted snapshot

The last one is, if I remember correctly, the result of a “rollback” action required due to a failed update.

When I list my snapshots, I get:


HAL:/.snapshots # snapper ls 
    # | Type   | Pre # | Date                          | User | Used Space | Cleanup | Description            | Userdata 
------+--------+-------+-------------------------------+------+------------+---------+------------------------+--------- 
   0  | single |       |                               | root |            |         | current                |               
1843* | single |       | Wed 14 Jul 2021 20:28:39 CEST | root |  18.63 GiB |         | writable copy of #1833 |              

which shows only the two snapshots above. However, if I list the “.snapshots” directory, I get:


HAL:/.snapshots # ll 
total 4 
drwxr-xr-x 1 root root  32 Jul 14 20:28 1843 
drwxr-xr-x 1 root root  16 Feb 11  2018 787 
-rw-r----- 1 root root 184 Dec 20 01:13 grub-snapshot.cfg

which shows another one (“787”). I must admit that I currently have no idea whether this is correct or not.

You have 50G under /opt. Check the content of this directory.
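For example (“du” needs -x so it doesn’t cross into nested subvolumes; “btrfs filesystem du” needs a reasonably recent btrfs-progs):

du -xhs /opt
btrfs filesystem du -s /opt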

Also, subvolume 1238 is apparently an orphaned snapper snapshot. If you delete it, it will free ~10G.

btrfs subvolume delete /.snapshots/787/snapshot
rm -r /.snapshots/787
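Afterwards a quick re-check should show the space coming back:

df -h /
btrfs qgroup show -p /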