One possible cause (and solution) for inexplicable excessive disk usage with snapper (BTRFS)

Hi all,

I’m not sure if this is the right place. If not, mods, please move it somewhere more suitable, Chit-Chat if you like. IIRC there are others here with inexplicably high disk usage on BTRFS, so I thought I should share this experience, as I just had a moment with snapper:

On my laptop I have reserved (only) 40G for my / using BTRFS with snapper. The machine is running well, but disk space has felt very tight for quite a while. So I thoroughly keep an eye on all the temp, cache and log folders etc. and make sure I don’t keep too many snapshots.
Especially before my recent upgrade to 15.3 on New Year’s Day I did that and removed all snapshots with YaST except my “safe points”. The upgrade went smoothly, but right after the subsequent zypper up I ran out of space and only just got it to finish. Again I ran through my routine of clearing space, and still I was at 92% usage.
Then I went into /.snapshots to have a look. There was a folder with a number (1174), which should be a snapshot number, but snapper could neither list nor delete it. In that folder I found the file “info.xml” empty. As I have just learned, it is supposed to carry the snapshot’s metadata so snapper can identify it. So I copied the info.xml from another snapshot and edited it to fit the actual info.
Still, trying to list or remove it with snapper was to no avail. It drove me crazy, and I was just about to open a thread and ask for help. After all, that was 12G worth of quite old unwanted data. Wait, old? It turned out to be very old, more than two years, and in particular older than my currently mounted snapshot (1887). That number, BTW, came from some rollback as “writable copy of 1879”. So, as a last try, I renamed the folder to 1900, i.e. a higher number than 1887 (but still lower than my latest pair), and edited the xml file to fake a suitable info and date.
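
For reference, after my edits the info.xml ended up looking roughly like this (a sketch from memory; the exact fields can differ between snapper versions, and number, date and description are just what I faked in):

**pluto:~ #** cat /.snapshots/1900/info.xml 
<?xml version="1.0"?>
<snapshot>
  <type>single</type>
  <num>1900</num>
  <date>2022-01-02 12:00:00</date>
  <description>faked entry for orphaned snapshot</description>
</snapshot>
**pluto:~ #**
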
That did the trick. rotfl! “snapper list”, “snapper rm 1900” and I was happy. Disk usage went down to 41%. That’s fine by me.

In a nutshell:

  1. There might be a mass of data in /.snapshots not found by snapper due to the described culprit.
  2. It seems snapper can’t find snapshots with numbers lower than the one in use. (Experts here might know this, of course. I didn’t.) :wink: So after a “snapper rollback”, better check afterwards for leftover older snapshots.

A gentle warning and disclaimer to everybody out there even less experienced than I am:
Please handle such things with care and don’t mess around without knowing what you are doing. I actually made a backup of these 12G of junk to my home folder (outside BTRFS, of course) - just in case. I was tempted to just delete the folder, but I don’t know whether that is wise. I wanted to let snapper do the job so that it could sort out all its settings for itself. And of course there may be many other reasons for similar symptoms.
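
For the record, the backup itself was nothing fancy, roughly along these lines (the target path is just an example; my home sits on a separate non-BTRFS partition):

**pluto:~ #** cp -a /.snapshots/1174 /home/kasi/snapshot-1174-backup 
**pluto:~ #**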

So, rather come here and ask for help if you are not sure!

Have a lot of fun!

kasi

Hi
Did you follow the upgrade note on btrfs and ensure /var/cache was created?

Disk usage:

**erlangen:~ #** btrfs filesystem usage -T /            
Overall: 
    Device size:                 464.63GiB 
    Device allocated:            401.04GiB 
**    Device unallocated:           63.59GiB **
    Device missing:                  0.00B 
    Used:                        385.58GiB 
    Free (estimated):             77.11GiB      (min: 77.11GiB) 
    Free (statfs, df):            77.11GiB 
    Data ratio:                       1.00 
    Metadata ratio:                   1.00 
    Global reserve:              512.00MiB      (used: 0.00B) 
    Multiple profiles:                  no 

                  Data      Metadata System               
Id Path           single    single   single   Unallocated 
-- -------------- --------- -------- -------- ----------- 
 1 /dev/nvme0n1p3 397.01GiB  4.00GiB 32.00MiB    63.59GiB 
-- -------------- --------- -------- -------- ----------- 
   Total          397.01GiB  4.00GiB 32.00MiB    63.59GiB 
   Used           383.49GiB  2.09GiB 64.00KiB             
**erlangen:~ #**

Subvolumes:

**erlangen:~ #** btrfs subvolume list -t / 
ID      gen     top level       path 
--      ---     ---------       ---- 
256     247383  5               @ 
257     276139  256             @/var 
258     276135  256             @/usr/local 
259     174930  256             @/tmp 
260     272922  256             @/srv 
261     276152  256             @/root 
262     275912  256             @/opt 
263     261875  256             @/boot/grub2/x86_64-efi 
264     252969  256             @/boot/grub2/i386-pc 
265     275623  256             @/.snapshots 
**2131    276155  256             @/home **
2503    256303  265             @/.snapshots/1740/snapshot 
...
2562    275127  265             @/.snapshots/1793/snapshot 
**erlangen:~ #**

Referenced and exclusive space:

**erlangen:~ #** btrfs qgroup show /      
qgroupid         rfer         excl  
--------         ----         ----  
0/5          16.00KiB     16.00KiB  
0/256        16.00KiB     16.00KiB  
0/257         1.77GiB      1.77GiB  
0/258        24.00KiB     24.00KiB  
0/259        16.00KiB     16.00KiB  
0/260         2.20MiB      2.20MiB  
0/261        53.36MiB     53.36MiB  
0/262       510.38MiB    510.38MiB  
0/263         3.83MiB      3.83MiB  
0/264        16.00KiB     16.00KiB  
0/265         1.37MiB      1.37MiB  
**0/2131      357.89GiB    357.89GiB  **
0/2503       11.91GiB     79.33MiB  
...
0/2562       11.90GiB      7.34MiB  
**erlangen:~ #**

See also: https://forums.opensuse.org/showthread.php/562626-Snapper-Funktionsweise

Hi guys,

Maybe that was another kasi-case of too much information. :rolleyes:
Malcolm, thanks for the hint. I follow these instructions with each online upgrade. I have a separate subvolume @/var, which should be OK as far as I understand. And I made sure there are no excessive logs, and the zypper cache was “just” a few hundred MB. No, that (very close) “disk full issue” was just my stupidity. I shouldn’t have mentioned it. >:)

Karl, thanks. I know that thread, I have posted there. :wink: And I stand by what I said: It does make sense to delete snapshots to free space. Because that’s what I’ve seen: Immediately after “snapper rm 1900” disk usage went down from 72% to 41%. And by “immediately” I mean about 3-5 seconds of being happy it worked plus typing

 # df /
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/sdb2       41943040 17229344  23533296  43% /

But apparently it doesn’t always work that fast:

**pluto:~ #** btrfs subvolume list -t /    
ID      gen     top level       path 
--      ---     ---------       ---- 
257     75547   5               @ 
258     145911  257             @/var 
259     145894  257             @/usr/local 
260     145902  257             @/tmp 
261     145894  257             @/srv 
262     145895  257             @/root 
263     145894  257             @/opt 
264     145894  257             @/boot/grub2/x86_64-efi 
265     145894  257             @/boot/grub2/i386-pc 
266     145902  257             @/.snapshots 
2582    145909  266             @/.snapshots/1887/snapshot 
2671    145836  266             @/.snapshots/1916/snapshot 
2672    145847  266             @/.snapshots/1917/snapshot 
2675    145865  266             @/.snapshots/1920/snapshot 
2676    145868  266             @/.snapshots/1921/snapshot 
**pluto:~ #** df / 
Filesystem     1K-blocks     Used Available Use% Mounted on 
/dev/sdb2       41943040 17229344  23533296  43% / 
**pluto:~ #** snapper rm 1916-1917 
**pluto:~ #** df / 
Filesystem     1K-blocks     Used Available Use% Mounted on 
/dev/sdb2       41943040 17229344  23533296  43% / 
**pluto:~ #** df / 
Filesystem     1K-blocks     Used Available Use% Mounted on 
/dev/sdb2       41943040 17229344  23533296  43% / 
**pluto:~ #** df / 
Filesystem     1K-blocks     Used Available Use% Mounted on 
/dev/sdb2       41943040 17229344  23533296  43% / 
**pluto:~ #** df / 
Filesystem     1K-blocks     Used Available Use% Mounted on 
/dev/sdb2       41943040 16170704  24564912  40% / 
**pluto:~ #**

The last df / was run a couple of minutes after deleting that pair of snapshots. I guess my originally described experience was mainly caused by the nature of that snapshot: I have already done a couple of rollbacks, and I assume it initially was a writable copy and thus a full set of files (which I could browse just like my /).
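
BTW, btrfs apparently removes deleted subvolumes asynchronously in the background, so if one doesn’t want to keep re-running df, it should be possible to wait for the cleanup to finish with something like this (just a sketch, I haven’t tried it myself):

**pluto:~ #** btrfs subvolume sync / 
**pluto:~ #**
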
But all that was not the point of my thread. The point was that a large chunk of data simply sat in /.snapshots unrecognized by snapper due to the empty info.xml. I don’t know how it happened. But if it happened here it might have happened elsewhere.
I considered a bug report, but how and why? By deleting that snapshot I erased all the evidence, and since that folder was from 2019, who knows what happened back then and which version of snapper was involved.

Sure. However post #1 is murky due to missing command output.

  1. Deletion of a snapshot will free its exclusive space: #84 holds 930.85MiB exclusively, which will be freed upon deletion. The command “snapper list” will miss some subvolumes. For listing all subvolumes always use “btrfs qgroup show”.
**erlangen:~ #**  btrfs qgroup show /  
qgroupid         rfer         excl  
--------         ----         ----  
0/5          16.00KiB     16.00KiB  
0/256        16.00KiB     16.00KiB  
0/257         2.20GiB      2.20GiB  
0/258        80.00KiB     80.00KiB  
0/259         1.99MiB      1.99MiB  
0/260        32.29MiB     32.29MiB  
0/261       568.50MiB    568.50MiB  
0/262       310.96GiB    310.96GiB  
0/263         3.83MiB      3.83MiB  
0/264        16.00KiB     16.00KiB  
0/265         2.91MiB      2.91MiB  
0/266        12.17GiB      7.77MiB  
0/398        12.13GiB    **930.85MiB**  
...
0/424        12.17GiB      2.30MiB  
**erlangen:~ #**
**erlangen:~ #** btrfs subvolume list -t / 
ID      gen     top level       path 
--      ---     ---------       ---- 
256     129     5               @ 
257     48170   256             @/var 
258     47720   256             @/usr/local 
259     46612   256             @/srv 
260     47734   256             @/root 
261     47510   256             @/opt 
262     48170   256             @/home 
263     47504   256             @/boot/grub2/x86_64-efi 
264     40972   256             @/boot/grub2/i386-pc 
265     47654   256             @/.snapshots 
266     48167   265             @/.snapshots/1/snapshot 
398     40655   265            ** @/.snapshots/84/snapshot **
...
424     47539   265             @/.snapshots/109/snapshot 
**erlangen:~ #**
  2. Data shared between several snapshots will not be freed upon deletion of one snapshot. Freeing the data requires deletion of all snapshots sharing it. If the default subvolume shares the data, “snapper rollback” is required (see the sketch below for checking the default subvolume).
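
To check which subvolume is currently the default, “btrfs subvolume get-default” can be used; on erlangen this looks roughly as follows (output sketched under the assumption that the default is still the initial snapshot 1 from the listing above):

**erlangen:~ #** btrfs subvolume get-default / 
ID 266 gen 48167 top level 265 path @/.snapshots/1/snapshot 
**erlangen:~ #**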

Correct, as there wasn’t a lot to share. I am neither very fast nor sophisticated in the terminal, so I tried around with several tools besides Konsole. Eventually I found the folder while browsing with Dolphin and renamed it there, edited the xml file with Emacs, listed snapshots with snapper and YaST, did other rather useless things, and later on posted here in retrospect.
The most important command “snapper rm 1900” (luckily) didn’t produce any output, as it finished successfully. But there’s another one I reliably recall and can reproduce anytime. Its output was the same then as it is now:

**pluto:~ #** snapper rm 1174 
Snapshot '1174' not found.

Even though a folder with that very number existed, blocking 12G of space. Point made; the rest is described in post #1.
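
In hindsight, a way to see how much space such an orphaned folder really occupies exclusively would probably have been something like this (just a sketch, I didn’t run it at the time):

**pluto:~ #** btrfs filesystem du -s /.snapshots/1174/snapshot 
**pluto:~ #**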

This thread will be moved from “Unreviewed How To and FAQ” to “Install/Boot/Login” as it is not really delivered as a concise tutorial or guide as much as a technical discussion with some further knowledge/advice sharing added.

When everything works use the graphical interface. When something is broken use the command line to dig deeper.

Eventually I found the folder while browsing with Dolphin and renamed it there, edited the xml file with Emacs, listed snapshots with snapper and YaST, did other rather useless things, and later on posted here in retrospect.

This creates more complexity, but doesn’t solve the underlying problem.

The most important command “snapper rm 1900” (luckily) didn’t produce any output, as it finished successfully.

Are you really sure?

But there’s another one I reliably recall and I can reproduce anytime. The output was the same as now:

**pluto:~ #** snapper rm 1174 
Snapshot '1174' not found.

Do you want to solve a problem? Don’t narrate. Run all of the following on your machine. Post the complete information, the exact command and the complete output as I have done above. Use copy and paste. No sophistication required:

  1. snapper list

  2. btrfs subvolume list -t /

  3. btrfs qgroup show -p /

Yes!
Exactly the same output as that of “snapper rm 1916-1917” here:
Post #4

Nope!
Please read first post, first paragraph!

I just wanted to share what I have found as I think it may be useful for others. I’d summarize like this:

If one experiences unusual disk usage which seemingly cannot be explained or solved, it may be a good idea to check whether the numbers listed by “snapper list” match the numbers of the folders in /.snapshots. If one does not experience such an issue, one may just as well ignore this.
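
A quick way to do that check is to compare the folder names with the first column of “snapper list”, roughly like this (the folder listing below is just sketched from my current state):

**pluto:~ #** ls /.snapshots 
1887  1920  1921  1922  1923 
**pluto:~ #**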

As I wasn’t asking for help I didn’t start the thread in this sub forum - but yes, that wasn’t really a suitable “how-to”. Thanks, Deano.

Well, here you go:

**pluto:~ #** snapper list 
    # | Type   | Pre # | Date                     | User | Used Space | Cleanup | Description            | Userdata     
------+--------+-------+--------------------------+------+------------+---------+------------------------+------------- 
   0  | single |       |                          | root |            |         | current                |              
1887* | single |       | Sat Jan  1 22:16:19 2022 | root | 234.25 MiB |         | writable copy of #1879 |              
1920  | pre    |       | Thu Jan  6 00:27:46 2022 | root |  94.77 MiB | number  | zypp(zypper)           | important=no 
1921  | post   |  1920 | Thu Jan  6 00:28:07 2022 | root |  11.18 MiB | number  |                        | important=no 
1922  | pre    |       | Fri Jan  7 00:22:15 2022 | root | 336.00 KiB | number  | zypp(zypper)           | important=no 
1923  | post   |  1922 | Fri Jan  7 00:22:19 2022 | root | 512.00 KiB | number  |                        | important=no 
**pluto:~ #** btrfs subvolume list -t / 
ID      gen     top level       path 
--      ---     ---------       ---- 
257     75547   5               @ 
258     146198  257             @/var 
259     146169  257             @/usr/local 
260     146189  257             @/tmp 
261     145894  257             @/srv 
262     146198  257             @/root 
263     146167  257             @/opt 
264     145894  257             @/boot/grub2/x86_64-efi 
265     145894  257             @/boot/grub2/i386-pc 
266     146174  257             @/.snapshots 
2582    146174  266             @/.snapshots/1887/snapshot 
2675    145865  266             @/.snapshots/1920/snapshot 
2676    145868  266             @/.snapshots/1921/snapshot 
2677    146157  266             @/.snapshots/1922/snapshot 
2678    146160  266             @/.snapshots/1923/snapshot 
**pluto:~ #** btrfs qgroup show -p / 
qgroupid         rfer         excl parent   
--------         ----         ---- ------   
0/5          16.00KiB     16.00KiB ---      
0/257        16.00KiB     16.00KiB ---      
0/258         1.08GiB      1.08GiB ---      
0/259        17.28MiB     17.28MiB ---      
0/260       480.00KiB    480.00KiB ---      
0/261       720.00KiB    720.00KiB ---      
0/262       108.70MiB    108.70MiB ---      
0/263       358.36MiB    358.36MiB ---      
0/264        16.00KiB     16.00KiB ---      
0/265         2.48MiB      2.48MiB ---      
0/266        64.00KiB     64.00KiB ---      
0/2582       13.27GiB    234.25MiB ---      
0/2675       13.21GiB     94.77MiB 1/0      
0/2676       13.21GiB     11.18MiB 1/0      
0/2677       13.21GiB    336.00KiB 1/0      
0/2678       13.21GiB    512.00KiB 1/0      
1/0          13.76GiB    743.69MiB ---      
**pluto:~ #** df / 
Filesystem     1K-blocks     Used Available Use% Mounted on 
/dev/sdb2       41943040 16424144  24330576  41% / 
**pluto:~ #**

We learn that snapper has provided me with a nice new pre/post set of snapshots which physically uses just 848 KiB (336 KiB + 512 KiB for #1922/#1923).

Jumping back and forth isn’t easy reading.

Well, here you go:

**pluto:~ #** snapper list 
    # | Type   | Pre # | Date                     | User | Used Space | Cleanup | Description            | Userdata     
------+--------+-------+--------------------------+------+------------+---------+------------------------+------------- 
   0  | single |       |                          | root |            |         | current                |              
1887* | single |       | Sat Jan  1 22:16:19 2022 | root | 234.25 MiB |         | writable copy of #1879 |              
1920  | pre    |       | Thu Jan  6 00:27:46 2022 | root |  94.77 MiB | number  | zypp(zypper)           | important=no 
1921  | post   |  1920 | Thu Jan  6 00:28:07 2022 | root |  11.18 MiB | number  |                        | important=no 
1922  | pre    |       | Fri Jan  7 00:22:15 2022 | root | 336.00 KiB | number  | zypp(zypper)           | important=no 
1923  | post   |  1922 | Fri Jan  7 00:22:19 2022 | root | 512.00 KiB | number  |                        | important=no 
**pluto:~ #** btrfs subvolume list -t / 
ID      gen     top level       path 
--      ---     ---------       ---- 
257     75547   5               @ 
258     146198  257             @/var 
259     146169  257             @/usr/local 
260     146189  257             @/tmp 
261     145894  257             @/srv 
262     146198  257             @/root 
263     146167  257             @/opt 
264     145894  257             @/boot/grub2/x86_64-efi 
265     145894  257             @/boot/grub2/i386-pc 
266     146174  257             @/.snapshots 
**2582    146174  266             @/.snapshots/1887/snapshot 
2675    145865  266             @/.snapshots/1920/snapshot 
2676    145868  266             @/.snapshots/1921/snapshot 
2677    146157  266             @/.snapshots/1922/snapshot 
2678    146160  266             @/.snapshots/1923/snapshot **
**pluto:~ #** btrfs qgroup show -p / 
qgroupid         rfer         excl parent   
--------         ----         ---- ------   
0/5          16.00KiB     16.00KiB ---      
0/257        16.00KiB     16.00KiB ---      
0/258         1.08GiB      1.08GiB ---      
0/259        17.28MiB     17.28MiB ---      
0/260       480.00KiB    480.00KiB ---      
0/261       720.00KiB    720.00KiB ---      
0/262       108.70MiB    108.70MiB ---      
0/263       358.36MiB    358.36MiB ---      
0/264        16.00KiB     16.00KiB ---      
0/265         2.48MiB      2.48MiB ---      
0/266        64.00KiB     64.00KiB ---      
**0/2582       13.27GiB    234.25MiB ---      
0/2675       13.21GiB     94.77MiB 1/0      
0/2676       13.21GiB     11.18MiB 1/0      
0/2677       13.21GiB    336.00KiB 1/0      
0/2678       13.21GiB    512.00KiB 1/0      **
1/0          13.76GiB    743.69MiB ---      
**pluto:~ #** df / 
Filesystem     1K-blocks     Used Available Use% Mounted on 
/dev/sdb2       41943040 16424144  24330576  41% / 
**pluto:~ #**

Thanks for running the suggested commands. The above shows evidence that you indeed achieved what you presumably were aiming at: the number of subvolumes found in /.snapshots now matches the number of snapshots listed by snapper. Deleting snapshots 1920-1923 would free some 106 MiB.

Note: Results shown by “df” and “btrfs filesystem usage” may differ. See also “free space” vs. “unallocated space”: btrfs free space - Google search

Hi Karl,

Thanks for confirming! :slight_smile:

I actually wonder what these commands would have shown before I got rid of that “orphaned” folder. After all, that’s why I didn’t want to just delete it but let snapper do its work. But I’m always glad to learn. :wink: