I noticed the following strange behavior with Btrfs and Snapper (openSUSE Tumbleweed with kernel 4.5.2-1-default).
If I select a previous read-only snapshot to boot, and subsequently initiate a rollback to this state (say snapshot id=465),
btrfs sub get-default / correctly points to snapshot id 465 after the next reboot.
Subsequent snapper snapshots will be numbered 465+i as expected, but btrfs sub get-default / will still point to snapshot 465.
Upon closer inspection, I found that the subvolume actually mounted is the correct one; however, snapper is unable to delete snapshot 465. As a result, snapper’s cleanup algorithm fails, and snapshots start to accumulate until the disk is full.
This problem only happens after a rollback operation. It is reproducible every time.
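A rough way to check "mounted subvolume vs. default subvolume" from a script, offered only as a sketch: it assumes the paths contain the usual .snapshots/<N>/snapshot pattern seen in this thread, and the helper names snapshot_number and same_snapshot are made up for illustration.

```shell
# Print the Snapper snapshot number found in a string containing
# ".snapshots/<N>/snapshot"; print nothing if there is no match.
snapshot_number() {
    printf '%s\n' "$1" | sed -n 's|.*\.snapshots/\([0-9][0-9]*\)/snapshot.*|\1|p'
}

# Succeed only if both strings name the same snapshot number.
same_snapshot() {
    a=$(snapshot_number "$1")
    b=$(snapshot_number "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Possible usage as root (assumes findmnt's FSROOT column shows the
# mounted subvolume path on your system):
#   default=$(btrfs subvolume get-default /)
#   mounted=$(findmnt -n -o FSROOT /)
#   same_snapshot "$default" "$mounted" || echo "default and mounted differ"
```

This only compares text, so it changes nothing on disk.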
So I have two questions here:
Is there a way to adjust the subvolume numbering manually?
When you roll back to an image, a new snapshot instance is created moving forward.
No snapshots are ever deleted or removed, unless manually or by script.
This means that… Let’s say you rolled back to a snapshot but found it was too early. You can still roll back to any other existing snapshot, forwards or backwards from that point, and every time you do so a new snapshot is created. You may have rolled back, but you’re actually at the <equivalent> but with a current date/time and a new ID.
I get your point (I think), but that does not explain (unless I fail to see the obvious) why the snapshot id pointed to by btrfs sub get-default is out of sync with the snapshot ids listed by snapper list.
But for the sake of the argument, let’s assume that this is the intended behavior. It completely breaks the cleanup algorithm of snapper, since snapper is unable to delete snapshot id=465 (the one reported by btrfs sub get-default), while it happily keeps creating new snapshots. So within a few months you have hundreds of new snapshots, since snapper fails to clean up anything newer than snapshot id=465.
I don’t remember the snapper auto-delete algorithm off the top of my head, but it’s supposed to be based on time, and is supposed to preserve something like once a day and once every 10 days or something like that.
In any case,
You can manually remove any snapshots you don’t want. Are you using the YaST GUI or the console command?
I find that running snapper in a console is far more powerful and useful than the GUI, because snapper commands are so simple to run and the displayed info (like the list of snapshots) is very informative.
Removing snapshots using snapper is very easy and ensures that you can’t make a mistake. When you remove a snapshot, all your remaining snapshots are re-compiled(? - Is that the right word?) to preserve each snapshot’s integrity.
Ah, that’s precisely my point. I discovered this problem because the auto cleanup script failed. Next I tried to delete manually (snapper delete XXX), but this fails too, with the message that snapper cannot delete snapshot XXX (and it shouldn’t, since it is the actual working copy).
so snapper list will give you
Type   | #   | Pre # | Date                            | User | Cleanup | Description           | Userdata
-------+-----+-------+---------------------------------+------+---------+-----------------------+--------------
single | 0   |       |                                 | root |         | current               |
single | 1   |       | Tue 01 Mar 2016 01:15:31 PM EST | root | number  | first root filesystem |
single | 2   |       | Tue 01 Mar 2016 01:21:40 PM EST | root | number  | after installation    | important=yes
pre    | 465 |       | Wed 18 May 2016 10:49:18 AM EDT | root | number  | zypp(zypper)          | important=no
post   | 466 | 465   | Wed 18 May 2016 10:50:45 AM EDT | root | number  |                       | important=no
pre    | 467 |       | Wed 18 May 2016 11:00:09 AM EDT | root | number  | zypp(zypper)          | important=no
whereas btrfs sub get-default / yields
ID 868 gen 144978 top level 257 path .snapshots/475/snapshot
So snapper delete will result in “Deleting snapshot failed.” So basically, btrfs sub get-default points to 475 as the youngest snapshot, whereas snapper thinks that 467 is the youngest snapshot (based on id and date).
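For anyone who wants to check this mismatch without reading the two outputs by eye, here is a minimal sketch. It assumes snapper list prints pipe-separated columns with the snapshot number in the second column, and that the default subvolume path has the .snapshots/<N>/snapshot form shown above. The function names are hypothetical.

```shell
# Read "snapper list" output on stdin and print the highest snapshot number.
# Anything that is not a digit is stripped from the second column first,
# so the header and separator lines are ignored.
highest_snapshot_id() {
    awk -F'|' '
        NF >= 2 {
            id = $2
            gsub(/[^0-9]/, "", id)
            if (id != "" && id + 0 > max) max = id + 0
        }
        END { print max + 0 }
    '
}

# Read "btrfs subvolume get-default /" output on stdin and print the
# snapshot number embedded in the path, if any.
default_snapshot_id() {
    sed -n 's|.*\.snapshots/\([0-9][0-9]*\)/snapshot.*|\1|p'
}

# Possible usage as root:
#   snapper list | highest_snapshot_id
#   btrfs subvolume get-default / | default_snapshot_id
```

Fed the two outputs quoted in this post, the first function prints 467 and the second prints 475.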
When you ask about system behavior, you should provide information that is as precise as possible, ideally by copying and pasting the exact command(s) you type and the output you get from them, or the exact log lines that demonstrate the issue. Your XXX cannot be associated with any subvolume you listed and is useless for troubleshooting.
Go to https://bugzilla.opensuse.org/enter_bug.cgi?product=openSUSE%20Tumbleweed using the same account as on these forums and select Basesystem as the component. Give as precise a description as possible; attach the actual log files that include the error; show the actual command with real numbers and its output; state the exact number of the snapshot in question. Reporting a bug with all those XXX and 465+i is just a waste of time and bandwidth.
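One way to capture exact commands together with their output for such a report is a small wrapper like the one below. This is only a sketch: log_cmd and the file name snapper-report.txt are made-up names, and which commands to record is up to you.

```shell
# Run a command, printing the exact command line before its output and the
# exit status after it, so the transcript can be attached to a bug report.
log_cmd() {
    printf '$ %s\n' "$*"
    status=0
    "$@" 2>&1 || status=$?
    printf '[exit status: %s]\n\n' "$status"
    return "$status"
}

# Possible usage as root (file name is just an example):
#   {
#       log_cmd uname -r
#       log_cmd snapper list
#       log_cmd btrfs subvolume get-default /
#       log_cmd btrfs subvolume list /
#   } > snapper-report.txt
```

The failing snapper delete call, with its real snapshot number, can be recorded the same way.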