BTRFS structure needs cleaning message during full balance?

Hi, after being away from the machine for some time, I noticed that balance had quit. The following message came up when I attempted a full balance.

:~> sudo btrfs balance start /
WARNING:

        Full balance without filters requested. This operation is very
        intense and takes potentially very long. It is recommended to
        use the balance filters to narrow down the scope of balance.
        Use 'btrfs balance start --full-balance' option to skip this
        warning. The operation will start in 10 seconds.
        Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting balance without any filters.
ERROR: error during balancing '/': Structure needs cleaning
There may be more info in syslog - try dmesg | tail
hightower-i5-6600k:~> dmesg | tail
[38576.407681] [  T29728] BTRFS info (device dm-2): found 37170 extents, stage: update data pointers
[38584.873805] [  T29728] BTRFS info (device dm-2): relocating block group 64891125760 flags data
[38607.693519] [  T29728] BTRFS info (device dm-2): found 33194 extents, stage: move data extents
[38641.574032] [  T29728] BTRFS info (device dm-2): found 33194 extents, stage: update data pointers
[38649.812477] [  T29728] BTRFS info (device dm-2): relocating block group 62710087680 flags data
[38662.710999] [  T29728] BTRFS info (device dm-2): found 43884 extents, stage: move data extents
[38696.292982] [  T29728] BTRFS info (device dm-2): found 43884 extents, stage: update data pointers
[38708.587669] [  T29728] BTRFS info (device dm-2): relocating block group 60294168576 flags metadata|dup
[38714.889735] [  T29728] BTRFS error (device dm-2): cannot relocate partially dropped subvolume 490, drop progress key (853588 108 0)
[38723.736887] [  T29728] BTRFS info (device dm-2): balance: ended with status: -117
hightower-i5-6600k:~>

I am a bit nervous about power-cycling the machine right now. Are there steps to correct this situation?
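
For anyone following along, the two lines that matter in the dmesg output above are the "cannot relocate partially dropped subvolume" error and the final "balance: ended with status" line. Below is a minimal Python sketch for pulling them out of a saved log. The function name `summarize_balance_log` and the `SAMPLE` text are illustrative assumptions, not part of btrfs-progs. On Linux x86, errno 117 is EUCLEAN, whose message is the "Structure needs cleaning" text printed by the balance command.

```python
import re

# Patterns for the two kernel messages quoted in this thread.
DROPPED_RE = re.compile(
    r"cannot relocate partially dropped subvolume (\d+), "
    r"drop progress key \((\d+) (\d+) (\d+)\)"
)
STATUS_RE = re.compile(r"balance: ended with status: (-?\d+)")


def summarize_balance_log(text):
    """Return the stuck subvolume ID, drop progress key, and balance status.

    Each value is None if the corresponding message is not in the text.
    (Hypothetical helper written for this thread.)
    """
    summary = {"subvolume": None, "drop_key": None, "status": None}
    for line in text.splitlines():
        m = DROPPED_RE.search(line)
        if m:
            summary["subvolume"] = int(m.group(1))
            summary["drop_key"] = tuple(int(g) for g in m.groups()[1:])
        m = STATUS_RE.search(line)
        if m:
            summary["status"] = int(m.group(1))
    return summary


# Two lines copied from the dmesg output in the first post.
SAMPLE = """\
[38714.889735] [  T29728] BTRFS error (device dm-2): cannot relocate partially dropped subvolume 490, drop progress key (853588 108 0)
[38723.736887] [  T29728] BTRFS info (device dm-2): balance: ended with status: -117
"""

if __name__ == "__main__":
    print(summarize_balance_log(SAMPLE))
```

Feeding it the full `dmesg` text instead of `SAMPLE` should work the same way, since lines that match neither pattern are ignored.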

You can try

btrfs subvolume sync /

to wait for completion of subvolume removal.

After running

:~> sudo btrfs subvolume sync /
[sudo] password for root: 
hightower-i5-6600k:~>

the command returned to the prompt very, very quickly. I am unsure whether the command was used correctly. Should a balance be attempted again now?

Well, apparently the background subvolume removal has completed. You can try to start balance again.

A second balance attempt results in the following output:

:~> sudo btrfs balance start /
WARNING:

        Full balance without filters requested. This operation is very
        intense and takes potentially very long. It is recommended to
        use the balance filters to narrow down the scope of balance.
        Use 'btrfs balance start --full-balance' option to skip this
        warning. The operation will start in 10 seconds.
        Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting balance without any filters.
ERROR: error during balancing '/': Structure needs cleaning
There may be more info in syslog - try dmesg | tail
hightower-i5-6600k:~>
:~> dmesg | tail
[93689.781162] [  T69656] BTRFS info (device dm-2): found 16 extents, stage: update data pointers
[93690.667290] [  T69656] BTRFS info (device dm-2): relocating block group 1495819878400 flags data
[93703.323923] [  T69656] BTRFS info (device dm-2): found 33 extents, stage: move data extents
[93705.575991] [  T69656] BTRFS info (device dm-2): found 33 extents, stage: update data pointers
[93706.769453] [  T69656] BTRFS info (device dm-2): relocating block group 1494746136576 flags data
[93725.570642] [  T69656] BTRFS info (device dm-2): found 39 extents, stage: move data extents
[93727.449779] [  T69656] BTRFS info (device dm-2): found 39 extents, stage: update data pointers
[93728.465650] [  T69656] BTRFS info (device dm-2): relocating block group 60294168576 flags metadata|dup
[93736.722689] [  T69656] BTRFS error (device dm-2): cannot relocate partially dropped subvolume 490, drop progress key (853588 108 0)
[93750.594559] [  T69656] BTRFS info (device dm-2): balance: ended with status: -117
hightower-i5-6600k:~>

What are your thoughts on this?

I had a similar problem a while back, which was solved by using a patch for btrfs check. AFAICT this patch has still not been merged into btrfs-progs, and as it was over three years ago, I have no idea whether it still applies. You can ask on the linux-btrfs mailing list and reference the thread "Used space twice as actually consumed".

Apart from fixing it, the open question is how and why it happened now. The bug leading to the problem in the mentioned case was in rather old kernel versions.

Thanks, I have looked at the link provided above, but I have limited understanding. Can you explain how it pertains to the problem at hand?

I believe this may be an older, ongoing situation with this BTRFS filesystem. The original OS was installed around 2020. Eventually the entire drive's contents were cloned with Clonezilla (from a smaller mechanical drive to a larger SSD). Once on the larger SSD, the logical volume was resized (expanded).

I was told on the BTRFS IRC channel a while back that using Clonezilla with BTRFS can cause issues. I have no hard proof of this, though; perhaps you can comment on it? Asking on the Clonezilla SourceForge forum has resulted in no comments thus far: Clonezilla / Discussion / Clonezilla live: btrfs balance error possibly due to partclone.

Also, some time ago I filed a bug report on Bugzilla about this particular BTRFS filesystem coredumping when btrfs check is used: 1219539 – btrfs --check result in a coredump. I have considered removing the report due to inactivity. I do not know what to make of it, but it may help decipher the current problem.

A failing balance points to real trouble. You may want to save all data and create a new file system:

“Btrfs check --repair has no ability to handle this particular and complex corruption.”

https://bugzilla.opensuse.org/show_bug.cgi?id=1228937

Which has nothing to do with this topic.

Symptoms are the same: stuck subvolume removal. I was provided the patch for btrfs-progs that completed the removal. If your case is the same, it would be an additional argument to merge this patch.

Is the email address below the correct one for the linux-btrfs mailing list?
linux-btrfs@vger.kernel.org

I have added the following for reference. It is what I sent to the linux-btrfs@vger.kernel.org mailing list.

----------------------------Beginning of email------------------------------------
I am having problems when running a full balance on a single SSD.

:~> sudo btrfs balance start /
WARNING:

        Full balance without filters requested. This operation is very
        intense and takes potentially very long. It is recommended to
        use the balance filters to narrow down the scope of balance.
        Use 'btrfs balance start --full-balance' option to skip this
        warning. The operation will start in 10 seconds.
        Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting balance without any filters.
ERROR: error during balancing '/': Structure needs cleaning
There may be more info in syslog - try dmesg | tail
hightower-i5-6600k:~> dmesg | tail
[38576.407681] [  T29728] BTRFS info (device dm-2): found 37170 extents, stage: update data pointers
[38584.873805] [  T29728] BTRFS info (device dm-2): relocating block group 64891125760 flags data
[38607.693519] [  T29728] BTRFS info (device dm-2): found 33194 extents, stage: move data extents
[38641.574032] [  T29728] BTRFS info (device dm-2): found 33194 extents, stage: update data pointers
[38649.812477] [  T29728] BTRFS info (device dm-2): relocating block group 62710087680 flags data
[38662.710999] [  T29728] BTRFS info (device dm-2): found 43884 extents, stage: move data extents
[38696.292982] [  T29728] BTRFS info (device dm-2): found 43884 extents, stage: update data pointers
[38708.587669] [  T29728] BTRFS info (device dm-2): relocating block group 60294168576 flags metadata|dup
[38714.889735] [  T29728] BTRFS error (device dm-2): cannot relocate partially dropped subvolume 490, drop progress key (853588 108 0)
[38723.736887] [  T29728] BTRFS info (device dm-2): balance: ended with status: -117
hightower-i5-6600k:~>

After passing,

:~> sudo btrfs subvolume sync /
[sudo] password for root: 
hightower-i5-6600k:~>

the command returned to prompt very, very quickly.

A second balance attempt results in the following output:

:~> sudo btrfs balance start /
WARNING:

        Full balance without filters requested. This operation is very
        intense and takes potentially very long. It is recommended to
        use the balance filters to narrow down the scope of balance.
        Use 'btrfs balance start --full-balance' option to skip this
        warning. The operation will start in 10 seconds.
        Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting balance without any filters.
ERROR: error during balancing '/': Structure needs cleaning
There may be more info in syslog - try dmesg | tail
hightower-i5-6600k:~>
:~> dmesg | tail
[93689.781162] [  T69656] BTRFS info (device dm-2): found 16 extents, stage: update data pointers
[93690.667290] [  T69656] BTRFS info (device dm-2): relocating block group 1495819878400 flags data
[93703.323923] [  T69656] BTRFS info (device dm-2): found 33 extents, stage: move data extents
[93705.575991] [  T69656] BTRFS info (device dm-2): found 33 extents, stage: update data pointers
[93706.769453] [  T69656] BTRFS info (device dm-2): relocating block group 1494746136576 flags data
[93725.570642] [  T69656] BTRFS info (device dm-2): found 39 extents, stage: move data extents
[93727.449779] [  T69656] BTRFS info (device dm-2): found 39 extents, stage: update data pointers
[93728.465650] [  T69656] BTRFS info (device dm-2): relocating block group 60294168576 flags metadata|dup
[93736.722689] [  T69656] BTRFS error (device dm-2): cannot relocate partially dropped subvolume 490, drop progress key (853588 108 0)
[93750.594559] [  T69656] BTRFS info (device dm-2): balance: ended with status: -117
hightower-i5-6600k:~>

Please see this earlier post for a possibly similar case of stuck subvolume removal (arvidjaar’s hyperlink from post #7 is attached).

Is there a patch for btrfs-progs? If so, can the patch be merged?

What are your thoughts on this?

----------------------------------end of email-----------------------------

Email response from the linux-btrfs mailing list:

> :~> sudo dmesg | tail
> [sudo] password for root:
> [44928.672213] [ T96240] BTRFS info (device dm-2): found 55680 extents, stage: update data pointers
> [44937.810972] [ T96240] BTRFS info (device dm-2): found 4 extents, stage: update data pointers
> [44938.590658] [ T96240] BTRFS info (device dm-2): relocating block group 60294168576 flags metadata|dup
> [44945.516661] [ T96240] BTRFS error (device dm-2): cannot relocate partially dropped subvolume 490, drop progress key (853588 108 0)
> [44955.995468] [ T96240] BTRFS info (device dm-2): balance: ended with status: -117
> :~>
>>
>> Along with the kernel version.
> Most current openSUSE Rescue system CD used for btrfs check, uname -a > 6.17.7-1
>>
>> The relocation is rejected because there is a half-dropped subvolume, which is not that common.
>> It may be a problem with the fs that there are some ghost subvolumes that are never dropped.
>>
>> There used to be a kernel bug that can lead to such ghost subvolumes; IIRC the latest btrfs check can detect it.
>>
>> So please also provide the output of "btrfs check --readonly" of the unmounted fs.
>
> :~ # btrfs check --readonly --progress /dev/mapper/system-root
> Opening filesystem to check...
> Checking filesystem on /dev/mapper/system-root
> UUID: 605560ad-fe93-4d09-8760-df0725b43ee1
> [1/8] checking log skipped (none written)
> [1/7] checking root items (0:00:14 elapsed, 5328460 items checked)
> [2/7] checking extents (0:01:01 elapsed, 984830 items checked)
> [3/7] checking free space cache (0:00:12 elapsed, 471 items checked)
> [4/7] checking fs roots (0:04:32 elapsed, 910644 items checked)
> [5/7] checking csums (without verifying data) (0:00:12 elapsed, 895024 items checked)
> fs tree 490 missing orphan item (0:00:00 elapsed, 94 items checked)

So indeed your fs has a half dropped ghost subvolume, most likely caused by some older kernels.

Unfortunately, btrfs-progs doesn't have the ability to repair it yet. I'll craft a branch of btrfs-progs with the repair ability soon.

Meanwhile please prepare an environment to compile btrfs-progs.
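
The subvolume ID that btrfs check reports as missing an orphan item (490) is the same one the kernel refused to relocate during balance. Below is a small Python sketch of that cross-check, assuming both outputs were saved as text. The function names and the two sample strings are hypothetical helpers written for this thread, not part of btrfs-progs.

```python
import re

# Message printed by "btrfs check --readonly" in the output quoted above.
CHECK_RE = re.compile(r"fs tree (\d+) missing orphan item")
# Message printed by the kernel when balance is rejected.
KERNEL_RE = re.compile(r"cannot relocate partially dropped subvolume (\d+)")


def ids_from(pattern, text):
    """Collect every subvolume ID that the pattern matches in the text."""
    return {int(m.group(1)) for m in pattern.finditer(text)}


def ghost_subvolumes(check_output, dmesg_output):
    """IDs flagged by btrfs check that also blocked relocation in the kernel log."""
    return sorted(ids_from(CHECK_RE, check_output) & ids_from(KERNEL_RE, dmesg_output))


# Lines copied from the outputs posted in this thread.
CHECK_SAMPLE = "fs tree 490 missing orphan item (0:00:00 elapsed, 94 items checked)"
DMESG_SAMPLE = (
    "BTRFS error (device dm-2): cannot relocate partially dropped "
    "subvolume 490, drop progress key (853588 108 0)"
)

if __name__ == "__main__":
    print(ghost_subvolumes(CHECK_SAMPLE, DMESG_SAMPLE))
```

An empty result would suggest that the two messages refer to different subvolumes, which would be worth mentioning on the mailing list.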