[btrfs-transacti] heavy disk usage

KDE Tumbleweed user here.

Recently I’ve been experiencing heavy disk usage on my system, to the point where it becomes highly unresponsive and sometimes completely unusable (it can hang for several minutes before unfreezing).

I’ve tried to diagnose the cause with iotop (with and without the -aoP options) and found that [btrfs-transacti] seems to be the worst offender: iotop shows heavy IO and disk writes from it (almost a GB in a couple of hours of regular usage). As I said, this makes the system highly unresponsive, especially when some swap usage is involved.

Some research (here https://btrfs.wiki.kernel.org/index.php/Gotchas#Fragmentation and here https://superuser.com/q/1211324/170331 ) suggests that it has something to do with btrfs fragmentation, but I’m not sure how to pin down exactly what the problem is or how to try to solve it.

Can someone help?

This won’t be of much help.

Yes, I have seen similar behavior. In my case, it was with Leap 15.1 running in a VM.

I have not seen this on real hardware, because I always use “ext4”. But I do test “btrfs” on VMs (currently testing Leap 15.2Alpha with “btrfs”, but no problems yet).

Have you upgraded to the latest Tumbleweed snapshot? A seemingly related bug report was closed last month https://bugzilla.opensuse.org/show_bug.cgi?id=1063638

Are you using an SSD or a hard drive? Are you using the default partitioning? RAID? What does “btrfs fi usage /” show?

If you search for btrfs-transacti on the forum you’ll see more threads about this. One person claims running defrag helped, another claims disabling quota worked.

https://bugzilla.opensuse.org/show_bug.cgi?id=1111523
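For the quota suggestion, here is a minimal sketch of how one might check whether qgroups are enabled on the root filesystem and, if desired, turn them off. The `run` helper and the `DRY_RUN` variable are additions made up for this example, not part of btrfs-progs; with the default `DRY_RUN=1` the script only prints what it would execute. As far as I know, snapper's space-aware cleanup relies on quotas, so disabling them may change how snapshots get pruned.

```shell
#!/bin/sh
# DRY_RUN=1 (the default in this sketch) prints commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# List qgroups for the filesystem mounted at $1.
# btrfs reports an error here if quotas are not enabled.
check_quota() {
    run sudo btrfs qgroup show "$1"
}

# Turn quota accounting off for the filesystem mounted at $1.
disable_quota() {
    run sudo btrfs quota disable "$1"
}

check_quota /
```

Set `DRY_RUN=0` to actually run the commands, and call `disable_quota /` only after deciding that quota accounting is not needed.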

It is unclear whether this was a recurring event (and if so, how often and at what intervals it happened) or just a one-off observation. If it was observed just once, it could well be related to the deletion of a large snapshot, for example.

Yes, fully updated system here. I update it regularly.

I’m on a laptop equipped with a hard drive (no SSD), and the partitioning is pretty standard: btrfs for root and ext4 for /home (plus swap and NTFS partitions for Windows). No RAID; it’s a simple laptop, as I said.

~> sudo btrfs fi usage /
[sudo] password for root: 
Overall:
    Device size:                  53.71GiB
    Device allocated:             48.07GiB
    Device unallocated:            5.64GiB
    Device missing:                  0.00B
    Used:                         38.19GiB
    Free (estimated):             12.47GiB      (min: 9.65GiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              108.05MiB      (used: 0.00B)

Data,single: Size:42.01GiB, Used:35.18GiB
   /dev/sda5      42.01GiB

Metadata,DUP: Size:3.00GiB, Used:1.50GiB
   /dev/sda5       6.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB
   /dev/sda5      64.00MiB

Unallocated:
   /dev/sda5       5.64GiB


and

~> cat /etc/fstab 
UUID=13d2c6ed-8a60-48af-867d-cf56543285f8 swap swap defaults 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a / btrfs defaults 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /boot/grub2/i386-pc btrfs subvol=@/boot/grub2/i386-pc 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /boot/grub2/x86_64-efi btrfs subvol=@/boot/grub2/x86_64-efi 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /opt btrfs subvol=@/opt 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /srv btrfs subvol=@/srv 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /tmp btrfs subvol=@/tmp 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /usr/local btrfs subvol=@/usr/local 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /var/cache btrfs subvol=@/var/cache 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /var/crash btrfs subvol=@/var/crash 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /var/lib/libvirt/images btrfs subvol=@/var/lib/libvirt/images 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /var/lib/machines btrfs subvol=@/var/lib/machines 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /var/lib/mailman btrfs subvol=@/var/lib/mailman 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /var/lib/mariadb btrfs subvol=@/var/lib/mariadb 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /var/lib/mysql btrfs subvol=@/var/lib/mysql 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /var/lib/named btrfs subvol=@/var/lib/named 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /var/lib/pgsql btrfs subvol=@/var/lib/pgsql 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /var/log btrfs subvol=@/var/log 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /var/opt btrfs subvol=@/var/opt 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /var/spool btrfs subvol=@/var/spool 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /var/tmp btrfs subvol=@/var/tmp 0 0
UUID=ddceb69a-5ebf-45ca-a410-18b17dee347a /.snapshots btrfs subvol=@/.snapshots 0 0
UUID=490f1cc3-4911-45a8-a7f0-a5f4f707554a /home                ext4       defaults              1 2
/dev/sda2 /run/media/stefano/WINDOWS ntfs defaults 0 0
/dev/sda3 /run/media/stefano/DATA ntfs defaults 0 0


You’re getting down to only around 10GB free on the system partition. I try to keep at least 20GB free; my btrfs headaches went away after I started that practice. The new default partition size is 80GB.
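To make the space situation easier to read at a glance, here is a small sketch that condenses the "Overall" section of `btrfs fi usage` into one line. The function name `btrfs_alloc_summary` is made up for this example, and the parsing assumes every value is reported in GiB, as in the output posted above.

```shell
#!/bin/sh
# Summarize `btrfs filesystem usage` text read from stdin.
# Prints how much of the device is already allocated to chunks.
btrfs_alloc_summary() {
    awk '
        # Strip the unit suffix; this sketch assumes all values are in GiB.
        function gib(v) { sub(/GiB$/, "", v); return v + 0 }
        /Device size:/        { size = gib($3) }
        /Device allocated:/   { alloc = gib($3) }
        /Device unallocated:/ { unalloc = gib($3) }
        END {
            if (size > 0)
                printf "allocated %.1f%% of %.2f GiB, unallocated %.2f GiB\n",
                       alloc / size * 100, size, unalloc
        }
    '
}

# Usage (needs root and a btrfs root filesystem):
#   sudo btrfs filesystem usage / | btrfs_alloc_summary
```

Fed the numbers posted above, it reports roughly 89.5% of the device allocated with 5.64 GiB unallocated.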

You could try purging old kernels and removing the oldest snapshots, then run disk maintenance.

I’d also check for phantom snapshots. See:

https://en.opensuse.org/SDB:Disk_space

https://en.opensuse.org/SDB:Disable_btrfsmaintenance#Performing_manual_maintenance
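The cleanup steps above could look roughly like the sketch below. The `run` helper, the `DRY_RUN` variable and the `cleanup_root` function are invented for illustration, and the snapshot numbers passed in are hypothetical; pick real ones from the output of `snapper list`. The balance filter value of 50 is likewise just an example. With the default `DRY_RUN=1` nothing is changed, the commands are only printed.

```shell
#!/bin/sh
# DRY_RUN=1 (the default in this sketch) prints commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Purge old kernels, delete the given snapshots, then rebalance data chunks.
# Arguments: snapshot numbers to delete (taken from `snapper list`).
cleanup_root() {
    run sudo zypper purge-kernels
    run sudo snapper list
    for n in "$@"; do
        run sudo snapper delete "$n"
    done
    # Only rewrite data chunks that are at most 50% full.
    run sudo btrfs balance start -dusage=50 /
}

# Usage, with hypothetical snapshot numbers:
#   cleanup_root 12 13
```

To look for phantom snapshots, one can compare the output of `sudo btrfs subvolume list /` with `sudo snapper list` and look for subvolumes under `.snapshots` that snapper does not know about.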

Aside from the information you already found about fragmentation:

COW (copy-on-write) filesystems have many advantages, but they also have some disadvantages, such as fragmentation. Btrfs lays out the data sequentially when files are written to the disk for the first time, but a COW design implies that any subsequent modification to the file must not be written on top of the old data, but be placed in a free block, which will cause fragmentation (RPM databases are a common case of this problem). Additionally, it suffers the fragmentation problems common to all filesystems.

I’m not sure whether you want to run the defragmentation recursively or not. I’ve only used btrfs with an SSD.

https://btrfs.wiki.kernel.org/index.php/UseCases#How_do_I_defragment_many_files.3F

Based on that, if I was going to try it, I would first delete all but the latest snapshot.
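Before defragmenting everything, it may be worth checking which files are actually badly fragmented. The sketch below filters the output of `filefrag` (from e2fsprogs) and keeps only files above a given extent count. The function name `fragmented_files`, the threshold of 500 and the RPM database path in the usage comment are assumptions for illustration; the database may live elsewhere on your system.

```shell
#!/bin/sh
# Read `filefrag` output from stdin and print "<extents> <file>" for every
# file with more extents than the threshold given as $1 (default 100).
fragmented_files() {
    threshold=${1:-100}
    awk -v min="$threshold" '
        / extents? found$/ {
            n = $(NF - 2)     # extent count is the third field from the end
            name = $0
            sub(/: [0-9]+ extents? found$/, "", name)
            if (n + 0 > min) print n, name
        }
    '
}

# Usage (the path is an assumption, adjust as needed):
#   sudo filefrag /var/lib/rpm/* | fragmented_files 500
# A single file can then be defragmented with:
#   sudo btrfs filesystem defragment -v /path/to/file
# or a whole directory tree with the -r option.
```

Keep in mind that defragmenting a file unshares its extents from existing snapshots, so disk usage can grow, which is the reason for removing old snapshots first.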

Tried to free some space and perform some cleanup, but the problem persists.

Over an afternoon (5-6 hours of use) monitored with “iotop -aoP”, btrfs-transacti accumulated about 2GB of disk writes. Another heavy disk user was systemd-journald.
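To put the 2GB figure in perspective, here is a small sketch that turns an accumulated iotop total into an average write rate. It assumes 2GB means about 2048 MiB and takes 5.5 hours as the midpoint of the afternoon; `avg_write_rate` is a name made up for this example.

```shell
#!/bin/sh
# Average write rate from an accumulated iotop total.
# $1 = total written in MiB, $2 = elapsed time in hours
avg_write_rate() {
    awk -v mib="$1" -v hours="$2" 'BEGIN {
        secs = hours * 3600
        printf "%.1f KiB/s\n", mib * 1024 / secs
    }'
}

avg_write_rate 2048 5.5
```

Under those assumptions the average comes out at a little over 100 KiB/s, which is modest as sustained throughput goes, so the stalls may come from bursts of small scattered writes on the hard drive rather than from sheer volume. An average alone cannot confirm that; a timestamped batch log such as `sudo iotop -aoP -b -t -d 60 -n 60 > iotop.log` would show whether the writes arrive in bursts.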