btrfs and pulseaudio problem

Hi,

I have been running Leap 42.3 for some time. Before that I had Leap 42.1, and I upgraded through a new install since there was a release in between (that seemed “safest”). Anyway, there are some log entries that look “serious”, and I did not have them with Leap 42.1, so maybe you guys can say something about them. The first is some sort of kernel crash:


Sep 11 00:58:04 linux-niva kernel: ------------[ cut here ]------------
Sep 11 00:58:04 linux-niva kernel: WARNING: CPU: 0 PID: 386 at ../fs/btrfs/qgroup.c:2466 btrfs_qgroup_free_refroot+0x154/0x180 [btrfs]()
Sep 11 00:58:04 linux-niva kernel: Modules linked in: fuse nf_log_ipv6 xt_comment nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 xt_pkttype xt_tcpudp iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables msr ext4 crc16 jbd2 mbcache kvm irqbypass sp5100_tco acpi_cpufreq r8169 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel drbg ansi_cprng mii tpm_infineon aesni_intel snd_hda_codec_via snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel fjes aes_x86_64 snd_hda_codec lrw gf128mul processor wmi k10temp pcspkr snd_hda_core snd_hwdep glue_helper ablk_helper button snd_pcm i2c_piix4
Sep 11 00:58:04 linux-niva kernel:  cryptd snd_timer fam15h_power snd edac_mce_amd edac_core soundcore shpchp hid_generic usbhid btrfs xor raid6_pq uas usb_storage sr_mod cdrom sd_mod ata_generic ohci_pci amdkfd(O) amd_iommu_v2 crc32c_intel radeon(O) i2c_algo_bit serio_raw drm_kms_helper(O) syscopyarea sysfillrect firewire_ohci xhci_pci firewire_core sysimgblt crc_itu_t fb_sys_fops ohci_hcd ehci_pci xhci_hcd ehci_hcd ahci pata_atiixp libahci ttm(O) usbcore libata usb_common drm(O) sg scsi_mod autofs4
Sep 11 00:58:04 linux-niva kernel: CPU: 0 PID: 386 Comm: btrfs-transacti Tainted: G        W  O     4.4.85-22-default #1
Sep 11 00:58:04 linux-niva kernel: Hardware name: Gigabyte Technology Co., Ltd. GA-970A-UD3/GA-970A-UD3, BIOS F6 05/30/2012
Sep 11 00:58:04 linux-niva kernel:  0000000000000000 ffffffff81339d37 0000000000000000 ffffffffa0612b75
Sep 11 00:58:04 linux-niva kernel:  ffffffff81080681 ffff88032021a488 00000000000a4000 ffff88031ff9cd80
Sep 11 00:58:04 linux-niva kernel:  ffff88032021a400 ffff88031ff9c000 ffffffffa05fa0c4 0000000000000103
Sep 11 00:58:04 linux-niva kernel: Call Trace:
Sep 11 00:58:04 linux-niva kernel:  [<ffffffff81019f29>] dump_trace+0x59/0x320
Sep 11 00:58:04 linux-niva kernel:  [<ffffffff8101a2ea>] show_stack_log_lvl+0xfa/0x180
Sep 11 00:58:04 linux-niva kernel:  [<ffffffff8101b091>] show_stack+0x21/0x40
Sep 11 00:58:04 linux-niva kernel:  [<ffffffff81339d37>] dump_stack+0x5c/0x85
Sep 11 00:58:04 linux-niva kernel:  [<ffffffff81080681>] warn_slowpath_common+0x81/0xb0
Sep 11 00:58:04 linux-niva kernel:  [<ffffffffa05fa0c4>] btrfs_qgroup_free_refroot+0x154/0x180 [btrfs]
Sep 11 00:58:04 linux-niva kernel:  [<ffffffffa05894fe>] commit_fs_roots.isra.20+0x14e/0x190 [btrfs]
Sep 11 00:58:04 linux-niva kernel:  [<ffffffffa058be88>] btrfs_commit_transaction.part.26+0x498/0xad0 [btrfs]
Sep 11 00:58:04 linux-niva kernel:  [<ffffffffa05861cb>] transaction_kthread+0x21b/0x280 [btrfs]
Sep 11 00:58:04 linux-niva kernel:  [<ffffffff8109f302>] kthread+0xd2/0xf0
Sep 11 00:58:04 linux-niva kernel:  [<ffffffff8163180f>] ret_from_fork+0x3f/0x70
Sep 11 00:58:04 linux-niva kernel: DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
Sep 11 00:58:04 linux-niva kernel: 
Sep 11 00:58:04 linux-niva kernel: Leftover inexact backtrace:
Sep 11 00:58:04 linux-niva kernel:  [<ffffffff8109f230>] ? kthread_park+0x50/0x50
Sep 11 00:58:04 linux-niva kernel: ---[ end trace 9c7d7d072defdce1 ]---
Sep 11 00:58:04 linux-niva kernel: BTRFS warning (device sda2): qgroup 259 reserved space underflow, have: 655360, to free: 671744

As you can see, it is some btrfs error. These are quite frequent, but I cannot reproduce them; they just happen, and they seem to be coming more often now. There is no apparent problem as a consequence of these kernel crashes, or whatever they are; nothing gets slower or anything. I have tried running the btrfsmaintenance scripts btrfs-balance and btrfs-scrub, but the warnings come anyway. It is only a warning, so it may not be that bad, but they look a bit ominous (hmmm…). Is anybody else seeing these? The system is updated with the recommended patches.
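Reading the last line of the trace literally, the kernel wanted to release 671744 reserved bytes from qgroup 259 while the counter only held 655360, so it came up 16384 bytes short. Below is a minimal sketch for pulling that shortfall out of log text. The function name `qgroup_shortfall` is made up for illustration, and it assumes the message wording shown above.

```shell
# Hypothetical helper: reads kernel log lines on stdin and prints
# "<qgroup> <bytes short>" for every "reserved space underflow" warning.
qgroup_shortfall() {
    # Keep only the underflow lines and reduce them to "<qgroup> <have> <to free>".
    sed -n 's/.*qgroup \([0-9][0-9]*\) reserved space underflow, have: \([0-9][0-9]*\), to free: \([0-9][0-9]*\).*/\1 \2 \3/p' |
        # Shortfall = bytes the kernel wanted to free minus bytes recorded.
        awk '{ print $1, $3 - $2 }'
}
```

It could be fed with something like `journalctl -k | qgroup_shortfall`.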

Second problem is an error that comes up in the log after every boot.


Sep 11 00:01:54 linux-niva pulseaudio[15928]: [pulseaudio] bluez5-util.c: GetManagedObjects() failed: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.

bluez5 is about Bluetooth, as some googling shows, but I have no Bluetooth on this box and it is disabled in YaST, so why is it coming up? The sound is OK.

These have been bugging me for some time.

Assuming that the “root” (’/’) partition is a Btrfs partition, run the Btrfs “housekeeping” scripts located in the directories “/etc/cron.weekly/” and “/etc/cron.monthly/” from the “root” user’s command line.

It may pay to run the “monthly” Btrfs “scrub” script before running the “weekly” Btrfs “balance” script.
Please note that the “scrub” script usually takes a considerable amount of time (more than a few minutes) to execute, and it doesn’t have a progress indicator.

The Btrfs “housekeeping” scripts are normally executed automatically as ‘cron’ batch jobs.
If your system is one which is often powered off, for example a laptop or a desktop which is powered down overnight, then there’s a chance that the ‘cron’ batch jobs are not being executed as planned.

Yes, it is the root partition that has btrfs, and it is also a box which is powered off almost every night, but as I said, I have been running those maintenance scripts manually and the btrfs errors kept coming. Now I tried running the scrub script first and then the balance, but it does not seem to help. This is a recent example of the problem:


Sep 11 15:00:36 linux-niva kernel: ------------[ cut here ]------------
Sep 11 15:00:36 linux-niva kernel: WARNING: CPU: 5 PID: 10454 at ../fs/btrfs/qgroup.c:2466 btrfs_qgroup_free_refroot+0x154/0x180 [btrfs]()
Sep 11 15:00:36 linux-niva kernel: Modules linked in: fuse nf_log_ipv6 xt_comment nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 xt_pkttype xt_tcpudp iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables msr ext4 crc16 jbd2 mbcache kvm acpi_cpufreq irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel r8169 drbg ansi_cprng aesni_intel mii fjes sp5100_tco aes_x86_64 tpm_infineon processor i2c_piix4 lrw gf128mul snd_hda_codec_via snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep glue_helper snd_pcm k10temp button snd_timer ablk_helper
Sep 11 15:00:36 linux-niva kernel:  pcspkr
Sep 11 15:00:36 linux-niva kernel:  snd cryptd wmi shpchp soundcore edac_mce_amd edac_core fam15h_power btrfs xor raid6_pq hid_generic usbhid sr_mod cdrom sd_mod ata_generic uas usb_storage ohci_pci pata_atiixp amdkfd(O) amd_iommu_v2 crc32c_intel radeon(O) i2c_algo_bit serio_raw drm_kms_helper(O) syscopyarea sysfillrect sysimgblt fb_sys_fops firewire_ohci ttm(O) firewire_core ahci libahci crc_itu_t xhci_pci ohci_hcd ehci_pci xhci_hcd ehci_hcd libata usbcore drm(O) usb_common sg scsi_mod autofs4
Sep 11 15:00:36 linux-niva kernel: CPU: 5 PID: 10454 Comm: kworker/u16:4 Tainted: G        W  O     4.4.85-22-default #1
Sep 11 15:00:36 linux-niva kernel: Hardware name: Gigabyte Technology Co., Ltd. GA-970A-UD3/GA-970A-UD3, BIOS F6 05/30/2012
Sep 11 15:00:36 linux-niva kernel: Workqueue: writeback wb_workfn (flush-btrfs-1)
Sep 11 15:00:36 linux-niva kernel:  0000000000000000 ffffffff81339d37 0000000000000000 ffffffffa0632b75
Sep 11 15:00:36 linux-niva kernel:  ffffffff81080681
Sep 11 15:00:36 linux-niva kernel:  ffff88031fd8fb48 0000000000001000 ffff88018dd3ad80
Sep 11 15:00:36 linux-niva kernel:  ffff88031fd8fac0 ffff88018dd3a000 ffffffffa061a0c4 0000000000000103
Sep 11 15:00:36 linux-niva kernel: Call Trace:
Sep 11 15:00:36 linux-niva kernel:  [<ffffffff81019f29>] dump_trace+0x59/0x320
Sep 11 15:00:36 linux-niva kernel:  [<ffffffff8101a2ea>] show_stack_log_lvl+0xfa/0x180
Sep 11 15:00:36 linux-niva kernel:  [<ffffffff8101b091>] show_stack+0x21/0x40
Sep 11 15:00:36 linux-niva kernel:  [<ffffffff81339d37>] dump_stack+0x5c/0x85
Sep 11 15:00:36 linux-niva kernel:  [<ffffffff81080681>] warn_slowpath_common+0x81/0xb0
Sep 11 15:00:36 linux-niva kernel:  [<ffffffffa061a0c4>] btrfs_qgroup_free_refroot+0x154/0x180 [btrfs]
Sep 11 15:00:36 linux-niva kernel:  [<ffffffffa061a20c>] __btrfs_qgroup_release_data+0x11c/0x130 [btrfs]
Sep 11 15:00:36 linux-niva kernel:  [<ffffffffa05b3101>] cow_file_range_inline+0x4c1/0x730 [btrfs]
Sep 11 15:00:36 linux-niva kernel:  [<ffffffffa05b3691>] cow_file_range.isra.59+0x321/0x4d0 [btrfs]
Sep 11 15:00:36 linux-niva kernel:  [<ffffffffa05b4552>] run_delalloc_range+0x102/0x3f0 [btrfs]
Sep 11 15:00:36 linux-niva kernel:  [<ffffffffa05cd104>] writepage_delalloc.isra.40+0xf4/0x140 [btrfs]
Sep 11 15:00:36 linux-niva kernel:  [<ffffffffa05ce08e>] __extent_writepage+0xbe/0x2d0 [btrfs]
Sep 11 15:00:36 linux-niva kernel:  [<ffffffffa05ce4e1>] extent_write_cache_pages.isra.36.constprop.50+0x241/0x3b0 [btrfs]
Sep 11 15:00:36 linux-niva kernel:  [<ffffffffa05d020d>] extent_writepages+0x4d/0x60 [btrfs]
Sep 11 15:00:36 linux-niva kernel:  [<ffffffff8123cbbd>] __writeback_single_inode+0x3d/0x370
Sep 11 15:00:36 linux-niva kernel:  [<ffffffff8123d3f3>] writeback_sb_inodes+0x233/0x4f0
Sep 11 15:00:36 linux-niva kernel:  [<ffffffff8123d731>] __writeback_inodes_wb+0x81/0xb0
Sep 11 15:00:36 linux-niva kernel:  [<ffffffff8123d9da>] wb_writeback+0x27a/0x310
Sep 11 15:00:36 linux-niva kernel:  [<ffffffff8123e1d0>] wb_workfn+0x2b0/0x3d0
Sep 11 15:00:36 linux-niva kernel:  [<ffffffff810991a5>] process_one_work+0x155/0x440
Sep 11 15:00:36 linux-niva kernel:  [<ffffffff81099cf6>] worker_thread+0x116/0x4b0
Sep 11 15:00:36 linux-niva kernel:  [<ffffffff8109f302>] kthread+0xd2/0xf0
Sep 11 15:00:36 linux-niva kernel:  [<ffffffff8163180f>] ret_from_fork+0x3f/0x70
Sep 11 15:00:36 linux-niva kernel: DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
Sep 11 15:00:36 linux-niva kernel: 
Sep 11 15:00:36 linux-niva kernel: Leftover inexact backtrace:
Sep 11 15:00:36 linux-niva kernel:  [<ffffffff8109f230>] ? kthread_park+0x50/0x50
Sep 11 15:00:36 linux-niva kernel: ---[ end trace 1085b8437ed4d854 ]---
Sep 11 15:00:36 linux-niva kernel: BTRFS warning (device sda2): qgroup 259 reserved space underflow, have: 0, to free: 4096

The trace looks different. Is this crashing a built-in characteristic of btrfs, so that a bunch of maintenance scripts are needed to keep it in shape (but they are not doing the job)? Since the Leap 42.3 installation wanted to use btrfs on root, shouldn’t it be a stable filesystem? These errors were not coming on Leap 42.1. Is this only happening on my machine?

Looking at the trace supplied, it seems that the mainboard is a Gigabyte GA-970A-UD3 (qualified for Windows 7) with a BIOS (F6) dated 05/30/2012.

  • There is a later (F7) BIOS dated 2012.10.22 available from the Gigabyte web site.

The even newer ‘F8f’ BIOS dated 2013.12.16 is tagged as being “Beta BIOS” – probably only for the adventurous.

Is the trace really from the systemd Journal (journalctl)? Or is it generated by the systemd Core Dump utility (coredumpctl)?

Thanx for your response. The traces/crash reports are all from the system log (systemd Journal) and have priority “warning”. This is the current count:


linux-niva:~ # journalctl  -p warning | grep -i "kernel: WARNING: CPU:" | wc -l
875

Maybe a BIOS update is something to think about but I’m hoping for a patch for btrfs…
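A raw count does not say which device or qgroup the warnings belong to. As a rough sketch, the “BTRFS warning” line that closes each trace can be grouped per device and qgroup. The function name `summarize_qgroup_underflows` is an assumption of mine, and the pattern relies on the message wording shown in the traces above.

```shell
# Hypothetical helper: reads journal/dmesg text on stdin and prints
# "<device> <qgroup> <count>" for the "reserved space underflow" warnings.
summarize_qgroup_underflows() {
    # Reduce each matching line to "<device> <qgroup>".
    sed -n 's/.*BTRFS warning (device \([^)]*\)): qgroup \([0-9][0-9]*\) reserved space underflow.*/\1 \2/p' |
        # Count identical pairs and move the count to the last column.
        sort | uniq -c | awk '{ print $2, $3, $1 }'
}
```

For example: `journalctl -k -p warning | summarize_qgroup_underflows`.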

It may be a flaky HDD: please check the SMART health status of your disk(s):


 # smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/sdc -d scsi # /dev/sdc, SCSI device
 # 
 # smartctl --health /dev/sda
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.79-18.26-default] (SUSE RPM)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

 # smartctl --health /dev/sdb
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.79-18.26-default] (SUSE RPM)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

 # smartctl --health /dev/sdc
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.79-18.26-default] (SUSE RPM)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

 # 

It is almost certainly something with my hardware or setup, since nobody else seems to have these crashes; they are quite “visible” if you happen to look in the log. It may be something about the disk:


linux-niva:~ # smartctl --health /dev/sda
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.85-22-default] (SUSE RPM)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Please note the following marginal Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
190 Airflow_Temperature_Cel 0x0022   060   041   045    Old_age   Always   In_the_past 40 (Min/Max 21/42 #902)


I also executed the long test:


linux-niva:~ # smartctl --test long /dev/sda

It takes more than two hours, but everything seems OK:


linux-niva:~ # smartctl -l selftest /dev/sda
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.85-22-default] (SUSE RPM)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     26897         -
  

I also fiddled with the SATA connectors, after powering off of course, and there are fewer of these btrfs errors, only ten today, so I will get new cables and see if it gets better. Let’s close this matter for now and I’ll get back if things don’t improve with new cables. Thanx for your time.

Check fans and air flow; it looks like a thermal problem.
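The hint comes from the attribute table: for Airflow_Temperature_Cel the WORST value (041) is below THRESH (045) while the current VALUE (060) is above it, which is what smartctl reports as “In_the_past”. A small sketch that lists such attributes from the table printed by `smartctl -A` follows. The function name `smart_past_failures` is hypothetical, and the column positions are taken from the header shown above.

```shell
# Hypothetical helper: reads a smartctl attribute table on stdin and prints
# the attributes whose WORST value has reached or dropped below THRESH.
# Columns assumed: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
smart_past_failures() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 && $6 + 0 > 0 && $5 + 0 <= $6 + 0 {
        # A threshold of 0 is skipped, since such an attribute never trips.
        printf "%s %s worst=%d thresh=%d\n", $1, $2, $5, $6
    }'
}
```

For example: `smartctl -A /dev/sda | smart_past_failures`.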

I am seeing the same problem on Leap 42.2, 4.4.87-18.29-default, 65 times so far.

All my disks are clean; smartctl reports:
SMART overall-health self-assessment test result: PASSED

How do I know which filesystem this is?

I’ve just run the btrfs cron scripts, so I will see if this stops the errors.

What do these errors actually mean? Is my filesystem slowly getting corrupted?

[17231.790035] ------------[ cut here ]------------
[17231.790062] WARNING: CPU: 1 PID: 510 at …/fs/btrfs/qgroup.c:2468 btrfs_qgroup_free_refroot+0x154/0x180 btrfs
[17231.790094] Modules linked in: fuse xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 xt_tcpudp tun br_netfilter bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet iscsi_ibft iscsi_boot_sysfs it87 hwmon_vid dm_mod kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul ext4 ghash_clmulni_intel crc16 jbd2 mbcache drbg ansi_cprng aesni_intel aes_x86_64 snd_hda_codec_realtek snd_hda_codec_generic lrw snd_hda_intel nls_iso8859_1 nls_cp437 vfat fat gf128mul snd_hda_codec glue_helper joydev edac_mce_amd fam15h_power edac_core snd_usb_audio pcspkr snd_usbmidi_lib snd_rawmidi snd_seq_device ablk_helper k10temp acpi_cpufreq cryptd snd_hda_core sp5100_tco snd_hwdep snd_pcm snd_timer snd r8169 mii processor soundcore fjes shpchp i2c_piix4 btrfs xor raid6_pq
[17231.790104] raid1 hid_generic md_mod hid_microsoft usbhid sr_mod cdrom sd_mod ata_generic nvidia_uvm(PO) ohci_pci nvidia(PO) crc32c_intel ahci libahci xhci_pci xhci_hcd pata_atiixp ehci_pci ohci_hcd libata ehci_hcd usbcore usb_common drm button sg scsi_mod efivarfs autofs4
[17231.790106] CPU: 1 PID: 510 Comm: btrfs-transacti Tainted: P W O 4.4.87-18.29-default #1
[17231.790106] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./970A-DS3P, BIOS FD 02/26/2016
[17231.790108] 0000000000000000 ffffffff8132a2a7 0000000000000000 ffffffffa0d94b2c
[17231.790109] ffffffff8107ef11 ffff88042b1ec248 0000000000001000 ffff8800bd2f4d70
[17231.790110] ffff88042b1ec1c0 ffff8800bd2f4000 ffffffffa0d7c3e4 0000000000000117
[17231.790111] Call Trace:
[17231.790120] [<ffffffff81019ea9>] dump_trace+0x59/0x320
[17231.790123] [<ffffffff8101a26a>] show_stack_log_lvl+0xfa/0x180
[17231.790129] [<ffffffff8101b011>] show_stack+0x21/0x40
[17231.790134] [<ffffffff8132a2a7>] dump_stack+0x5c/0x85
[17231.790139] [<ffffffff8107ef11>] warn_slowpath_common+0x81/0xb0
[17231.790156] [<ffffffffa0d7c3e4>] btrfs_qgroup_free_refroot+0x154/0x180 [btrfs]
[17231.790176] [<ffffffffa0cf261e>] __btrfs_run_delayed_refs.constprop.76+0x2de/0x13c0 [btrfs]
[17231.790191] [<ffffffffa0cf6798>] btrfs_run_delayed_refs+0x78/0x320 [btrfs]
[17231.790207] [<ffffffffa0d0e894>] btrfs_commit_transaction+0x24/0x60 [btrfs]
[17231.790224] [<ffffffffa0d0857b>] transaction_kthread+0x21b/0x280 [btrfs]
[17231.790228] [<ffffffff8109dc52>] kthread+0xd2/0xf0
[17231.790233] [<ffffffff81610bcf>] ret_from_fork+0x3f/0x70
[17231.791590] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70

[17231.791590] Leftover inexact backtrace:

[17231.791593] [<ffffffff8109db80>] ? kthread_park+0x50/0x50
[17231.791593] ---[ end trace 4549bb5b344d76c4 ]---
[17231.791595] BTRFS warning (device md127): qgroup 279 reserved space underflow, have: 0, to free: 4096

NAME              MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda               8:0      0 931.5G  0 disk
├─sda1            8:1      0     1G  0 part  /boot/efi
├─sda2            8:2      0     8G  0 part  [SWAP]
└─sda3            8:3      0 922.5G  0 part
  └─md127         9:127    0 922.5G  0 raid1 /boot/grub2/x86_64-efi
sdb               8:16     0 931.5G  0 disk
├─sdb1            8:17     0     1G  0 part  /extra
├─sdb2            8:18     0     8G  0 part  [SWAP]
└─sdb3            8:19     0 922.5G  0 part
  └─md127         9:127    0 922.5G  0 raid1 /boot/grub2/x86_64-efi
sdc               8:32     0   1.8T  0 disk
├─sdc1            8:33     0   399M  0 part
├─sdc2            8:34     0    20G  0 part
└─sdc3            8:35     0   1.8T  0 part
  └─md126         9:126    0   1.8T  0 raid1
    ├─Raid-home   253:0    0   300G  0 lvm   /home
    ├─Raid-mylife 253:1    0   900G  0 lvm   /mylife
    ├─Raid-root   253:2    0   200G  0 lvm
    └─Raid-tmp    253:3    0    10G  0 lvm
sdd               8:48     0   1.8T  0 disk
├─sdd1            8:49     0   399M  0 part
├─sdd2            8:50     0    20G  0 part
└─sdd3            8:51     0   1.8T  0 part
  └─md126         9:126    0   1.8T  0 raid1
    ├─Raid-home   253:0    0   300G  0 lvm   /home
    ├─Raid-mylife 253:1    0   900G  0 lvm   /mylife
    ├─Raid-root   253:2    0   200G  0 lvm
    └─Raid-tmp    253:3    0    10G  0 lvm

$ mount |grep btrfs
/dev/md127 on / type btrfs (rw,relatime,space_cache,subvolid=259,subvol=/@/.snapshots/1/snapshot)
/dev/md127 on /var/tmp type btrfs (rw,relatime,space_cache,subvolid=279,subvol=/@/var/tmp)
/dev/md127 on /srv type btrfs (rw,relatime,space_cache,subvolid=264,subvol=/@/srv)
/dev/md127 on /var/lib/mysql type btrfs (rw,relatime,space_cache,subvolid=273,subvol=/@/var/lib/mysql)
/dev/md127 on /var/spool type btrfs (rw,relatime,space_cache,subvolid=278,subvol=/@/var/spool)
/dev/md127 on /var/lib/mailman type btrfs (rw,relatime,space_cache,subvolid=271,subvol=/@/var/lib/mailman)
/dev/md127 on /var/lib/machines type btrfs (rw,relatime,space_cache,subvolid=270,subvol=/@/var/lib/machines)
/dev/md127 on /home.old type btrfs (rw,relatime,space_cache,subvolid=262,subvol=/@/home)
/dev/md127 on /opt type btrfs (rw,relatime,space_cache,subvolid=263,subvol=/@/opt)
/dev/md127 on /.snapshots type btrfs (rw,relatime,space_cache,subvolid=258,subvol=/@/.snapshots)
/dev/md127 on /var/log type btrfs (rw,relatime,space_cache,subvolid=276,subvol=/@/var/log)
/dev/md127 on /var/lib/mariadb type btrfs (rw,relatime,space_cache,subvolid=272,subvol=/@/var/lib/mariadb)
/dev/md127 on /usr/local type btrfs (rw,relatime,space_cache,subvolid=266,subvol=/@/usr/local)
/dev/md127 on /var/lib/pgsql type btrfs (rw,relatime,space_cache,subvolid=275,subvol=/@/var/lib/pgsql)
/dev/md127 on /var/lib/named type btrfs (rw,relatime,space_cache,subvolid=274,subvol=/@/var/lib/named)
/dev/md127 on /var/cache type btrfs (rw,relatime,space_cache,subvolid=267,subvol=/@/var/cache)
/dev/md127 on /tmp type btrfs (rw,relatime,space_cache,subvolid=265,subvol=/@/tmp)
/dev/md127 on /var/lib/libvirt/images type btrfs (rw,relatime,space_cache,subvolid=269,subvol=/@/var/lib/libvirt/images)
/dev/md127 on /boot/grub2/i386-pc type btrfs (rw,relatime,space_cache,subvolid=260,subvol=/@/boot/grub2/i386-pc)
/dev/md127 on /var/crash type btrfs (rw,relatime,space_cache,subvolid=268,subvol=/@/var/crash)
/dev/md127 on /var/opt type btrfs (rw,relatime,space_cache,subvolid=277,subvol=/@/var/opt)
/dev/md127 on /boot/grub2/x86_64-efi type btrfs (rw,relatime,space_cache,subvolid=261,subvol=/@/boot/grub2/x86_64-efi)
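As for which filesystem is meant: the warning names the device (md127) and a qgroup number, and for per-subvolume qgroups that number should correspond to the subvolid in the mount options. In the listing above, subvolid=279 is /var/tmp. A minimal sketch of that lookup follows. The function name `mountpoint_for_qgroup` is invented for illustration, and it assumes mount points without spaces.

```shell
# Hypothetical helper: reads `mount` output on stdin and prints the mount
# point(s) of the btrfs subvolume whose subvolid equals the given number.
mountpoint_for_qgroup() {
    qgroup_id=$1
    awk -v id="$qgroup_id" '
        # Fields: <device> on <mountpoint> type <fstype> (<options>)
        # Require "," or ")" after the id so that 27 does not match 279.
        $5 == "btrfs" && $0 ~ ("subvolid=" id "[,)]") { print $3 }
    '
}
```

For example: `mount | mountpoint_for_qgroup 279`.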

The cron scripts didn’t help; I just got another:

[19132.347795] ------------[ cut here ]------------
[19132.347861] WARNING: CPU: 3 PID: 510 at …/fs/btrfs/qgroup.c:2468 btrfs_qgroup_free_refroot+0x154/0x180 btrfs
[19132.347862] Modules linked in: fuse xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 xt_tcpudp tun br_netfilter bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet iscsi_ibft iscsi_boot_sysfs it87 hwmon_vid dm_mod kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul ext4 ghash_clmulni_intel crc16 jbd2 mbcache drbg ansi_cprng aesni_intel aes_x86_64 snd_hda_codec_realtek snd_hda_codec_generic lrw snd_hda_intel nls_iso8859_1 nls_cp437 vfat fat gf128mul snd_hda_codec glue_helper joydev edac_mce_amd fam15h_power edac_core snd_usb_audio pcspkr snd_usbmidi_lib snd_rawmidi snd_seq_device ablk_helper k10temp acpi_cpufreq cryptd snd_hda_core sp5100_tco snd_hwdep snd_pcm snd_timer snd r8169 mii processor soundcore fjes shpchp i2c_piix4 btrfs xor raid6_pq
[19132.347967] raid1 hid_generic md_mod hid_microsoft usbhid sr_mod cdrom sd_mod ata_generic nvidia_uvm(PO) ohci_pci nvidia(PO) crc32c_intel ahci libahci xhci_pci xhci_hcd pata_atiixp ehci_pci ohci_hcd libata ehci_hcd usbcore usb_common drm button sg scsi_mod efivarfs autofs4
[19132.347998] CPU: 3 PID: 510 Comm: btrfs-transacti Tainted: P W O 4.4.87-18.29-default #1
[19132.348003] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./970A-DS3P, BIOS FD 02/26/2016
[19132.348005] 0000000000000000 ffffffff8132a2a7 0000000000000000 ffffffffa0d94b2c
[19132.348011] ffffffff8107ef11 ffff88042b1ec248 0000000000018000 ffff8800bd2f4d70
[19132.348016] ffff88042b1ec1c0
[19132.348018] ffff8800bd2f4000 ffffffffa0d7c3e4 0000000000000117
[19132.348022] Call Trace:
[19132.348042] [<ffffffff81019ea9>] dump_trace+0x59/0x320
[19132.348055] [<ffffffff8101a26a>] show_stack_log_lvl+0xfa/0x180
[19132.348062] [<ffffffff8101b011>] show_stack+0x21/0x40
[19132.348072] [<ffffffff8132a2a7>] dump_stack+0x5c/0x85
[19132.348080] [<ffffffff8107ef11>] warn_slowpath_common+0x81/0xb0
[19132.348129] [<ffffffffa0d7c3e4>] btrfs_qgroup_free_refroot+0x154/0x180 [btrfs]
[19132.348182] [<ffffffffa0d0b8ae>] commit_fs_roots.isra.20+0x14e/0x190 [btrfs]
[19132.348234] [<ffffffffa0d0e238>] btrfs_commit_transaction.part.26+0x498/0xad0 [btrfs]
[19132.348279] [<ffffffffa0d0857b>] transaction_kthread+0x21b/0x280 [btrfs]
[19132.348290] [<ffffffff8109dc52>] kthread+0xd2/0xf0
[19132.348298] [<ffffffff81610bcf>] ret_from_fork+0x3f/0x70
[19132.353164] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70

[19132.353166] Leftover inexact backtrace:

[19132.353172] [<ffffffff8109db80>] ? kthread_park+0x50/0x50
[19132.353198] ---[ end trace 4549bb5b344d76c5 ]---
[19132.353202] BTRFS warning (device md127): qgroup 279 reserved space underflow, have: 94208, to free: 98304

Well, I tried to muck around a bit…

  • Changed the SATA cables, and while the lid was open I rearranged the hard drives (old mechanical ones) so there are several centimeters of air between them, 7-8 cm or so, and I’m quite sure they should not produce that much heat
  • All fans are working
  • Left the lid open so there should be enough circulation
  • Checked the file system by booting into the rescue system and running, on the unmounted fs,
# btrfs check /dev/sda2

There were no errors so the file system is supposedly fine then.

Did any of the above help? No!!!

The crash warnings still appear in the log. The accumulated count is now 2851 of these btrfs crash warnings. Still, nothing really happens, the machine runs along and I can keep on using it as usual. It’s just a bit disturbing with these messages…

I have installed lm_sensors to keep a closer look. The temperature and disk activity are normally low, especially disk I/O, so it is hard to connect the messages with the disk falling apart or the processor frying to pieces or anything really…

One thing though: I’m running BOINC (distributed number crunching etc.) and I’m allowing it to use all eight processor cores. When I increase the percentage of processor time BOINC is allowed to use, the btrfs crashes start to come more often, and the processor temperature also increases, of course. I have no numbers on exactly how many messages appear at a given percentage of processor time, but it is very easy to see the difference. The processor temperature stays at around 60 C at most, according to sensors; I don’t give BOINC any higher percentage than that. Just now BOINC has 30% processor use and the temperature is 40-45 C, and the btrfs messages still appear, but not very often: 7-8 today so far.

What do you think about that?
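To put rough numbers on how often the messages arrive at a given BOINC setting, the WARNING headers can be counted per hour. This is only a sketch. The function name `warnings_per_hour` is made up, and it assumes the default journalctl output format (“Sep 11 00:58:04 host kernel: …”) in chronological order.

```shell
# Hypothetical helper: reads journalctl output on stdin and prints
# "<month> <day> <hour>:00 <count>" for kernel WARNING headers.
warnings_per_hour() {
    # Keep one "<month> <day> <hour>:00" key per WARNING header line.
    awk '/kernel: WARNING: CPU:/ { print $1, $2, substr($3, 1, 2) ":00" }' |
        # Input is chronological, so equal keys are adjacent; count them.
        uniq -c |
        awk '{ print $2, $3, $4, $1 }'
}
```

For example: `journalctl -p warning | warnings_per_hour`, noting the BOINC percentage that was active in each hour.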

If you mean “qgroup 259 reserved space underflow”: these are not “crashes”; they are overeager debug messages that have been removed (actually moved under the debug build) in the current kernel. So you can ignore them.

I have not seen any other “crashes” on this thread.
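One way to check that nothing else is hiding among them is to list the WARNING headers that do not come from btrfs_qgroup_free_refroot. A minimal sketch follows. The function name `other_kernel_warnings` is hypothetical, and it assumes journalctl-style lines like the ones posted earlier.

```shell
# Hypothetical helper: reads journalctl output on stdin and prints only the
# kernel WARNING headers that are not the qgroup underflow ones.
other_kernel_warnings() {
    awk '/kernel: WARNING: CPU:/ && !/btrfs_qgroup_free_refroot/'
}
```

For example: `journalctl -p warning | other_kernel_warnings`. Empty output would mean every logged warning header is one of the qgroup messages.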

On the Berkeley BOINC download URL <https://boinc.berkeley.edu/download.php/> there is the note:

Tested on the current Ubuntu distribution; may work on others.
If available, we recommend that you install a distribution-specific package instead.

And on the distribution-specific page (“Installing BOINC - BOINC”), openSUSE isn’t mentioned.

On the other hand, doing a “zypper search” for BOINC revealed that there are ‘boinc-client’ and ‘boinc-manager’ packages in the main openSUSE Leap 42.3 OSS repository, as well as a ‘libboinc7’ package.

  • Did you install your BOINC executables from the openSUSE Leap 42.3 OSS repository?

Ok, I’m not quite sure what “overeager debug messages” are but let’s assume the error messages are logged with priority “warning” when they should be priority “debug”, which is a bug in itself, and they are going to be fixed in some future kernel. Fine, so they should be ignored.

I have installed BOINC from the openSUSE Leap 42.3 OSS repository. BOINC can easily put the processor under heavy load and for some reason that load triggers these btrfs messages. Since they are not warnings even if it seems so let’s just forget this. False alarm! Thanx for your time.

Therefore, given that BOINC is supposed to be a background task offering system resources to the Internet community, it may pay to adopt the same strategy as in the case where one or more Virtual Machines are running on the system:

  • Attempt to configure BOINC such that it consumes at most, say, 50 % of the system’s resources.