btrfs skinny extents unable to fix. What can I do?

Hello

My openSUSE Leap 42.2 does not boot anymore.
I think the cause was a power loss during shutdown.
I got "has skinny extents" and "space cache generation (…) does not match inode (…)" messages.

I’ve already tried the following with the btrfs utilities from 42.2:

btrfsck /dev/sdc18
btrfs rescue zero-log /dev/sdc18
btrfs check --repair

Now I have downloaded a Leap 15 live system to get some output.
Mounting /dev/sdc18 from the live system hangs with the following dmesg output:


 5746.091900] BTRFS info (device sdc18): disk space caching is enabled
 5746.091903] BTRFS info (device sdc18): has skinny extents
 5746.701263] BTRFS error (device sdc18): space cache generation (510509) does not match inode (510512)
 5746.701271] BTRFS warning (device sdc18): failed to load free space cache for block group 2176843776, rebuilding it now
 5747.850762] BTRFS: Transaction aborted (error -117)
 5747.850800] ------------[ cut here ]------------
 5747.850842] WARNING: CPU: 1 PID: 3489 at ../fs/btrfs/extent-tree.c:6995 __btrfs_free_extent.isra.64+0xb9d/0xd40 [btrfs]
 5747.850843] Modules linked in: af_packet xt_tcpudp ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables msr raid1 md_mod snd_hda_codec_via snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer intel_powerclamp snd iTCO_wdt joydev gpio_ich coretemp iTCO_vendor_support soundcore kvm_intel shpchp i2c_i801 lpc_ich i7core_edac pcspkr kvm asus_atk0110 acpi_cpufreq irqbypass overlay
 5747.850906]  nls_utf8 isofs squashfs btrfs xor raid6_pq sr_mod cdrom hid_generic usbhid uas usb_storage ata_generic pata_acpi amdkfd pata_marvell amd_iommu_v2 firewire_ohci radeon i2c_algo_bit crc32c_intel 8139too serio_raw 8139cp ata_piix firewire_core crc_itu_t r8169 mii ahci libahci scsi_transport_iscsi pata_jmicron xhci_pci drm_kms_helper syscopyarea sysfillrect sysimgblt ehci_pci fb_sys_fops xhci_hcd ehci_hcd ttm usbcore drm drm_panel_orientation_quirks button sunrpc dm_mirror dm_region_hash dm_log loop sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua
 5747.850947] CPU: 1 PID: 3489 Comm: mount Not tainted 4.12.14-lp150.12.16-default #1 openSUSE Leap 15.0
 5747.850948] Hardware name: System manufacturer System Product Name/P7P55D-E, BIOS 1504    12/14/2010
 5747.850949] task: ffff8803480d2080 task.stack: ffffc900018c0000
 5747.850959] RIP: 0010:__btrfs_free_extent.isra.64+0xb9d/0xd40 [btrfs]
 5747.850960] RSP: 0018:ffffc900018c36f8 EFLAGS: 00010292
 5747.850961] RAX: 0000000000000027 RBX: 0000000000000000 RCX: 0000000000000000
 5747.850962] RDX: ffff88040f25fd40 RSI: ffff88040f257a68 RDI: ffff88040f257a68
 5747.850963] RBP: 00000002b7b14000 R08: 000000000000043a R09: 0000000000000001
 5747.850963] R10: ffff88035ca5be00 R11: 0000000000000001 R12: ffff88040c748000
 5747.850964] R13: 00000000ffffff8b R14: ffff880379ba98e8 R15: ffff8804020a7bd0
 5747.850965] FS:  00007f0884392fc0(0000) GS:ffff88040f240000(0000) knlGS:0000000000000000
 5747.850966] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 5747.850967] CR2: 00005603a52c1c28 CR3: 0000000366302000 CR4: 00000000000006e0
 5747.850967] Call Trace:
 5747.850980]  ? block_group_cache_tree_search+0x22/0xd0 [btrfs]
 5747.850989]  ? update_block_group.isra.63+0x142/0x3f0 [btrfs]
 5747.851002]  ? btrfs_merge_delayed_refs+0x62/0x4f0 [btrfs]
 5747.851012]  __btrfs_run_delayed_refs+0x5b9/0x1300 [btrfs]
 5747.851023]  btrfs_run_delayed_refs+0x68/0x250 [btrfs]
 5747.851034]  btrfs_write_dirty_block_groups+0x146/0x360 [btrfs]
 5747.851045]  commit_cowonly_roots+0x220/0x2c0 [btrfs]
 5747.851057]  btrfs_commit_transaction+0x389/0x900 [btrfs]
 5747.851071]  btrfs_recover_log_trees+0x3c4/0x440 [btrfs]
 5747.851082]  ? btree_read_extent_buffer_pages+0xca/0x1f0 [btrfs]
 5747.851095]  ? replay_one_extent+0x720/0x720 [btrfs]
 5747.851106]  open_ctree+0x238f/0x2480 [btrfs]
 5747.851115]  btrfs_mount+0xdd0/0xeb0 [btrfs]
 5747.851119]  ? pcpu_next_unpop+0x3b/0x50
 5747.851120]  ? pcpu_alloc+0x242/0x650
 5747.851122]  mount_fs+0x35/0x150
 5747.851124]  vfs_kern_mount.part.20+0x54/0x100
 5747.851133]  btrfs_mount+0x18a/0xeb0 [btrfs]
 5747.851135]  ? pcpu_next_unpop+0x3b/0x50
 5747.851136]  ? pcpu_alloc+0x242/0x650
 5747.851137]  mount_fs+0x35/0x150
 5747.851139]  vfs_kern_mount.part.20+0x54/0x100
 5747.851140]  do_mount+0x512/0xc30
 5747.851142]  ? memdup_user+0x3e/0x70
 5747.851143]  SyS_mount+0x80/0xd0
 5747.851145]  do_syscall_64+0x7b/0x150
 5747.851148]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
 5747.851149] RIP: 0033:0x7f0883c6e19a
 5747.851149] RSP: 002b:00007ffcea57e518 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
 5747.851151] RAX: ffffffffffffffda RBX: 00005626551c0170 RCX: 00007f0883c6e19a
 5747.851151] RDX: 00005626551ced10 RSI: 00005626551c0430 RDI: 00005626551c0350
 5747.851152] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000004
 5747.851153] R10: 00000000c0ed0000 R11: 0000000000000246 R12: 00005626551c0350
 5747.851153] R13: 00005626551ced10 R14: 0000000000000000 R15: 00007f08841851c4
 5747.851154] Code: 00 00 48 c7 c6 c0 c7 80 a0 4c 89 f7 41 bd ea ff ff ff e8 ad d0 09 00 e9 a0 f5 ff ff 44 89 ee 48 c7 c7 30 31 81 a0 e8 89 d6 a3 e0 <0f> 0b e9 73 f5 ff ff 49 8b 46 60 f0 0f ba a8 30 17 00 00 02 72
 5747.851172] ---[ end trace f04980e395b08e75 ]---
 5747.851174] BTRFS: error (device sdc18) in __btrfs_free_extent:6995: errno=-117 unknown
 5747.851176] BTRFS: error (device sdc18) in btrfs_run_delayed_refs:3016: errno=-117 unknown
 5747.851192] BTRFS warning (device sdc18): Skipping commit of aborted transaction.
 5747.851193] BTRFS: error (device sdc18) in cleanup_transaction:1876: errno=-117 unknown
 5747.851315] BTRFS: error (device sdc18) in btrfs_replay_log:2545: errno=-117 unknown (Failed to recover log tree)
 5747.851521] BTRFS error (device sdc18): cleaner transaction attach returned -30
 5747.859941] BUG: unable to handle kernel NULL pointer dereference at 0000000000000024
 5747.860025] IP: btrfs_search_slot+0xd5/0xa30 [btrfs]
 5747.860071] PGD 0 P4D 0
 5747.860119] Oops: 0002 [#1] SMP PTI
 5747.860144] Modules linked in: af_packet xt_tcpudp ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables msr raid1 md_mod snd_hda_codec_via snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer intel_powerclamp snd iTCO_wdt joydev gpio_ich coretemp iTCO_vendor_support soundcore kvm_intel shpchp i2c_i801 lpc_ich i7core_edac pcspkr kvm asus_atk0110 acpi_cpufreq irqbypass overlay
 5747.860239]  nls_utf8 isofs squashfs btrfs xor raid6_pq sr_mod cdrom hid_generic usbhid uas usb_storage ata_generic pata_acpi amdkfd pata_marvell amd_iommu_v2 firewire_ohci radeon i2c_algo_bit crc32c_intel 8139too serio_raw 8139cp ata_piix firewire_core crc_itu_t r8169 mii ahci libahci scsi_transport_iscsi pata_jmicron xhci_pci drm_kms_helper syscopyarea sysfillrect sysimgblt ehci_pci fb_sys_fops xhci_hcd ehci_hcd ttm usbcore drm drm_panel_orientation_quirks button sunrpc dm_mirror dm_region_hash dm_log loop sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua
 5747.860309] CPU: 1 PID: 3095 Comm: kworker/u32:1 Tainted: G        W        4.12.14-lp150.12.16-default #1 openSUSE Leap 15.0
 5747.860329] Hardware name: System manufacturer System Product Name/P7P55D-E, BIOS 1504    12/14/2010
 5747.860364] Workqueue: btrfs-cache btrfs_cache_helper [btrfs]
 5747.860383] task: ffff8802ff09a000 task.stack: ffffc90008874000
 5747.860411] RIP: 0010:btrfs_search_slot+0xd5/0xa30 [btrfs]
 5747.860430] RSP: 0018:ffffc90008877c78 EFLAGS: 00010246
 5747.860449] RAX: 0000000000000000 RBX: ffff8804020a7770 RCX: ffff8804020a7770
 5747.860468] RDX: ffffc90008877d47 RSI: ffff88034813a800 RDI: 0000000000000000
 5747.860487] RBP: 000000000000012d R08: 0000000000000000 R09: 0000000000000000
 5747.860506] R10: ffff8802f9ed3f18 R11: ffff880000000000 R12: ffff880000000000
 5747.860525] R13: ffffc90008877d47 R14: 0000000000000000 R15: ffff88034813a800
 5747.860544] FS:  0000000000000000(0000) GS:ffff88040f240000(0000) knlGS:0000000000000000
 5747.860564] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 5747.860583] CR2: 0000000000000024 CR3: 000000000200a000 CR4: 00000000000006e0
 5747.860601] Call Trace:
 5747.860631]  ? read_block_for_search.isra.35+0x189/0x350 [btrfs]
 5747.860660]  btrfs_next_old_leaf+0xe8/0x480 [btrfs]
 5747.860689]  caching_thread+0x2c8/0x490 [btrfs]
 5747.860722]  btrfs_worker_helper+0x81/0x300 [btrfs]
 5747.860743]  process_one_work+0x1da/0x3f0
 5747.860763]  worker_thread+0x2b/0x3f0
 5747.860783]  ? process_one_work+0x3f0/0x3f0
 5747.860802]  kthread+0x11a/0x130
 5747.860821]  ? kthread_create_on_node+0x40/0x40
 5747.860841]  ret_from_fork+0x35/0x40
 5747.860859] Code: 48 89 cb 49 89 d5 49 89 f7 48 89 7c 24 10 0f b6 43 6a a8 10 0f 84 a2 04 00 00 a8 20 0f 85 6f 07 00 00 49 8b 47 08 48 89 44 24 48 <f0> ff 40 24 48 8b 44 24 48 48 ba 00 00 00 00 00 16 00 00 48 b9
 5747.860910] RIP: btrfs_search_slot+0xd5/0xa30 [btrfs] RSP: ffffc90008877c78
 5747.860929] CR2: 0000000000000024
 5747.872347] ---[ end trace f04980e395b08e76 ]---

btrfsck /dev/sdc18 gives the following output:


Checking filesystem on /dev/sdc18
UUID: 5f51d84f-9c5e-4751-b0dd-93b384cea9b0
The following tree block(s) is corrupted in tree 259:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 264:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 357:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 507:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 508:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 561:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 568:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 615:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 616:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 617:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 618:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 624:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 625:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 626:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 627:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 628:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 629:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 640:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 641:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 642:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 643:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
The following tree block(s) is corrupted in tree 644:
        tree block bytenr: 67238363136, level: 0, node key: (11669569536, 169, 0)
found 31901081600 bytes used, error(s) found
total csum bytes: 24516844
total tree bytes: 1110753280
total fs tree bytes: 1010794496
total extent tree bytes: 66650112
btree space waste bytes: 196159090
file data blocks allocated: 158332096512
 referenced 76513964032

What should I do next? Run btrfs rescue zero-log /dev/sdc18 again?
Could I restore a btrfs snapshot of /, and would that solve the problem?

Thanks for any hints!!

Well, you just reduced your chances of repairing this filesystem. "btrfs rescue zero-log" and "btrfs check --repair" should never be used unless explicitly recommended by developers who understand what is corrupted. You should have posted to the btrfs mailing list before doing anything.

What should I do next? Run btrfs rescue zero-log /dev/sdc18 again?

No.

Could I restore a btrfs snapshot of /, and would that solve the problem?

To restore a snapshot you need to mount the filesystem first.

At this point you should attempt to preserve any data that you may need using “btrfs restore” (use as recent a version of the tools as possible, e.g. from Tumbleweed) while contacting the btrfs mailing list for guidance. They can analyse the extent of the damage and guide you on how to fix it.
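
As a rough sketch of what such a "btrfs restore" run could look like: the script below only prints the command lines so they can be reviewed before anything is run. The destination /mnt/rescue and the restore_cmd helper are assumptions made for illustration; the destination must be on a different, healthy filesystem with enough free space. -D is the dry-run option (list files without writing them), -v is verbose and -i continues past errors.

```shell
#!/bin/sh
# Sketch only. DEV is the damaged partition from this thread; DEST is an
# assumed mount point on a DIFFERENT, healthy disk (hypothetical path).
DEV="${DEV:-/dev/sdc18}"
DEST="${DEST:-/mnt/rescue}"

# restore_cmd is a hypothetical helper that builds the command line.
#   dry  -> add -D, so btrfs restore only lists what it would copy
#   copy -> actually copy the files to $DEST
restore_cmd() {
    mode="$1"
    if [ "$mode" = "dry" ]; then
        echo "btrfs restore -D -v -i $DEV $DEST"
    else
        echo "btrfs restore -v -i $DEV $DEST"
    fi
}

# Print both commands for review; nothing is executed here.
restore_cmd dry
restore_cmd copy
```

btrfs restore reads from the unmounted device and writes only to the destination, so it does not modify the damaged filesystem. Run the dry run first, as root, and start the real copy only if the file list looks plausible.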

P.S. Barring kernel bugs, this may indicate that your hard disk lies, pretending it has committed data to stable storage when it has not. This is not unheard of. Which exact HDD do you have?
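
For anyone who wants to answer that question from a shell, a minimal sketch follows. It assumes that the partition sdc18 lives on the whole disk /dev/sdc and that smartmontools and hdparm are installed; drive_check_cmds is a made-up helper name. The script only prints the commands.

```shell
#!/bin/sh
# Sketch only. DISK is an assumption: the whole-disk device behind sdc18.
DISK="${DISK:-/dev/sdc}"

# drive_check_cmds is a hypothetical helper that lists two read-only queries:
#   smartctl -i : print drive identity (model, serial number, firmware)
#   hdparm -W   : with no value given, report the write-caching setting
drive_check_cmds() {
    echo "smartctl -i $DISK"
    echo "hdparm -W $DISK"
}

# Print the commands for review; run them by hand as root.
drive_check_cmds
```

Neither command writes to the disk when used this way. The model and firmware strings are what the mailing list is likely to ask for.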

WD Caviar red 4TB

No problem for me. I will install Leap 15 now (I wanted to upgrade some time ago anyway) and will use ext4 again.
I have never, ever had problems with ext3/ext4 in 20 years. Now I have given btrfs a try, but this case shows me that it is
really not safe to use.
I have my personal system, Leap 15, installed on btrfs, but I will now migrate it to ext4 too.

Thanks for the answer.

OK, you convinced me to try the list.
I still have data that I need :frowning: