Btrfs System Slows to Crawl with Insatiable Drive Thrashing

I’d love to know what is causing this - I’ve about exhausted everything I can think of. I use btrfs on my notebook with an SSD drive and it is fantastic. On my desktop, it is so bad the system is hardly usable.

I’m highly suspicious of the WD WD10EARS 4K drive, though really I see nothing that would specifically be causing this issue.

OpenSuse 12.1, kernel 3.4.0 64-bit (kernel.org source compiled via SAKC), btrfsprogs-0.19-43.10.1.x86_64

When I originally installed OpenSuse 12.1 with a btrfs root there were no I/O performance issues. Over time, performance has gotten worse, to the point of the system becoming entirely unresponsive for minutes at a time.

At random times even a copy / paste will freeze the machine for upwards of a minute. There is no relation to any particular application - it happens with any application, or none at all (just on login).
Even under light use, I/O is wretched at best. Simple operations (opening kate, etc.) can take far too long. iotop will show the app using 99% of I/O - for minutes - only to open a blank document, etc.

For the last few days Dropbox starts to sync but never finishes - it just runs forever with meager I/O rates. (There is not even that much new data to sync; I don’t think it even gets through the initial indexing phase.)

Sometimes, after a bit of use, all the btrfs-endio, btrfs-flush, btrfs-transaction, etc. kernel threads finally end and you can work for a while. Rebooting does not help much.

Heaven forbid nepomuk, tracker-store, etc. try to run - it would take hours. (nepomuk is now disabled.)

Setting swappiness (on the theory that things like updatedb running via cron.daily would evict code pages in favor of disk cache, causing things to bog down when libraries are reloaded) did not help at all.
(Swap usage is typically very low at nearly all times.)

# free -m
             total       used       free     shared    buffers     cached
Mem:          3952       3500        452          0          0        771
-/+ buffers/cache:       2728       1223
Swap:         4094         26       4068

Snapper has been disabled - if snapper runs compdirs the system becomes completely stalled with disk I/O to the point of freezing completely.

There is no issue with the drive itself - no SMART errors, and hdparm -tT, iozone, and bonnie++ all show fine raw performance.


/dev/sdb:
 Timing cached reads:   16936 MB in  2.00 seconds = 8475.93 MB/sec
 Timing buffered disk reads: 318 MB in  3.01 seconds = 105.74 MB/sec
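For anyone wanting to verify the same thing, a quick SMART check along these lines (assuming smartmontools is installed; device name will vary) shows the health verdict and the sector-remapping counters that matter for a failing surface:

```shell
# Overall SMART health verdict (PASSED/FAILED)
smartctl -H /dev/sdb

# The attributes that matter most for a dying surface:
# Reallocated_Sector_Ct, Current_Pending_Sector, Offline_Uncorrectable
smartctl -A /dev/sdb | grep -iE 'reallocat|pending|uncorrect'
```

A long self-test (`smartctl -t long /dev/sdb`) is the closest thing to the vendor's "low level scan" without leaving Linux.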

This is a 4K drive (with 512-byte logical sector emulation) and the partitions are aligned (/dev/sdb3 is the root btrfs):

# fdisk -l /dev/sdb


Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0008916e


   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            2048     1028095      513024   83  Linux
/dev/sdb2         1028096     9414655     4193280   82  Linux swap / Solaris
/dev/sdb3         9414656  1743808511   867196928   83  Linux
/dev/sdb4      1743808512  1953523711   104857600   83  Linux
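For the record, "aligned" on a 512e drive just means each partition's start sector is divisible by 8 (8 x 512-byte logical sectors = one 4 KiB physical sector). A quick sanity check against the start sectors above:

```shell
# Start sectors from the fdisk output above; all should be 4K-aligned.
for start in 2048 1028096 9414656 1743808512; do
    if [ $((start % 8)) -eq 0 ]; then
        echo "$start: aligned"
    else
        echo "$start: NOT aligned"
    fi
done
# prints "aligned" for all four partitions
```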

# hdparm -I /dev/sdb
 
        Model Number:       WDC WD10EARS-00Y5B1                     
        Serial Number:      WD-WMAV51843412
        Firmware Revision:  80.00A80



No LVM volumes on this drive, and no encrypted partitions.

At this point it has gotten so bad that a re-install is looking pretty good - this poor i7 desktop runs like it's a 486. Maybe an update to 12.2 might fix it, but I’m pretty doubtful.

I would really like to know what is the real culprit here and would appreciate any insight others might be able to share.

Cheers,

It may be bad sectors. Do you see lots of disk activity when a slowdown happens?

Run a low-level disk scan. You should be able to get software from the disk maker’s site.

No problem with the disk itself - no SMART errors, no reallocated sectors, no poor performance when running benchmarks from a liveCD, etc. The drive is healthy, and its performance is verified by I/O benchmarks.

Further observation: I compiled a new 3.5.3 kernel and rebooted to runlevel 3 to compile the Nvidia driver. This took over 15 minutes - yikes (it normally takes 1-2 minutes). The compile stage itself only took a few minutes (<3), but the steps where it searches for existing libs, installs them, or looks for the Xorg config took painfully long. During this time there were no other processes running at all - I was in init 3, and top and iostat showed only the Nvidia install related processes.

Very strange indeed.

I know there are issues with 4K sector drives, but this is pretty extreme. And it has gradually gotten worse over time. (Partitions are aligned.)

I’m considering just dd’ing to another 1TB drive just to see, but I don’t really have another one handy.

Problem is there are few if any tools to fix a btrfs file system if it goes bad. IMHO it is not ready for prime time.

On 08/28/2012 05:26 PM, gogalthorp wrote:
>
> Problem is there are few if any tools to fix a btrfs file system if it
> goes bad. IMHO it is not ready for prime time.

I agree. That file system never gets used on any of my systems.

While I agree btrfs is not necessarily prime time, I also note I have no issues whatsoever on other systems with btrfs. In fact, I’ve found snapshots and other features to be very nice indeed.

I can appreciate “it’s my own fault” - but really, I’m just trying to use it as an opportunity to learn more, not only of btrfs, but of tools to diagnose such a problem. So far, it seems oddly difficult to ascertain the true cause and a generalization of “it’s just btrfs” is a bit less than I was hoping for.

With Fedora likely to use btrfs soon, and honestly others to follow not long after, it would be nice to understand it better.

If nothing else, I find it an intriguing issue - a puzzle to be solved.

If anyone else may have some thoughts, perhaps a route may present itself.

Thank you all for your time.

Sometimes the top command can show whether any I/O- or btrfs-related processes are waiting for CPU time. Perhaps when you notice this performance problem it might give you an idea where it is bottlenecking. Though personally I agree that it is not ready for prime time.

Thanks nightwishfan, I have monitored top, iotop and dstat. There the various btrfs-endio, btrfs-transaction, btrfs-endio-met and btrfs-flush threads show high I/O and sometimes high wait status, but that is all. The question then is yes, they are running (as they should be), but why, good grief, do they take 10 times as long to perform an operation as they should?
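For what it's worth, when a stall hits, a batch capture along these lines (assuming iotop and the sysstat package are installed) is easier to paste into a thread than reading the interactive screens:

```shell
# Log only processes actually doing I/O: batch mode, timestamps,
# five one-second samples, written to a file for later inspection
iotop -obt -n 5 > iotop.log

# Per-process disk read/write rates, five one-second samples
pidstat -d 1 5 > pidstat.log
```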

Perhaps this is related to fragmentation:

Answer : Horrible btrfs performance due to fragmentation

I’ll try running btrfs fi defragment -v -f on the subvolumes and see what that might do.
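If it helps anyone else trying the same thing, this is roughly what I have in mind. (Note the -r recursive flag only exists in newer btrfs-progs; on old versions you point defragment at each subvolume root or file individually. The autodefrag mount option, available since kernel 3.0, is another angle on the same problem.)

```shell
# Defragment the filesystem; -v is verbose, -r (newer btrfs-progs
# only) recurses into files rather than touching just the directory
btrfs filesystem defragment -v -r /

# Alternatively, let the kernel defragment incrementally from now on
mount -o remount,autodefrag /
```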

I’ve been running btrfs since 12.1 came out and it has been stable. The only problem I have had is a full volume due to snapshots. Have you checked the free space on the partitions? A full drive or over 90% can cause a lot of thrashing. I noticed you shut off snapper but old files can still exist in the snapshot sub volumes.
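Checking that is easy enough, and worth noting that plain df can mislead on btrfs, so the filesystem's own view of allocation is the one to trust:

```shell
# How much of each device btrfs has allocated
btrfs filesystem show /dev/sdb3

# Usage broken down into data / metadata / system chunks; a filesystem
# can behave "full" when metadata chunks are exhausted even though
# df still reports free space
btrfs filesystem df /
```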

Did you try the btrfs mailing list?
https://btrfs.wiki.kernel.org/index.php/Btrfs_mailing_list

I am trying to research my own problem but don’t know what a Btrfs is. My SUSE 12.1 system has taken to “insatiable drive thrashing” just like he says. It started about a week ago, a minute at a time; what I am working on does not seem to matter, and it seems to start at random.

Now I am writing this on Windoze because SUSE is still thrashing across the room, been doing that about 10 minutes. I can’t work with this.

Thanks if you can shed any light for an advanced beginner!

So SUSE has stopped thrashing and I am at a green SUSE login screen which I don’t think I have ever seen? It never asks me for a login. I typed my pw. Now I am back at the desktop?? Was that some kind of stealth upgrade? Disk activity has stopped.

So, I am subscribed to this thread and interested in any discussion. I have no reason to think it won’t happen again.

I agree - try running a low-level disk scan.


georgeinaction,

Find out if you are using btrfs by posting the output from the ‘mount |
grep btrfs’ command here. It should have something like this:


/dev/mapper/system-root on / type btrfs (rw,noatime,ssd,space_cache)
/dev/mapper/system-home on /home type btrfs (rw,noatime,ssd,space_cache)

If you get zero lines back from the command then you’re not using btrfs
and so this does not apply to you.

To add to what LewsTherinTelemon stated, I too am seeing nearly
identical issues on my laptop (soon to be my old laptop). It’s a Dell
E6410 with a non-SSD drive… probably around 200 GB. I was foolish
when I set it up and didn’t go for the partitioning recommendations of
OpenSuSE 12.1 (x86_64) and instead went with something more
traditional… /boot volume (tiny, ext2), swap (2GB) and / (everything
else). Also I did use LVM and did implement full-disk (except for
/boot) encryption during setup, so that was all working beautifully.
Despite this, my performance has steadily degraded from just fine to
horrible. My boot times are measured in minutes, my login times in
minutes, and basically I’ve come to the conclusion that my only sanity
comes from 8 GB of RAM caching my disk. Performance is awesome unless
the disk is accessed, and I’m worried that it’s because of the snapshots
taking place whenever zypper is used to do anything with software
(add/remove).

Like LewsTherinTelemon I have no errors anywhere, but top often shows
the various kernel btrfs processes going crazy. Immediately after an
install or removal of software my latest nemesis, compare_dirs, runs and
causes my hard drive light to stay active forever. I believe I even
tested once and did installs back-to-back to see if things improved, and
they did not. I’ve used OpenSUSE for a while with ext4 and SLED with XFS
and have never seen I/O behave this way; it was at least consistent from
the start. At one point I disabled the automatic time-based snapshots,
but I never disabled the YaST-triggered ones, and the damage is done. I
think I could probably go through and delete all the old snapshots and
see if that helps, but it would be a shame to lose an environment that
took nine months to cause this problem. I could probably find a system
to which I could ‘dd’ everything, but I’d rather do better
troubleshooting than simply rebuild and see if things are better.

Maybe the only way is to hook strace up to the compare_dirs or other
btrfs* processes, but that’s never fun.

Good luck (to all of us).

On 08/30/2012 02:46 AM, georgeinacton wrote:

> So, I am subscribed to this thread and interested in any discussion. I
> have no reason to think it won’t happen again.

no, there are no “stealth upgrades” in a default install…however, your
administrator (is that you?) may have set up YaST Online Update to
automatically check for and install upgrades without your intervention
being required…

but, it is unlikely that accidentally got set up by you…

whether you are using btrfs (it is a new type of file system, not as
stable as the older types) or not does not matter–it is highly unlikely
your ‘problem’ is exactly the same as the one in this thread…and, even
if it was we (here who try to help, often) prefer each user with a
problem begin his/her own thread–like, even the differences in your
hardware vs the OPs is enough to warrant YOUR own thread…

so, please begin your own…give info about your hardware/software
(like: amount of RAM, using RAID, LVM, other operating systems,
etc)…use a descriptive subject (you want to attract folks with
knowledge about your problem)…detail what you saw/see…


dd

Thank you. I get zero lines back so I am not using Btrfs (but I did read about it!)

So I have to start a new thread? My symptoms are the same.

Right now of course, it’s working perfectly!!!

Excellent advice - thank you. I will do that.

On Wed, 29 Aug 2012 00:46:02 GMT, LewsTherinTelemon
<LewsTherinTelemon@no-mx.forums.opensuse.org> wrote:

>
>While I agree btrfs is not necessarily prime time, I also note I have no
>issues whatsoever on other systems with btrfs.
>[snip]
>Thank you all for your time.

Do you have beagle or any of its relatives on your system? If so, try
deleting / disabling as much of it as you can without breaking your
system; it is fairly easy to get rid of most of it.

?-)

Hi ab,

That sounds very similar. I suggest completely disabling snapper - those compdir processes will ruin your day, as you have discovered. I just make a btrfs snapshot manually when I want one - just as good in my opinion, and the snapper restore is rsync-based, which makes no sense to me. I guess it works for restoring a single file, but for a full rollback that is insane - why not just mount the subvolid? Anyway . . .

You might consider adjusting swappiness too - it might help prevent pages getting thrown out when something like updatedb runs while you are away from the computer, leaving you the pleasure of waiting 10 minutes when you dare to come back and use it again while pages are reloaded. Ugh. This did seem to help me at least a bit.

# grep swap /etc/sysctl.conf
vm.swappiness = 10
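To try it without waiting for a reboot, the value can also be changed on the fly; the sysctl.conf entry above just makes it persistent:

```shell
# Apply immediately (lost at reboot; sysctl.conf makes it stick)
sysctl -w vm.swappiness=10

# Confirm the running value
cat /proc/sys/vm/swappiness
```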

I did do a defrag and this also seemed to help:

 time btrfs filesystem defrag /

I wish it reported statistics on what was defragged exactly.
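filefrag at least gives per-file extent counts, so you can spot-check fragmentation before and after a defrag (the paths here are just examples):

```shell
# Number of extents in a single file; badly fragmented files
# show large counts
filefrag /var/log/messages

# Rough fragmentation metric: total extents across a directory tree
find /var/log -type f -exec filefrag {} + |
    awk -F: '{ n += $2 + 0 } END { print n, "extents total" }'
```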

After that I attempted a rebalance. Even though this is a single device, after reading up on it I thought rewriting all the data might help.

 btrfs filesystem balance /

I would be very cautious of that one. Perhaps it was due to the already degraded state, but it ran for over 18 hours, hung up completely, and the system would thereafter hang very early in the boot process - even prior to init! Wow. Finally it was able to boot, and I killed the balance operation (which took a loooong time).

I need to use it some more to see how things are now after the defrag, and I’m going to delete all but one or two older snapshots, and then maybe try another defrag and if I’m feeling brave another rebalance.
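For the snapshot cleanup, the commands I plan to use are roughly these (the /.snapshots/42/snapshot path is only an example - snapper's layout on your system may differ, so list first and delete what you actually see):

```shell
# List all subvolumes/snapshots with their IDs and paths
btrfs subvolume list /

# Delete an old snapshot by path (example path; yours will differ)
btrfs subvolume delete /.snapshots/42/snapshot
```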

Things do seem better, so perhaps there is hope yet of having a decent system once again.


Not sure about LewsTherinTelemon (he wrote about nepomuk, so I assume he
knows about these things), but I do not have any indexing services on my
machines.

Good luck.