-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
For a while now I’ve been enjoying my upgrade to openSUSE 12.1 from
11.3, and my transition to KDE from GNOME. Quirks certainly exist, but
for the most part it has been good. The biggest apprehension overall
was the switch from my trusted filesystems (XFS/ext4) to the new wonder
known as btrfs. As far as I know I have not lost any data, though I’ve
been pretty crazy about backups just in case something went wrong.
For the past couple of weeks the biggest concern I’ve had has been
related to my lousy hard drive. I have a Latitude E6410 which has been
a decent laptop for the most part, but as it is a laptop with a spinning
drive the disk I/O is basically terrible. I can’t fix that, and I’ve
had this same laptop for a couple of years running 11.3 without much
pain, so the hardware alone probably does not explain what I’m seeing
now. So what am I
seeing? Basically, I suspect my disk is killing me and my ability to
work on the system. Simple things like changing workspaces/desktops
stall when this happens. Changing tabs in browsers also slows down, as
does switching between applications. Typically when this happens my hard
drive light is steadily on. I run ‘top’ 24x7 in a shell so I can catch
pestilences as quickly as possible and my processor is not usually being
worked hard at this time (Core i7 something or another, 8 GB RAM, not
usually pushing these resources overly hard). In the same display,
though, the %wa (waiting on I/O I believe) is high at these times.
Normally this is sitting at 0.0, maybe jumping up into the low
percentages from time to time. It bounces around as I do I/O-intensive
stuff, of course, but it seems to be happening a lot more now when I do
not expect it.
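For anybody who wants a number rather than eyeballing ‘top’, here is a
minimal sketch that computes the same kind of I/O-wait percentage from
two samples of the aggregate "cpu" line in /proc/stat. The function
names (iowait_pct, sample_iowait) and the five-second default interval
are my own choices for illustration, and it assumes the usual Linux
field order where the fifth counter after "cpu" is iowait.

```shell
#!/bin/sh
# iowait_pct BEFORE AFTER
# Each argument is one aggregate "cpu ..." line from /proc/stat.
# Prints the share of elapsed CPU time spent waiting on I/O, in percent.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | LC_ALL=C awk '
        {
            total = 0
            # Fields 2..9 are user, nice, system, idle, iowait, irq,
            # softirq, steal. Guest counters (if present) are skipped
            # because they are already included in user/nice.
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            wait = $6                       # field 6 = iowait ticks
            if (NR == 1) { t0 = total; w0 = wait }
            else         { t1 = total; w1 = wait }
        }
        END {
            dt = t1 - t0
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (w1 - w0) / dt
        }'
}

# sample_iowait [INTERVAL]
# Reads the live counters INTERVAL seconds apart (default 5, an
# arbitrary choice) and prints the I/O-wait percentage in between.
sample_iowait() {
    interval=${1:-5}
    before=$(grep '^cpu ' /proc/stat)
    sleep "$interval"
    after=$(grep '^cpu ' /proc/stat)
    iowait_pct "$before" "$after"
}
```

Sourcing this and running "sample_iowait 10" during a stall should give
a figure comparable to the %wa column, averaged over ten seconds.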
This morning @ 0800 I was working along and suddenly everything went
into super-sluggish mode. The hard drive light was on (for ten to
fifteen MINUTES), and snapper was running, which surprised me a little.
I had, about five minutes earlier, installed a new package (updated
wireshark from an OBS repository). The process list showed snapper had
been started from cron, which was interesting. The snapper process had
started @ 0800 on its own which was pretty terrible timing for me since
it’s in the middle of my workday. Poking around in cron I found the
following files:
/etc/cron.daily/suse.de-snapper
/etc/cron.hourly/suse.de-snapper
Investigating further it appears that snapper is trying to do some
optimization regularly. The daily cron job tries to do three types of
jobs (depending on the snapper configuration per btrfs volume, which I
have not tuned/touched at all so far) including ‘NUMBER_CLEANUP’,
‘TIMELINE_CLEANUP’, and ‘EMPTY_PRE_POST_CLEANUP’. The hourly job only
tries to do the ‘TIMELINE_CLEANUP’. After seeing this I decided to see
what kind of snapshots I had since I have not created any manually:
sudo snapper list
It turns out I have quite a few. It’s exciting to know that YaST in
particular is taking snapshots of stuff as I use it which could be great
for rolling back stuff when I really screw up. On the other hand, I
suspect that the cleanup being done to remove empty snapshots is causing
my system to behave badly. One of the processes I regularly see when my
box is sluggish is named compare-dirs. Until now I did not know for sure
whether it was some kind of desktop search indexing or related to
btrfs, but looking now it resides in /usr/lib/snapper/bin/compare-dirs
which makes that clear. I suspect that part of the cleanup being done
involves compare-dirs going through and looking for things that may or
may not have changed and then doing some action based on that, but with
my laptop’s lame hard drive this is impacting my regular work. Running
the 3.x kernel I kind of hoped doing simultaneous tasks would behave a
little better, but that brings me to my point (finally!).
Do I need/want snapper working as often as it does? Sure, more
snapshots could mean more-granular restores when I break things, but I
am usually interested in performance more than the possibility of
granular restores (I back up aggressively on my own). Does anybody know
which of
these snapper operations is safest to disable and for which volumes?
Anybody with experience tuning these jobs? Any practices I can adopt to
minimize the impact of snapshots in general?
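Before changing anything it seems worth seeing what each snapper config
currently enables. Below is a rough sketch that prints the three cleanup
settings named above for every config file. I am assuming the configs
live under /etc/snapper/configs/ and are plain KEY="value" files. The
function name show_cleanup_settings is made up, and the directory can be
passed as an argument in case it is somewhere else.

```shell
#!/bin/sh
# show_cleanup_settings [CONFIG_DIR]
# Prints the cleanup-related settings found in each snapper config file.
# CONFIG_DIR defaults to /etc/snapper/configs, which is an assumption
# about where the configs are stored.
show_cleanup_settings() {
    dir=${1:-/etc/snapper/configs}
    for cfg in "$dir"/*; do
        [ -f "$cfg" ] || continue           # skip if the glob matched nothing
        echo "== $(basename "$cfg")"
        # Only the three cleanup switches mentioned in the cron scripts.
        grep -E '^(NUMBER_CLEANUP|TIMELINE_CLEANUP|EMPTY_PRE_POST_CLEANUP)=' "$cfg" \
            || echo "(no cleanup settings found)"
    done
}
```

Running "show_cleanup_settings" with no argument reads the default
directory. It only reports and does not change any setting.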
As a last note, when I set up my machine I did not partition /home
separately from the rest of the box (I think I made an error in that…
usually I’ve wanted all of my disk space to be available to me in both
/home as well as the rest of the filesystem so I just kept them
together; it appears btrfs can maybe handle this differently which
requires a bit of rethinking on my part). I suspect that this means my
snapshots are bigger than they need to be due to all of the changes in
my home directory (the only thing I care about in a disaster recovery
situation and which I already back up).
Any input is appreciated.
AB
P.S. Just noticed the “System stalling” thread from a couple minutes
ago… possibly related.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iQIcBAEBAgAGBQJPddR7AAoJEF+XTK08PnB5At0QAJZB7MrHaTEgaEYmq6S3wQmP
uq9GyLfjNIAZKUfwTg4v7ysJyg0Ykt2bi16GapKhgQ5QDabQ52kJEAH3QjcHW1Ac
nvHujzsbozE/oMyQlLwIOnw5V4QVF+T3FZZCFv1v2OflSczJ/29IZz8HTZlVKhZu
ZBv8iFLtEwKnqDw5wc5AT8MwbvJmptv6YL6ygnFFW9AGt7M3HWe4OqFpyK2cGgD7
nC199um5eq/GizHQHuxE5pONJbS9F0Fz7J38HRoQyVbft4TG70O2y3x5d4zfsa1z
2g1Ak8rgrBfWc4ZdYaOvQtPCzCb6vXTOXgzk1Oz5TmPe6EVnAIsDtGm3u5G4s+8j
ykglKbQBIUnRLt7Ov/puSMCgk4n2ROLy0U2cX8/CpQnfXXHgAHdoH63upCPho3ra
VVYgTYdzySaJuFdrFIY/rlNYbAyciZtc2cJ31xMALhMfalxbMY8BA7OMQDMsvvqG
CN+qa9m7hD4MtZkU9UkJ1dmIMLhLCZV/oQZgGStQUFBSJ8eUgXQ9IU3wPfD1TEmk
p320jhjkjeBrTdOf5Rdbf5re/+7RG6O32u75DJe3b13+vGSkr7dEDO9SdvnfFVNi
bG0YDpAVpjj8A1cyDz0PwRVXXsqcUfp7s6Mxqlvv8jxjRSUeD7Pum4EExt5QHzn+
4ly4umhNjwGjZlGltGwT
=Zi6X
-----END PGP SIGNATURE-----