Squeky machine due to disk i/o

I thought I had a squeaky fan, but it seems to be disk i/o that happens every few seconds. Once I decided it was disk access, I looked for something to show what was actually doing the i/o. Results:


dhcppc0:/home/john/DVB # iotop -o -b -d 5
Total DISK READ :       0.00 B/s | Total DISK WRITE :       0.00 B/s
Actual DISK READ:       0.00 B/s | Actual DISK WRITE:       0.00 B/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
Total DISK READ :       0.00 B/s | Total DISK WRITE :       6.37 K/s
Actual DISK READ:       0.00 B/s | Actual DISK WRITE:      13.74 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
 1053 be/3 root        0.00 B/s    6.37 K/s  0.00 %  0.65 % [jbd2/md0-8]
28668 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.36 % [kworker/0:3]
26217 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.31 % [kworker/0:5]
30520 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.26 % [kworker/0:7]
28264 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.26 % [kworker/0:4]
23015 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.26 % [kworker/0:0]
24790 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.26 % [kworker/0:6]
30972 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kworker/1:2]
Total DISK READ :       0.00 B/s | Total DISK WRITE :       0.00 B/s
Actual DISK READ:       0.00 B/s | Actual DISK WRITE:       7.34 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
20173 be/4 john        0.00 B/s    0.00 B/s  0.00 %  1.21 % firefox [mozStorage #1]
 1053 be/3 root        0.00 B/s    0.00 B/s  0.00 %  0.65 % [jbd2/md0-8]
30520 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.56 % [kworker/0:7]
28668 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.46 % [kworker/0:3]
28264 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.41 % [kworker/0:4]
24790 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.36 % [kworker/0:6]
26217 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.35 % [kworker/0:5]
23015 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.31 % [kworker/0:0]
30972 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kworker/1:2]
^Cdhcppc0:/home/john/DVB # 

There was also a suggestion to look at this:


dhcppc0:/home/john/DVB # cat /proc/vmstat | grep -E "dirty|writeback"
nr_dirty 196
nr_writeback 0
nr_writeback_temp 0
nr_dirty_threshold 994711
nr_dirty_background_threshold 496748

Is that good, bad or indifferent? I’m not sure which partition these writes are going to. Anyone know? They may be going to a disk that doesn’t have a write cache.

One thing is for sure: it’s an irritating noise.
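For anyone reading those counters: the nr_* values in /proc/vmstat are page counts, not bytes. A minimal sketch of the conversion, assuming the usual 4096 byte page size (`getconf PAGESIZE` reports the real one); the helper name `pages_to_kib` is made up for illustration.

```shell
# Convert a /proc/vmstat page count to KiB.
# Usage: pages_to_kib PAGES [PAGE_SIZE_BYTES]
pages_to_kib() {
    # 4096 is an assumed default page size, not read from the system
    echo $(( $1 * ${2:-4096} / 1024 ))
}

pages_to_kib 196 4096       # nr_dirty from the output above
pages_to_kib 994711 4096    # nr_dirty_threshold from the output above
```

With those figures, only about 784 KiB was waiting to be written, against a threshold of roughly 3.8 GiB. For live values, something like `pages_to_kib "$(awk '$1 == "nr_dirty" { print $2 }' /proc/vmstat)" "$(getconf PAGESIZE)"` should work.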

John

If your disk is making unusual noises, it’s time to install and run smart-tools to analyze your drive.

TSU

The actual package I’m referring to is smartmontools; the following will install it:

zypper in smartmontools

TSU

That’s why I asked which partitions these are likely to be written to. I’d still like to know.

I have system files and apps on an SSD, apart from swap, /var and /tmp, which are on a rather odd disk HP supplied with my machine. SMART shows no errors; however, the disk seems to achieve this in a rather odd fashion, by using parity recovery, so it reports no actual errors. It’s also set to not use a cache, or doesn’t have one.

Squeaky might not have been the right word to use even if my spelling was correct. It’s a disk access noise that could be mistaken for a squeaky fan etc running at low speed.

The other drives on my system are brand new and just contain my /home directory.

The period of the “squeaks” seems to be down to this.


dhcppc0:/home/john # sysctl -a | grep dirty                                                                                                            
    <snip> ipv6 related
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500
vm.dirtytime_expire_seconds = 43200
vm.pagecache_limit_ignore_dirty = 1

In other words, the kernel wakes up every 5 secs to see if there is anything to do. My machine does seem to block at times, mostly noticeable in firefox, where I find I am typing ahead at times, with this sort of level of memory usage:


dhcppc0:/home/john # free
             total       used       free     shared    buffers     cached
Mem:      24616728    9787436   14829292     145848    1000860    3796140
-/+ buffers/cache:    4990436   19626292
Swap:     10481660          0   10481660

:'( I don’t understand this area at all, but the other related output is this one:


dhcppc0:/home/john # cat /proc/vmstat | egrep "dirty|writeback"
nr_dirty 315
nr_writeback 0
nr_writeback_temp 0
nr_dirty_threshold 854970
nr_dirty_background_threshold 426963

So my machine is accessing a disk every 5 secs but despite that will block at times. I’ve also noticed a change in memory usage. Now the only way to get back to circa 2.2 GB is to reboot. This seems to relate to a recent upgrade. A log in and log out of the desktop drops it somewhat but not by much. I’m currently running vlc, firefox, 1 dolphin and 1 console. Firefox on its own will push things up to 5 or 6 GB, which also seems to be a recent change.

John

I would like to know if these are the usual default settings for Leap 42.2, in case setting up to run without swap changed them and adding swap back didn’t put them back as they should be.


dhcppc0:/home/john # sysctl -a | grep dirty
sysctl: reading key "net.ipv6.conf.all.stable_secret"
sysctl: reading key "net.ipv6.conf.default.stable_secret"
sysctl: reading key "net.ipv6.conf.eth0.stable_secret"
sysctl: reading key "net.ipv6.conf.lo.stable_secret"
sysctl: reading key "net.ipv6.conf.wlan0.stable_secret"
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500
vm.dirtytime_expire_seconds = 43200
vm.pagecache_limit_ignore_dirty = 1


I’d also like to know what pagecache_limit_ignore_dirty = 1 does, and whether there are any alternative settings. I haven’t managed to find any info on that one on the web at all.

John

The answer to that is here

http://www.it165.net/uploadfile/2013/0124/20130124083245144.txt

and, as the page memory limit mentioned there is set to zero, it doesn’t do anything. Does anyone know when this facility can be useful?

John

So, from what you have described, only swap, /var and /tmp are on a HDD which is suspected of being the cause of some disturbing noises.

The system (/, /usr, /etc, /bin, /lib, /opt, and so on) is on an SSD. It’s not very likely that an SSD could be the cause of any noises at all.

Your home directories are on new drives, presumably HDDs, and are not really suspected due to their newness, unless of course you have a “Monday drive” in there somewhere, in other words a DOA (Dead On Arrival) . . .

What you could try is to move everything that’s on the suspect HDD to some spare space on one of the /home drives, and then see if the disturbing noise disappears.

I should have called this thread “why is one of my disks being accessed every 5 secs”. Following a lot of web searches I posted the reason earlier. It’s this. The disk is noisy when accesses are very short, and the noise could be mistaken for a slow speed squeaky fan.


dhcppc0:/home/john # sysctl -a | grep dirty
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500
vm.dirtytime_expire_seconds = 43200
vm.pagecache_limit_ignore_dirty = 1

It’s down to the dirty writeback setting, which in practical terms, given usual use with 24 GB of memory, doesn’t do anything most of the time.
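For reference, the two `_centisecs` settings are in hundredths of a second. A small sketch that turns them into seconds; `centisecs_report` is a made-up helper name, and the input lines are copied from the sysctl output above.

```shell
# Read sysctl-style "key = value" lines on stdin and print
# every *_centisecs setting converted to seconds.
centisecs_report() {
    awk -F' = ' '/_centisecs/ { printf "%s: %.1f s\n", $1, $2 / 100 }'
}

centisecs_report <<'EOF'
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500
EOF
```

So the flusher wakes every 5 seconds, and dirty data becomes eligible for writing out once it is 30 seconds old. On a live system, `sysctl -a 2>/dev/null | centisecs_report` should give the same kind of listing.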

It’s turned out to be an interesting area, as the i/o on my machine has blocked a few times. It also did this on 12.3. In that case a log out and back in cured it. The usual sign on 12.3 was typing ahead, with significant delays before the letters came up, always in a browser. The same thing has happened once so far on Leap. This quote from the Arch wiki may explain why this happens.

Virtual memory

There are several key parameters to tune the operation of the virtual memory (VM) subsystem of the Linux kernel and the write out of dirty data to disk. See the official Linux kernel documentation for more information. For example:

  • vm.dirty_ratio = 3

Contains, as a percentage of total available memory that contains free pages and reclaimable pages, the number of pages at which a process which is generating disk writes will itself start writing out dirty data.

  • vm.dirty_background_ratio = 2

Contains, as a percentage of total available memory that contains free pages and reclaimable pages, the number of pages at which the background kernel flusher threads will start writing out dirty data. As noted in the comments for the parameters, one needs to consider the total amount of RAM when setting these values. For example, simplifying by taking the installed system RAM instead of available memory:

  • Consensus is that setting vm.dirty_ratio to 10% of RAM is a sane value if RAM is say 1 GB (so 10% is 100 MB). But if the machine has much more RAM, say 16 GB (10% is 1.6 GB), the percentage may be out of proportion as it becomes several seconds of writeback on spinning disks. A more sane value in this case is 3 (3% of 16 GB is approximately 491 MB).
  • Similarly, setting vm.dirty_background_ratio to 5 may be just fine for small memory values, but again, consider and adjust accordingly for the amount of RAM on a particular system.

Another parameter is:

  • vm.vfs_cache_pressure = 60

The value controls the tendency of the kernel to reclaim the memory which is used for caching of directory and inode objects (VFS cache). Lowering it from the default value of 100 makes the kernel less inclined to reclaim VFS cache (do not set it to 0, this may produce out-of-memory conditions).

Note that Leap sets 20%, not the 10% they mention, so Leap’s defaults may be even more geared to low memory systems. I have seen suggestions that the best setting is one that is a bit less than the disk’s cache, but that seems a bit extreme. So far I haven’t seen any suggestions on how often the need to write dirty caches should be checked. I would have thought the limits would play a part, but it could be that these are currently checked every 5 secs.
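To put numbers on those percentages for this machine, here is a rough sketch using the same simplification as the wiki, i.e. total RAM rather than available memory. The kernel works from free plus reclaimable pages, which is why the nr_dirty_threshold shown earlier is lower. `ratio_to_kib` is a made-up helper name.

```shell
# Approximate dirty-page limit in KiB for a RAM size and a percentage.
# Usage: ratio_to_kib TOTAL_KIB PERCENT
ratio_to_kib() {
    echo $(( $1 * $2 / 100 ))
}

total_kib=24616728              # "Mem: total" from the free output earlier

ratio_to_kib "$total_kib" 20    # vm.dirty_ratio as currently set
ratio_to_kib "$total_kib" 10    # vm.dirty_background_ratio as currently set
ratio_to_kib "$total_kib" 3     # the dirty_ratio value from the wiki example
```

On these figures 20% is about 4.7 GiB and 3% is about 720 MiB. A value can be tried until the next reboot with something like `sysctl -w vm.dirty_ratio=3` as root.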

:\ Leaves me wondering what it is writing to disk.

As also mentioned, my current settings may be down to having run the machine without a swap partition for a while, and then not being corrected when I added one back again. I was just checking that I could do this before replacing this very disk, actually. There is nothing wrong with it, but it’s best to change it before there is, and the extra space on a new drive will be useful.

John

:beat-up: I decided to look to see what was being written. Found that I could only look at sector counts, and then noticed that the writes are actually to my brand new /home raid array. The fact that they are every 5 secs, matching dirty cache timing, seems to be a fluke. :slight_smile: Not time wasted though, as it’s interesting.
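One place to get sector counts from is /proc/diskstats, where the tenth field is sectors written, counted in 512 byte units. A minimal sketch, with a made-up helper name `written_kib` and an invented sample line standing in for real output:

```shell
# Read /proc/diskstats-style lines on stdin and print the KiB written
# so far for one device (field 10 = sectors written, 512 bytes each).
# Usage: written_kib DEVICE < /proc/diskstats
written_kib() {
    awk -v dev="$1" '$3 == dev { print $10 * 512 / 1024 }'
}

# Illustrative sample line only, not taken from the machine in this thread.
written_kib md0 <<'EOF'
   9       0 md0 1200 0 9600 0 350 0 2048 0 0 0 0
EOF
```

Running `written_kib md0 < /proc/diskstats` twice, a few seconds apart, and comparing the two figures shows how much went to that device in between.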

Anyway, looking a bit further, the culprit seems to be called jbd2/md0-8. A search on the web suggests 2 reasons. I used ext4 for the format, and it seems that this is a partial format that is completed while the system is up and running, so it stops eventually. Two 1 TB disks in raid 1 should have finished by now. The disks took ages to sync when the raid was built.

The other reason mentioned is the need for noatime on the mount in fstab to get rid of the problem, again down to using ext4.

Can someone shed some light on this for me? I’m not even sure if noatime just needs adding to the end of the line in fstab, or whether something else needs changing.

The current mount is


/dev/md0             /home               ext4       acl,user_xattr        1 2

I’ve used ext4 as I always import my partitioning when I install. My /home has been on a raid for a long time, always set up by YaST, and I have never noticed this before. The only change is the type of disk: Western Digital Red rather than the 10k enterprise SATA drives I used previously.

John

Add noatime, comma-separated, to the rest of the options:

acl,user_xattr,noatime

I had to check, but user_xattr does not appear to be used with ext4, just ext2 and ext3. In any case it is not needed.

Check man mount.
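As a sketch of what the edited fstab entry would look like, the snippet below appends an option to the fourth field of the line quoted earlier. `add_mount_opt` is a made-up helper and nothing here touches /etc/fstab itself.

```shell
# Append a mount option to the options field (field 4) of an fstab line
# read on stdin. Runs of whitespace collapse to single spaces, which
# fstab accepts.
# Usage: add_mount_opt OPTION
add_mount_opt() {
    awk -v opt="$1" '{ $4 = $4 "," opt; print }'
}

echo '/dev/md0             /home               ext4       acl,user_xattr        1 2' |
    add_mount_opt noatime
```

After editing the real file, `mount -o remount,noatime /home` as root should apply the option without a reboot.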

I read it as being backward compatible with ext2 and 3. :wink: I’m not inclined to change what YaST does either, so I think I’ll leave it in.

Adding noatime has changed things, but it still happens. Yet another google brought up a debate on the problem, as it can stop disks spinning down. The best posts I could find are here:

https://bbs.archlinux.org/viewtopic.php?id=113516

The fix of adding the commit mount option and adjusting power down timing (if I have understood correctly) seems like a bit of a kludge to me. The other one, using tmpfs, is interesting but not that compatible with the way current desktops keep more of their settings and app data in user space. So any time I use firefox or change something it has to be journalled. This is a log of what it’s doing now, which is an improvement.


dhcppc0:/home/john # iotop -obtqqq | grep jbd2
10:34:26  1062 be/3 root        0.00 B/s    7.68 K/s  0.00 %  3.59 % [jbd2/sdb2-8]
10:36:12  1047 be/3 root        0.00 B/s   31.06 K/s  0.00 %  3.58 % [jbd2/md0-8]
10:36:20  1047 be/3 root        0.00 B/s    7.76 K/s  0.00 %  3.59 % [jbd2/md0-8]
10:36:37  1047 be/3 root        0.00 B/s    0.00 B/s  0.00 %  3.69 % [jbd2/md0-8]
10:36:52  1047 be/3 root        0.00 B/s   11.66 K/s  0.00 %  3.72 % [jbd2/md0-8]
10:37:03  1047 be/3 root        0.00 B/s   31.17 K/s  0.00 %  3.77 % [jbd2/md0-8]
10:37:38  1047 be/3 root        0.00 B/s   11.52 K/s  0.00 %  3.73 % [jbd2/md0-8]
10:37:47  1047 be/3 root        0.00 B/s   82.16 K/s  0.00 % 14.58 % [jbd2/md0-8]
10:37:48  1047 be/3 root        0.00 B/s    7.85 K/s  0.00 % 14.48 % [jbd2/md0-8]
10:37:55  1047 be/3 root        0.00 B/s    7.74 K/s  0.00 %  3.80 % [jbd2/md0-8]
10:38:05  1047 be/3 root        0.00 B/s   11.65 K/s  0.00 %  3.81 % [jbd2/md0-8]
10:38:19  1047 be/3 root        0.00 B/s  137.05 K/s  0.00 % 26.20 % [jbd2/md0-8]
10:38:25  1047 be/3 root        0.00 B/s  135.57 K/s  0.00 % 10.53 % [jbd2/md0-8]
10:38:36  1047 be/3 root        0.00 B/s   11.71 K/s  0.00 %  3.38 % [jbd2/md0-8]
10:38:48  1062 be/3 root        0.00 B/s    3.93 K/s  0.00 %  4.98 % [jbd2/sdb2-8]
10:38:54  1062 be/3 root        0.00 B/s   39.24 K/s  0.00 %  1.05 % [jbd2/sdb2-8]
10:38:59  1047 be/3 root        0.00 B/s   39.16 K/s  0.00 %  3.98 % [jbd2/md0-8]
10:39:11  1047 be/3 root        0.00 B/s   19.26 K/s  0.00 %  3.46 % [jbd2/md0-8]
10:39:23  1062 be/3 root        0.00 B/s   15.64 K/s  0.00 %  4.95 % [jbd2/sdb2-8]
10:39:29  1062 be/3 root        0.00 B/s    0.00 B/s  0.00 %  0.53 % [jbd2/sdb2-8]
10:39:45  1047 be/3 root        0.00 B/s   11.75 K/s  0.00 %  3.94 % [jbd2/md0-8]
10:40:06   526 be/3 root        0.00 B/s    3.88 K/s  0.00 %  0.65 % [jbd2/sda2-8]
10:40:09  1047 be/3 root        0.00 B/s   19.41 K/s  0.00 %  3.64 % [jbd2/md0-8]
10:40:40  1047 be/3 root        0.00 B/s   11.72 K/s  0.00 %  3.74 % [jbd2/md0-8]
10:41:49  1047 be/3 root        0.00 B/s   31.17 K/s  0.00 %  3.76 % [jbd2/md0-8]
10:42:04  1047 be/3 root        0.00 B/s    0.00 B/s  0.00 %  3.76 % [jbd2/md0-8]
10:42:32  1047 be/3 root        0.00 B/s   30.93 K/s  0.00 %  3.24 % [jbd2/md0-8]
10:42:47  1047 be/3 root        0.00 B/s    0.00 B/s  0.00 %  3.32 % [jbd2/md0-8]
^C

:'( This may be being journalled as I type it. md0 is my /home; the raid is sdc and sdd. sdb2 is /var and sda2 is / on a flash drive. As sda2 is journalled, some irritating piece of code must be writing to it, but not often. Does anyone know what that might be?

md0 sees the most changes so that will be the one that gets a dose of jbd2 most often.
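A quick way to check which journal is busiest is to count log lines per device. This sketch takes the last column of an `iotop -obtqqq` log on stdin; `jbd2_counts` is a made-up name and the sample is six lines lifted from the log above.

```shell
# Count iotop log lines per jbd2 thread (last column), busiest first.
jbd2_counts() {
    awk '/jbd2/ { n[$NF]++ } END { for (d in n) print n[d], d }' | sort -rn
}

jbd2_counts <<'EOF'
10:34:26  1062 be/3 root        0.00 B/s    7.68 K/s  0.00 %  3.59 % [jbd2/sdb2-8]
10:36:12  1047 be/3 root        0.00 B/s   31.06 K/s  0.00 %  3.58 % [jbd2/md0-8]
10:36:20  1047 be/3 root        0.00 B/s    7.76 K/s  0.00 %  3.59 % [jbd2/md0-8]
10:36:37  1047 be/3 root        0.00 B/s    0.00 B/s  0.00 %  3.69 % [jbd2/md0-8]
10:38:48  1062 be/3 root        0.00 B/s    3.93 K/s  0.00 %  4.98 % [jbd2/sdb2-8]
10:40:06   526 be/3 root        0.00 B/s    3.88 K/s  0.00 %  0.65 % [jbd2/sda2-8]
EOF
```

Piping a saved log through the same function gives the totals for a whole session.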

All drives and partitions are ext4 apart from boot. :slight_smile: I don’t suppose I can reformat without losing data? On the other hand, maybe it’s best to just put up with it, as I don’t allow my disks to spin down and maybe that aspect has been fixed now.

Any informative views ? I could move home around and reformat the raid ? My concern is that this area will reduce disk life. Actually I think it has on my previous raid.

John