6min 36.208s fstrim.service ?!

Hello,
I currently have a dual-boot LEAP 15.1/W10 laptop, and recently I had to re-install W10 without touching openSUSE. This is because I cloned the boot drive from a smaller SSD to a larger one a couple of times, and one of those clones was made while my RAM module was failing.

W10 had become unstable, so I had to re-install everything, and that went smoothly.

Something else I am noticing is that on the LEAP 15.1 side, boot is rather slow. My root is on the SSD, and /home is on a platter HDD (cloned once from a near-failing HDD to the current one). Just now, my LEAP took forever to boot and I decided to investigate what’s going on.

The following came to my attention:


**#** systemd-analyze blame
    6min 36.208s fstrim.service
         26.516s backup-rpmdb.service
          7.473s btrfsmaintenance-refresh.service
          6.697s vboxdrv.service
          6.433s logrotate.service
          6.430s mandb.service
          5.941s dracut-initqueue.service
          2.606s mnt-Shared_Data.mount
          1.031s postfix.service
           772ms display-manager.service

6 minutes for fstrim?! Is this normal?
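
For picking the slow units out of a long blame listing, a small filter can help. The following is only a sketch: `slow_units` is a made-up helper name, and it assumes the duration tokens look like the ones in the output above (`h`, `min`, `s`, `ms`, `us`).

```shell
#!/bin/sh
# slow_units LIMIT: read `systemd-analyze blame` output on stdin and print
# "<seconds> <unit>" for every unit that took longer than LIMIT seconds.
# slow_units is a hypothetical helper name, not part of systemd.
slow_units() {
    awk -v limit="$1" '
    NF >= 2 {
        total = 0
        # Every field except the last is a duration token (6min, 36.208s, 772ms).
        for (i = 1; i < NF; i++) {
            tok = $i
            if      (tok ~ /us$/)  { sub(/us$/, "", tok);  total += tok / 1000000 }
            else if (tok ~ /ms$/)  { sub(/ms$/, "", tok);  total += tok / 1000 }
            else if (tok ~ /min$/) { sub(/min$/, "", tok); total += tok * 60 }
            else if (tok ~ /h$/)   { sub(/h$/, "", tok);   total += tok * 3600 }
            else if (tok ~ /s$/)   { sub(/s$/, "", tok);   total += tok }
        }
        if (total > limit) printf "%.3f %s\n", total, $NF
    }'
}

# Usage: systemd-analyze blame | slow_units 60
```

Keep in mind that blame reports how long each unit took to start; units can start in parallel, so a large value does not by itself prove that the unit delayed the login screen.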

For additional information, none of my partitions are btrfs. They are all ext4 or ntfs.

**#** systemctl list-timers  
NEXT                          LEFT          LAST                          PASSED             UNIT                         ACTIVATES
Mon 2020-07-27 19:00:00 CEST  40min left    n/a                           n/a                snapper-timeline.timer       snapper-timeline.service
Tue 2020-07-28 00:00:00 CEST  5h 40min left Mon 2020-07-27 18:01:09 CEST  18min ago          logrotate.timer              logrotate.service
Tue 2020-07-28 00:00:00 CEST  5h 40min left Mon 2020-07-27 18:01:09 CEST  18min ago          mandb.timer                  mandb.service
Tue 2020-07-28 01:11:08 CEST  6h left       Mon 2020-07-27 18:01:16 CEST  18min ago          backup-sysconfig.timer       backup-sysconfig.service
Tue 2020-07-28 01:40:13 CEST  7h left       Mon 2020-07-27 18:01:09 CEST  18min ago          check-battery.timer          check-battery.service
Tue 2020-07-28 01:58:38 CEST  7h left       Mon 2020-07-27 18:01:09 CEST  18min ago          backup-rpmdb.timer           backup-rpmdb.service
Tue 2020-07-28 18:16:29 CEST  23h left      Mon 2020-07-27 18:16:29 CEST  2min 49s ago       systemd-tmpfiles-clean.timer systemd-tmpfiles-clean.service
Sat 2020-08-01 00:00:00 CEST  4 days left   Mon 2020-07-27 18:01:09 CEST  18min ago          btrfs-balance.timer          btrfs-balance.service
Sat 2020-08-01 00:00:00 CEST  4 days left   Wed 2020-07-01 16:06:47 CEST  3 weeks 5 days ago btrfs-scrub.timer            btrfs-scrub.service
Mon 2020-08-03 00:00:00 CEST  6 days left   Mon 2020-07-27 18:01:09 CEST  18min ago          fstrim.timer                 fstrim.service
n/a                           n/a           n/a                           n/a                snapper-cleanup.timer        snapper-cleanup.service

Could someone let me know how I can remove btrfs-scrub and btrfs-balance from the schedule while we’re at it? Both are disabled and inactive according to systemd, but the timer list says they are in use… somehow.

Also, my fstab looks as follows:

**#** cat /etc/fstab
UUID=df520a3e-1837-4ebb-8535-c9ca32504fc5  /                 ext4  acl,user_xattr               0  1
UUID=cb2f0bc6-9c49-42a9-a797-072702eb62c1  /home             ext4  data=ordered,acl,user_xattr  0  2
UUID=3216-F39E                             /boot/efi         vfat  defaults                     0  0
UUID=B0B4915CB491263E                      /mnt/Shared_Data  ntfs  defaults                     0  0

Can someone help me completely disable btrfs related maintenance on my LEAP 15.1 installation? I hope that’s the only cause of the slow boot.

Thanks!
-SJL

FYI, I did look at https://en.opensuse.org/SDB:Disable_btrfsmaintenance


/etc/sysconfig/btrfsmaintenance
BTRFS_BALANCE_PERIOD="none"
BTRFS_SCRUB_PERIOD="none"

worked perfectly but I am wondering if I can do this from systemctl.
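
If the same sysconfig change has to be scripted, something along these lines might do. It is a minimal sketch: `set_period_none` is a hypothetical helper name, and it assumes the two keys appear at the start of a line in `KEY="value"` form, as in the file shown above.

```shell
#!/bin/sh
# set_period_none FILE: set both btrfsmaintenance periods to "none" in FILE.
# set_period_none is a hypothetical helper; FILE would normally be
# /etc/sysconfig/btrfsmaintenance, and editing that file needs root.
set_period_none() {
    file="$1"
    # Rewrite only the two assignments named in this thread; keep a .bak copy.
    sed -i.bak \
        -e 's/^BTRFS_BALANCE_PERIOD=.*/BTRFS_BALANCE_PERIOD="none"/' \
        -e 's/^BTRFS_SCRUB_PERIOD=.*/BTRFS_SCRUB_PERIOD="none"/' \
        "$file"
}

# Usage (as root): set_period_none /etc/sysconfig/btrfsmaintenance
```

All other lines of the file are left untouched, and the original is kept next to it with a `.bak` suffix in case the change has to be reverted.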

Obviously the BTRFS balance and scrub have run on my ext4 system several times already. What kind of effect would that have?

Well, looking at YaST > System > Services Manager, I see btrfs-balance, btrfs-scrub and btrfs-trim (all set to “manual”, which IMHO means “off”, in my case).
I assume that apart from using YaST you can use systemctl on those if you prefer.

I have no idea. I vaguely remember that it was “on” after installation on my system, but I don’t have any Btrfs. Now even if this is switched on by default (in case someone starts using a Btrfs file system but forgets to switch them on), it should bail out immediately when started on a system without any Btrfs. Just my idea.

I used the configuration file /etc/sysconfig/btrfsmaintenance to set scrub and balance to none, then disabled btrfsmaintenance-refresh from systemctl and everything seems alright now.

systemctl disable btrfsmaintenance-refresh

Now there is no ~6 min delay at boot anymore. Everything is more or less the way it should be. I am still shocked at the >6 min fstrim on boot. Apparently the BTRFS balance was running along with fstrim. I wonder what was happening… This was on an ext4 partition.

The reason I favour systemd over YaST in most cases is that I’ve been collecting a pile of commands for a “run on fresh install” script that fully customises the installation to my preferences. I’m glad that such features are available in YaST as well.

Hi
To ensure it is really disabled, it pays to mask it (link it to /dev/null) as well…


systemctl mask btrfsmaintenance-refresh 
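
The disable and mask steps from this thread can be combined in a small helper. This is only a sketch: `disable_and_mask` and the `SYSTEMCTL` override variable are made up for illustration, while `disable --now` and `mask` are regular systemctl commands.

```shell
#!/bin/sh
# SYSTEMCTL can be pointed at another command (e.g. "echo") for a dry run.
# The variable is an illustration device, not something systemd reads.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# disable_and_mask UNIT...: stop and disable each unit, then mask it so that
# nothing else can start it. The function name is hypothetical.
disable_and_mask() {
    for unit in "$@"; do
        $SYSTEMCTL disable --now "$unit" || return 1
        $SYSTEMCTL mask "$unit" || return 1
    done
}

# Usage (as root): disable_and_mask btrfsmaintenance-refresh.service
# Dry run:         SYSTEMCTL=echo; disable_and_mask btrfsmaintenance-refresh.service
```

A masked unit can be brought back later with `systemctl unmask`.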

What I tried to show is that you can get from YaST the information you asked for. After that you can of course use what you prefer.

That’s something new to me. Great! Thanks.

Yes, of course. But IMHO the questions are:

  • Why is it running at all on a system without any btrfs file system in use?
  • Why does it take 6 minutes to do nothing useful?

To me it looks as if there might be Bugzilla cases here, but I am not sure. Maybe someone has a plausible explanation.

Sorry, my post above assumed that that was included in your systemctl knowledge.

That makes @malcolmlewis’s post all the more valuable.

btrfsmaintenance-refresh.service can trigger a systemd reload, which in turn can cause numerous symptoms. If you don’t have btrfs on your system, remove and lock the package btrfsmaintenance. If you want to use it, fix it: SDB:Fix btrfsmaintenance-refresh - openSUSE Wiki

That is all nice, but isn’t it an error of some sort that should be cured?

Why should anything doing maintenance on BTRFS run on a system that doesn’t have it in use?
And when it does run for whatever reason (maybe a leftover, or some other glitch), shouldn’t it first and foremost check whether there is any BTRFS file system and then bail out within a few milliseconds?
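
Such an early check could be sketched roughly as follows. This is not how btrfsmaintenance works; `has_btrfs` is a hypothetical helper that just looks at the filesystem type column of the mount table.

```shell
#!/bin/sh
# has_btrfs [MOUNTS_FILE]: succeed (exit 0) if any mounted filesystem is btrfs.
# Reads /proc/self/mounts by default; the third field is the filesystem type.
# has_btrfs is a hypothetical helper name.
has_btrfs() {
    awk '$3 == "btrfs" { found = 1 } END { exit (found ? 0 : 1) }' \
        "${1:-/proc/self/mounts}"
}

# A maintenance script could bail out early like this:
#   has_btrfs || exit 0
```

On a system with only ext4, vfat and ntfs mounts the function fails, so the guard line would end the script right away.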

Yes it should and the fix is pending (forever?): 1165780 – Reduce the number of PID1 reloading triggered by btrfsmaintenance

I do not understand much of it. But to me it seems that they are busy repairing something related to using btrfsmaintenance, which might be a good thing in itself for those who use Btrfs.

But for those who don’t use it, there should not be any btrfsmaintenance run whatsoever. And while I admit I only glanced through it, I did not see any reference to something like: do not use any of these btrfs things when not needed.

Not along with it, but after it. The btrfsmaintenance routines pull in fstrim.service; thankfully, they don’t run concurrently. You shouldn’t have fstrim at boot, but should have it run on a timer instead. That timer is probably on. Still, I’d look at “sudo journalctl -u fstrim” because it’s taking quite a lot of time.

  1. The rationale is that someday the user might have btrfs on the system.
  2. It’s a lot of time for an SSD. Maybe it’s misidentifying the HDD as a trimmable device?
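
Whether the kernel considers a disk trimmable can be read from sysfs. The sketch below assumes the usual `/sys/block/<dev>/queue/rotational` and `/sys/block/<dev>/queue/discard_max_bytes` attributes; `list_trim_support` is a hypothetical helper name.

```shell
#!/bin/sh
# list_trim_support [SYSFS_BLOCK_DIR]: for each block device print
#   "<name> rotational=<0|1> discard_max_bytes=<n>"
# discard_max_bytes=0 means the kernel sees no discard (TRIM) support.
# list_trim_support is a hypothetical helper name.
list_trim_support() {
    base="${1:-/sys/block}"
    for dev in "$base"/*; do
        # Skip entries that do not expose both attributes.
        [ -r "$dev/queue/rotational" ] || continue
        [ -r "$dev/queue/discard_max_bytes" ] || continue
        printf '%s rotational=%s discard_max_bytes=%s\n' \
            "${dev##*/}" \
            "$(cat "$dev/queue/rotational")" \
            "$(cat "$dev/queue/discard_max_bytes")"
    done
}

# Usage: list_trim_support
```

A platter HDD would normally show `rotational=1` and `discard_max_bytes=0`. `lsblk --discard` presents similar information in its DISC-GRAN and DISC-MAX columns.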

I believe it doesn’t address the case when no btrfs is found in the system.

From /etc/sysconfig/btrfsmaintenance:

# Which mountpoints/filesystems to balance periodically. This may reclaim unused
# portions of the filesystem and make the rest more compact.
# (Colon separated paths)
# The special word/mountpoint "auto" will evaluate all mounted btrfs
# filesystems
BTRFS_BALANCE_MOUNTPOINTS="/"

I’m sure your fstrim service is running as long as it needs to run.
This means that disabling the service or functionality is shortsighted and will lead to problems after a period of time (depends on your read/write activity on your SSD).
The eventual consequence is that your SSD will be filled with traps marked for file deletion, and until they are “trimmed” they won’t be available for re-use.

When you run out of available traps to write to, it’s the same as filling up your disk until you run out of space, and can result in being unable to boot.

So, the better recommendation is one of the following:

  1. Manually run a utility that clears the traps. There are tools other than fstrim, or you can manually invoke the service at a time when it won’t disturb your work.
  2. Re-enable the service and just wait for the traps to be cleared. The long wait should happen only once unless you do massive reads/writes. Considering this is your root partition or volume, you should focus on server apps or other apps which run in your root partition rather than your /home.

TSU

TSU, just to make it clear. My SSD does get regularly trimmed and the operation always takes under 1 minute. I just started noticing slow boot recently and based on

**#** systemctl list-timers

apparently fstrim was running with btrfs-scrub: they started and ended together. Fstrim took forever, presumably as a result of some form of conflict that I didn’t investigate. I did not disable fstrim; I only disabled and blocked the btrfs maintenance features, and everything is operating normally.

I should also mention that I have no btrfs partitions at all.

Just a follow-up. The laptop recently booted with fstrim lasting ~11 minutes on LEAP 15.1, with the BTRFS services disabled. It’s quite hard for me to understand why, because there have not been any major writes or deletes on the SSD in a long time. In fact, the laptop was stored away for a week and I’ve only used W10 (the boot SSD is shared between W10, TW and LEAP 15.1).

Okay,

I am starting to get irritated because a ~7 minute fstrim is happening quite regularly now on both LEAP 15.2 and LEAP 15.1, each with an ext4 / on the SSD.

I do not wish to disable fstrim, but it is possible that the mountpoints are causing fstrim to try to “trim” HDD portions as well. Will you please let me know if any of the following are issues:

  1. My /home is a separate ext4 partition and is mounted at /home by fstab on boot.
  2. There is a symbolic link at / pointing to a directory within the ext4 /home partition. This is for a proprietary software package and is a requirement from the vendor who distributed it.

On both the LEAP 15.1 and 15.2 ext4 installations, I have the BTRFS features turned off and masked, and the snapper features locked. So far I have not experienced the issue on my TW installation, which is a BTRFS installation with only the grub snapper feature removed and locked.

I have a feeling the symbolic link is causing the issue.
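
One way to narrow this down would be to trim each mountpoint separately and time it. The following is a rough sketch: `time_trim` and the `FSTRIM` override variable are made up for illustration, and only the `-v` (verbose) option of fstrim is used.

```shell
#!/bin/sh
# FSTRIM can be overridden for a dry run (e.g. FSTRIM=true).
# The variable is an illustration device, not something fstrim reads.
FSTRIM="${FSTRIM:-fstrim}"

# time_trim MOUNTPOINT...: trim each mountpoint on its own and report the exit
# status and the elapsed whole seconds. Needs root on real filesystems.
# time_trim is a hypothetical helper name.
time_trim() {
    for mp in "$@"; do
        start=$(date +%s)
        if $FSTRIM -v "$mp"; then status=0; else status=$?; fi
        end=$(date +%s)
        echo "$mp: exit=$status elapsed=$((end - start))s"
    done
}

# Usage (as root): time_trim / /home
```

If `/home` on the HDD reports an error or finishes almost instantly while `/` takes minutes, the time is being spent on the SSD itself. fstrim works on mounted filesystems, so a symbolic link that merely points into `/home` would not be expected to add another filesystem to trim.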