Extremely slow boot for new Tumbleweed GNOME installation

I recently installed openSUSE Tumbleweed with the GNOME desktop on my laptop, and booting takes about 30 minutes. I know that when I install with KDE instead of GNOME, it boots much faster (about a minute), but still much slower than Manjaro (about 15 seconds). The only things I customized during the installation were to increase the swap size, set noatime instead of relatime, and enable lzo compression. I saw another thread where someone was having minor boot issues and they asked for the results of

systemd-analyze critical-chain

, so I have included it below:


graphical.target @10min 8.916s
└─multi-user.target @10min 8.916s
  └─getty.target @10min 8.916s
    └─getty@tty1.service @10min 8.916s
      └─systemd-user-sessions.service @4.125s +5ms
        └─remote-fs.target @4.124s
          └─iscsi.service @4.115s +8ms
            └─iscsid.service @4.093s +21ms
              └─network.target @4.092s
                └─NetworkManager.service @4.062s +29ms
                  └─network-pre.target @4.061s
                    └─firewalld.service @3.463s +597ms
                      └─polkit.service @3.517s +266ms
                        └─basic.target @3.407s
                          └─paths.target @3.407s
                            └─issue-generator.path @3.407s
                              └─sysinit.target @3.406s
                                └─apparmor.service @421ms +2.984s
                                  └─var.mount @412ms +7ms
                                    └─local-fs-pre.target @406ms

Any help would be greatly appreciated. I am not sure if this is a software bug or a hardware support issue, but something is definitely wrong.

**systemd-analyze plot** might give you more information.

https://doc.opensuse.org/documentation/leap/reference/html/book.opensuse.reference/cha.systemd.html#sec.boot.systemd.debug.time

I’ve been researching ways to improve startup times on my systems (Atari ST, classic MacOS, OS X, Linux) for decades, mostly out of curiosity.

With my most recent openSUSE-Leap-based main desktop (a 5-year-old Core i5, Mini-ITX), I’ve tried concentrating on minimalism and simplicity right from the start, and I began systematically reducing boot times while experimenting:
[ul]
[li]6.235s: Leap 42.2 with wicked, IPv6 and Plymouth disabled
[/li][li]3.737s: replace wicked with NetworkManager (and later: use only systemd-networkd with fixed IPv4 address)
[/li][li]3.528s: replace Postfix with Exim
[/li][li]2.992s: ntpd off, smartd off, irqbalance off
[/li][li]2.691s: replace sddm with no-frills minimally configured kdm
[/li][li]2.567s: cupsd off (not doing any printing)
[/li][li]2.326s: reduce grub2 output and kernel/console output to errors only (see below)
[/li][li]2.127s: switch from noop scheduling to deadline
[/li][li]etc, etc… see below.
[/li][/ul]
This led me to collate a little list of things for you to try, if you’re so inclined:
[ul]
[li]use one fast SSD; I still think this and the dracut stuff below may have been the greatest improvements to any system
[/li][li]use one MBR-style, primary, properly block-aligned ext4 »/« partition (I have no use for btrfs, makes things really simple)
[/li][li]mount options: noatime,acl,user_xattr (no automatic discard/trim, I kick off fstrim manually after each full backup once a month)
[/li][li]no swap, because I don’t use hibernation/sleep/standby
[/li][li]disable Plymouth (unnecessary eye candy)
[/li][li]disable AppArmor (arguably no evidence of additional security)
[/li][li]use unthemed kdm as a display manager (much faster initialization times compared to gdm and sddm)
[/li][li]optimize your graphical session accordingly: no startup animations, no unnecessary agents/frills/helpers/gadgets/widgets/animations/eye candy
[/li][li]use exim (postfix takes a while to initialize) or disable local mail processing altogether
[/li][li]try using [b]systemd-networkd[/b] only (I don’t need wicked’s complexity and slow boot-time initialisations, and I don’t usually use a WLAN at home, so no NetworkManager too)
[/li][li]IF you have a separate firewall: disable SuSEfirewall and other iptables-based filters
[/li][li]in YaST’s »Services Manager«, disable as many services as you are comfortable with, for example bluez, smartd, ModemManager, iscsi, nmb+Samba and — again — Plymouth.
[/li][li]make the kernel just write warnings or errors during boot onto the screen, nothing else; relevant kernel parameters:
```
loglevel=4 systemd.show_status=auto
```
[/li][li]customize your initial RAM-disk ([i]initrd[/i]) with [b]dracut[/b], e.g. issuing as root user something like:
```
dracut --hostonly --force --omit "img-lib cifs fcoe fcoe-uefi multipath iscsi qemu lvm mdraid dm dmraid pollcdrom plymouth btrfs samba"
```
Caution! Your needs may differ from mine! If everything works after rebooting and testing, make it permanent in a custom dracut config file like /etc/dracut.conf.d/01-my-own-dracut.conf:
```
hostonly="yes"
compress="cat"

omit_dracutmodules+=" network kernel-network-modules ifcfg img-lib cifs fcoe fcoe-uefi rdma multipath iscsi qemu lvm mdraid dm dmraid cdrom pollcdrom plymouth btrfs wacom convertfs wicked ipv6 mtp-probe warpclock i18n "

omit_drivers+=" usb-storage uas ums-* snd soundcore snd-* hid-wiimote wacom hv_vmbus rmi_core dm-mod iscsi_if iscsi_tcp dm_multipath parport pcmcia jsm cdrom serial "
```
[/li][li]If you’re not a kernel developer, reduce sizes of kernel objects; as root, do a
```
find /lib/modules -name '*.ko' -exec strip --strip-unneeded {} +
```
… this helps dracut to build slightly smaller initrd’s.
[/li]
[/ul]
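For the kernel-parameter item in the list above, here is a minimal sketch of how the parameters could be made persistent. It assumes the system boots via GRUB2 and uses the usual openSUSE locations /etc/default/grub and /boot/grub2/grub.cfg. The helper name add_kernel_params is made up for illustration, and it only handles parameters that contain no »/« or »&« characters.

```shell
#!/bin/sh
# Hypothetical helper: append kernel parameters to the
# GRUB_CMDLINE_LINUX_DEFAULT line of a grub defaults file.
# Usage: add_kernel_params FILE PARAM...
add_kernel_params() {
    file=$1
    shift
    params=$*
    # Insert the new parameters just before the closing quote of the value.
    # sed -i is the GNU sed in-place option.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 ${params}\"/" "$file"
}

# Assumed use as root, then regenerate the GRUB2 menu:
#   add_kernel_params /etc/default/grub loglevel=4 systemd.show_status=auto
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```

Keeping a copy of /etc/default/grub before editing makes it easy to go back if the quieter boot hides a message you wanted to see.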
Every change helped a bit; the SSD and dracut even dramatically, as already mentioned.
So this is my boot from this morning, it’s a pretty typical one:

rig:~ :arrow_forward: systemd-analyze; systemd-analyze blame
Startup finished in 231ms (kernel) + 687ms (initrd) + 387ms (userspace) = 1.306s
110ms systemd-journal-flush.service
89ms udisks2.service
81ms upower.service
77ms display-manager.service
73ms systemd-udevd.service
71ms polkit.service
47ms klog.service
31ms systemd-udev-trigger.service
30ms systemd-tmpfiles-setup.service
21ms user@1000.service
20ms systemd-tmpfiles-setup-dev.service
19ms systemd-update-utmp.service
16ms systemd-modules-load.service
14ms dev-hugepages.mount
14ms systemd-logind.service
13ms systemd-tmpfiles-clean.service
13ms systemd-fsck-root.service
12ms systemd-remount-fs.service
11ms sys-kernel-debug.mount
10ms systemd-sysctl.service
10ms systemd-networkd.service
9ms dev-mqueue.mount
8ms systemd-journald.service
6ms systemd-random-seed.service
5ms dracut-shutdown.service
4ms systemd-update-utmp-runlevel.service
3ms systemd-user-sessions.service
3ms rtkit-daemon.service
2ms kmod-static-nodes.service
1ms systemd-vconsole-setup.service
rig:~ :arrow_forward: _


The »critical chain«:

rig:~ :arrow_forward: systemd-analyze critical-chain
graphical.target @382ms
└─display-manager.service @305ms +77ms
  └─systemd-logind.service @290ms +14ms
    └─basic.target @249ms
      └─sockets.target @248ms
        └─dbus.socket @248ms
          └─sysinit.target @248ms
            └─systemd-update-utmp.service @228ms +19ms
              └─systemd-tmpfiles-setup.service @197ms +30ms
                └─systemd-journal-flush.service @86ms +110ms
                  └─systemd-remount-fs.service @71ms +12ms
                    └─systemd-fsck-root.service @584542y 2w 2d 20h 1min 49.362s +13ms
                      └─systemd-journald.socket
                        └─-.mount
                          └─system.slice
                            └─-.slice
rig:~ :arrow_forward: _



Finally, using…

systemd-analyze plot > boot.svg

 … and using Gimp to convert the SVG to PNG, here’s the accompanying boot plot:

[IMG]https://susepaste.org/images/87486322.png[/IMG]

My best time was a reboot several months ago: 1.211 seconds. A fluke, I guess, when systemd just had its boot jobs arranged perfectly, with optimal concurrency.
(The fact that a reboot usually re-uses a certain momentum in an already running system with initialized components and full caches etc may have helped too.)
My worst (re)boot time is usually directly after kernel updates, when an additional [i]purge-kernel[/i] job uses several seconds for housekeeping. 

I realize that several of my actions may be an absolute [i]no-no[/i] for you guys. For example, you may have external, network-accessible storage, and you may need Samba, ntpd, IPv6, iSCSI etc for that. (I use only SSH for transfers, and I start sshd only on-demand). But I think it’s fun to have at least one system for testing those things out, for optimizing, of course also for learning about all those intricacies and adding knowledge to the personal toolbox.
Cheers!

Based on the plot, the 3 services that are taking forever to start are:

plymouth-quit-wait.service
btrfsmaintenance-refresh.service
chrony-wait.service

These services take about 10 minutes to start. The question is, any idea why they are extremely slow and how can I debug/fix them? My laptop has a nice SSD and can boot in less than 15 seconds on Manjaro Linux. Obviously, this is an issue with openSUSE and its services. I don’t think this is a boot time optimization issue, but a bug of some form.
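One way to narrow this down, as a rough sketch: filter the output of systemd-analyze blame for units above a threshold, then read the journal of each suspect. The helper slow_units is hypothetical (it is not part of systemd) and assumes blame lines of the form »duration tokens, then unit name«.

```shell
#!/bin/sh
# Hypothetical helper: read `systemd-analyze blame` output on stdin and
# print "<seconds> <unit>" for units that took at least LIMIT seconds.
# Usage: systemd-analyze blame | slow_units LIMIT
slow_units() {
    awk -v limit="$1" '
    {
        total = 0
        # Every field except the last is a duration token: 10min, 8.916s, 597ms
        for (i = 1; i < NF; i++) {
            tok = $i
            if      (tok ~ /ms$/)  total += substr(tok, 1, length(tok) - 2) / 1000
            else if (tok ~ /us$/)  total += substr(tok, 1, length(tok) - 2) / 1000000
            else if (tok ~ /min$/) total += substr(tok, 1, length(tok) - 3) * 60
            else if (tok ~ /h$/)   total += substr(tok, 1, length(tok) - 1) * 3600
            else if (tok ~ /s$/)   total += substr(tok, 1, length(tok) - 1)
        }
        if (total >= limit) printf "%.3f %s\n", total, $NF
    }'
}

# Assumed follow-up for each unit it reports, e.g.:
#   systemd-analyze blame | slow_units 5
#   journalctl -b -u chrony-wait.service
#   systemctl status plymouth-quit-wait.service
```

The journal of a "wait" unit usually shows what it was waiting for, which tends to be more useful than the raw duration.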

Did you activate ntp? If so, disable it. I’ve seen more reports where chrony and ntp were both installed, resulting in long boot times.
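A quick way to check for that conflict, sketched under the assumption that the units are named chronyd.service, chrony-wait.service and ntpd.service (names can differ between releases). The function name time_sync_status is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the enablement state of the time-sync units.
# The unit names below are assumptions; adjust them to what is installed.
time_sync_status() {
    for unit in chronyd.service chrony-wait.service ntpd.service; do
        # is-enabled exits non-zero for disabled or missing units,
        # so ignore the exit status and keep only the printed state.
        state=$(systemctl is-enabled "$unit" 2>/dev/null) || true
        printf '%s %s\n' "$unit" "${state:-unknown}"
    done
}

time_sync_status
```

If both chronyd and ntpd show up as enabled, something like `systemctl disable --now ntpd.service` (as root) turns one of them off.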

I didn’t intentionally activate it when installing, but that was definitely the issue. Thanks!

@unix111

Thanks for the info. I barely started researching this topic.

-H, --hostonly
Host-Only mode: Install only what is needed for booting the local host instead of a generic host.

I’m sure this is obvious to someone, but I don’t understand what a “local host” and a “generic host” are in this case.

I’m also confused when you say “make it permanent in a custom dracut config”. I thought the dracut command does make those things permanent.

This would make a very useful wiki article. wink wink nudge nudge

These are the times from cold boot:


erlangen:~ # systemd-analyze 
Startup finished in 18.362s (firmware) + 4.037s (loader) + 1.523s (kernel) + 1.155s (initrd) + 2.624s (userspace) = 27.703s 
graphical.target reached after 2.619s in userspace
erlangen:~ # systemd-analyze blame |head -22
           800ms systemd-networkd.service
           693ms display-manager.service
           632ms udisks2.service
           499ms firewalld.service
           244ms smartd.service
           236ms systemd-udevd.service
           236ms postfix.service
           222ms initrd-switch-root.service
           206ms systemd-logind.service
           153ms systemd-journald.service
           134ms issue-generator.service
            97ms home\x2dHDD.mount
            83ms iscsid.service
            77ms systemd-journal-flush.service
            75ms initrd-parse-etc.service
            72ms upower.service
            72ms kbdsettings.service
            68ms apache2.service
            51ms systemd-udev-trigger.service
            46ms user@1000.service
            38ms auditd.service
            35ms polkit.service
erlangen:~ # systemd-analyze critical-chain 
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.

graphical.target @2.619s
└─display-manager.service @1.926s +693ms
  └─apache2.service @1.855s +68ms
    └─remote-fs.target @1.854s
      └─iscsi.service @1.851s +3ms
        └─iscsid.service @1.767s +83ms
          └─network.target @1.763s
            └─systemd-networkd.service @962ms +800ms
              └─network-pre.target @961ms
                └─firewalld.service @462ms +499ms
                  └─dbus.service @459ms
                    └─basic.target @456ms
                      └─sockets.target @456ms
                        └─iscsid.socket @456ms
                          └─sysinit.target @455ms
                            └─systemd-update-utmp.service @447ms +7ms
                              └─auditd.service @408ms +38ms
                                └─systemd-tmpfiles-setup.service @377ms +30ms
                                  └─systemd-journal-flush.service @298ms +77ms
                                    └─systemd-journald.service @144ms +153ms
                                      └─haveged.service
                                        └─systemd-journald.socket
                                          └─-.mount
                                            └─systemd-journald.socket
                                              └─...
erlangen:~ # 

Optimization made: zypper rm --clean-deps plymouth-*.

My understanding is that it’s about the range of supported drivers, kernel modules and boot-time services included in the initial RAM disk (initrd). »Generic host« would mean all openSUSE-supported hardware with any known filesystem, from tiny Raspberry Pis to huge datacenter servers, which leads to quite large initrd images.

Want to boot via IPv6, Bluetooth or WLAN? Or the ability to boot from any filesystem, be it ext2/3/4, btrfs, reiserfs, XFS, bcache, ZFS etc? Boot via USB or CD/DVD? Then your dracut needs several USB and cdrom drivers. Or from virtual, clustered partitions, via LVM/SAN/FibreChannel/dmraid/mdraid? Then you need all that boot-time SLES/datacenter stuff. Need early soundcard access for adding entropy to the random number generator (urandom)? Then sound drivers need to be included in your initrd.

If not, then this can be an opportunity to optimize. From the dracut manpage:

If you want to create lighter, smaller initramfs images, you may want to specify the --hostonly or -H option.
Using this option, the resulting image will contain only those dracut modules, kernel modules and filesystems,
which are needed to boot this specific machine. This has the drawback, that you can’t put the disk on another
controller or machine, and that you can’t switch to another root filesystem, without recreating the initramfs
image. The usage of the --hostonly option is only for experts and you will have to keep the broken pieces. At
least keep a copy of a general purpose image (and corresponding kernel) as a fallback to rescue your system.

You’re right, I expressed myself poorly. No matter if via command line or dracut config, the initrd images you create (and zypper&YaST create for you) are permanent until an update makes another dracut run necessary.

Dracut creates/overwrites one initrd for each installed kernel. Because openSUSE automagically keeps the »old« kernel around as a GRUB2 option to boot into (in case the »new« kernel doesn’t work), this usually means 2 dracut runs per kernel/driver/dracut update. That’s why I move the old kernel aside as soon as I am sure the new one works. It saves me time and my SSD the disk activity of one additional dracut run.

I usually have a peek with »ls -lart /boot/« to see how large and how recent the newest initrd is, and to judge differences pre/post updates or while experimenting. Also, »sudo lsinitrd« is nice, if for nothing else than to get an impression as to how much stuff is in those initial RAM disks: drivers, systemd, rescue system and so on.

In the beginning, when I wasn’t yet sure about what I could omit, I supplied my parameters on the command line with each manual dracut run (sudo dracut --hostonly --no-compress --force --omit "img-lib cifs fcoe…), leading to awkwardly long and repetitive commands. I discovered that I could »make my dracut setup permanent« for every future dracut run within the above config file named »/etc/dracut.conf.d/01-my-own-dracut.conf«; with it, a brief »sudo dracut --force« suffices to build the initrd according to my wishes each time. Another advantage of the config file: zypper/RPM post-install scripts and YaST modules that trigger dracut runs automatically honor my config file as well.
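To illustrate the config-file approach, a minimal sketch. The function name write_dracut_conf is hypothetical, and the single omitted module (plymouth) is only a small example taken from the longer list earlier in the thread; your own list will differ.

```shell
#!/bin/sh
# Hypothetical helper: write a dracut drop-in into DIR
# (default: /etc/dracut.conf.d). The settings are an illustrative subset.
write_dracut_conf() {
    dir=${1:-/etc/dracut.conf.d}
    cat > "$dir/01-my-own-dracut.conf" <<'EOF'
hostonly="yes"
compress="cat"
omit_dracutmodules+=" plymouth "
EOF
}

# Assumed workflow as root:
#   write_dracut_conf          # create the drop-in
#   dracut --force             # rebuild the initrd for the running kernel
#   lsinitrd -m                # list the dracut modules that ended up in it
```

Because the settings live in a file, every later dracut run picks them up, whether it is started by hand or by a package update.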

One quirky detail: It turned out that my system can load uncompressed (but therefore larger) initrd images from SSD quite a bit quicker than the time the kernel takes to decompress dozens of megabytes of bzipped initrd data. That’s why trying out the »--no-compress« option alone can be worthwhile (the kernel »knows« how to handle such an initrd). However, what’s well hidden is the fact that this command-line option translates to the line »compress="cat"« in my 01-my-own-dracut.conf. (I think »cat« refers to the age-old UNIX tradition of »concatenating« files to make uucp/tar archives, RPM/Debian packages and initrd images; while other environments like MacOS have been using another approach: make a filesystem, copy files into it, then pack and compress that filesystem.)

I don’t know yet how to do that; and for the person who does: the article would have to be garnished with just so many caveats, exceptions, preambles and warnings. Because some of those points I listed initially, though reversible, can be dangerous and lead to an incomplete, unbootable or (even worse) an insecure system, singlehandedly disabling AppArmor and firewalls left and right. Also, despite having worked in IT for about 30 years, I write from an enthusiast/hobbyist/amateur/experimentalist point of view. I don’t consider myself an expert on any of the topics mentioned; that becomes clear to me each time I talk with real experts. :slight_smile:

A short list of things I forgot to mention but could be useful too:

  • IF you are short on RAM and IF you’re using a swap partition, then playing with the »swappiness« and cache-pressure parameters could be worthwhile. (Since RAM prices have been very customer-friendly recently, both my home desktop and my laptop now have 8GB of RAM, so I don’t use swap anymore; my last used vm.swappiness setting was 10.)
  • The act of disabling stuff at boot time doesn’t mean you can’t use it as soon as booting is finished and you’re logged in. For example, I still have audio support despite disabling it in the dracut config, because the kernel module does get loaded anyway after the early-boot phase, as soon as PulseAudio initializes. In order to really disable something like that, I would have to blacklist it like I did with Nouveau/Bluetooth/Wifi/IPv6/cdrom support. Services I keep disabled usually, I still can activate on-demand; sshd is one example that comes to mind.
  • Programmers’ advice: avoid premature optimization. In the case of bren077s (OP), those plymouth-quit-wait/btrfsmaintenance/ntp/chrony-wait services may be the real culprits to tackle first.
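For the swappiness and blacklist points, a small sketch of what the config fragments could look like. The two function names and the target file names are made up for illustration; the swappiness value 10 is the one mentioned above, while the cache-pressure value is just a placeholder.

```shell
#!/bin/sh
# Hypothetical helpers that print config fragments to stdout.

# swappiness_conf SWAPPINESS CACHE_PRESSURE: sysctl.d fragment
swappiness_conf() {
    printf 'vm.swappiness = %s\nvm.vfs_cache_pressure = %s\n' "$1" "$2"
}

# blacklist_conf MODULE...: modprobe.d fragment, one line per module
blacklist_conf() {
    for module in "$@"; do
        printf 'blacklist %s\n' "$module"
    done
}

# Assumed use as root (file names are examples):
#   swappiness_conf 10 50 > /etc/sysctl.d/90-local-vm.conf && sysctl --system
#   blacklist_conf nouveau > /etc/modprobe.d/90-local-blacklist.conf
```

A blacklisted module that is also packed into the initrd may still load early, so a dracut rebuild after changing the blacklist is probably a good idea.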

Cheers!

Thank you! I really appreciate it.

To create wiki articles I always just run a search on the exact title I want to use, e.g. “SDB:Streamline the Boot Process”.
If the article does not exist, then the results page will say “Create the page “SDB:Streamline the Boot Process” on this wiki!”

https://en.opensuse.org/SDB:Howto

Clicking “Edit” on an existing article is probably the quickest way to learn the formatting.
https://en.opensuse.org/SDB:Disk_space