wpa_supplicant@.service fails on boot, succeeds on restart

Context: I switched from NetworkManager to systemd-networkd.

With NetworkManager it doesn’t fail, but takes about 25s to get conectivity. systemd-networkd is operation but there’s this situation on boot.

Most threads I found about these errors mention p2p-dev-*, which is not the case here.

On boot:

1605152282.100438: Successfully initialized wpa_supplicant
1605152282.114853: Could not read interface wlp3s0 flags: No such device
1605152282.114893: nl80211: Driver does not support authentication/association or connect commands
1605152282.114898: nl80211: deinit ifname=wlp3s0 disabled_11b_rates=0
1605152282.118068: Could not read interface wlp3s0 flags: No such device
1605152282.118096: wlp3s0: Failed to initialize driver interface

It works once I restart wpa_supplicant@wlp3s0 manually:


1605152298.583449: Successfully initialized wpa_supplicant
1605152303.578489: wlp3s0: SME: Trying to authenticate with 22:cb:5f:cb:0b:20 (SSID='SSID' freq=2412 MHz)
1605152303.627920: wlp3s0: Trying to associate with 22:cb:5f:cb:0b: (SSID='SSID' freq=2412 MHz)
1605152303.672058: wlp3s0: Associated with 22:cb:5f:cb:0b:
1605152303.672083: wlp3s0: CTRL-EVENT-SUBNET-STATUS-UPDATE status=0
1605152303.866410: wlp3s0: WPA: Key negotiation completed with 22:cb:5f:cb:0b: [PTK=CCMP GTK=CCMP]
1605152303.866461: wlp3s0: CTRL-EVENT-CONNECTED - Connection to 22:cb:5f:cb:0b: completed [id=0 id_str=]

Replacing the wlp3s0 instance with wlan0 gives different errors on boot (sorry, didn’t save).

To me it looks like it runs before udev renames this interface.

The wpa_supplicant.service is slightly different. Could be that with NM it waits longer for dbus.

$ diff /usr/lib/systemd/system/wpa_supplicant.service /usr/lib/systemd/system/wpa_supplicant@.service
< Description=WPA Supplicant daemon
< After=dbus.service
< Before=network-pre.target
< Wants=network-pre.target
---
> Description=WPA Supplicant daemon (interface %i)
> After=dbus.service network.target
< ExecStart=/usr/sbin/wpa_supplicant -c /etc/wpa_supplicant/wpa_supplicant.conf -u -t -f /var/log/wpa_supplicant.log
---
> ExecStart=/usr/sbin/wpa_supplicant -i%i -c /etc/wpa_supplicant/wpa_supplicant.conf -u -t -f /var/log/wpa_supplicant.log

On my machine wpa_supplicant needs restarting:

**3400G:~ #** journalctl -b -u wpa* -o short-monotonic  
-- Logs begin at Tue 2020-11-10 11:57:05 CET, end at Thu 2020-11-12 07:05:08 CET. -- 
    4.950633] 3400G systemd[1]: Starting WPA Supplicant daemon (interface wlp7s0)... 
    4.960669] 3400G systemd[1]: Started WPA Supplicant daemon (interface wlp7s0). 
    5.030936] 3400G systemd[1]: **wpa_supplicant@wlp7s0.service: Main process exited, code=exited, status=255/EXCEPTION** 
    5.031038] 3400G systemd[1]: **wpa_supplicant@wlp7s0.service: Failed with result 'exit-code'.** 
    6.036585] 3400G systemd[1]: wpa_supplicant@wlp7s0.service: Scheduled restart job, restart counter is at 1. 
    6.036732] 3400G systemd[1]: Stopped WPA Supplicant daemon (interface wlp7s0). 
    6.037561] 3400G systemd[1]: Starting WPA Supplicant daemon (interface wlp7s0)... 
    6.042874] 3400G systemd[1]: Started WPA Supplicant daemon (interface wlp7s0). 
**3400G:~ #** 

Modified the service accordingly:


3400G:~ # systemctl cat wpa_supplicant@.service 
# /usr/lib/systemd/system/wpa_supplicant@.service
[Unit]
Description=WPA Supplicant daemon (interface %i)
After=dbus.service network.target

[Service]
Type=dbus
BusName=fi.w1.wpa_supplicant1
ExecStart=/usr/sbin/wpa_supplicant -i%i -c /etc/wpa_supplicant/wpa_supplicant.conf -u -t -f /var/log/wpa_supplicant.log

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/wpa_supplicant@.service.d/override.conf
[Service]
Restart=on-failure
RestartSec=1
3400G:~ # 

Thanks Karl. It means that the supplied service definition is at fault. I found Arch defines the template version in a way more suited for systemd-networkd. I added this to the template definition:


[Unit]
Requires=sys-subsystem-net-devices-%I.device
After=sys-subsystem-net-devices-%I.device

Arch also doesn’t use dbus for this template, but this change is enough to get it up and running. Will report a bug.

The *-wait service is also faster in this machine. 10s vs 25s. It gets online faster than Firefox starts up.

It definitely makes sense to start wpa_supplicant after the device is available:

**3400G:~ #** systemctl cat wpa_supplicant@wlan0.service                            
**# /etc/systemd/system/wpa_supplicant@.service**
[Unit] 
Description=WPA supplicant daemon (interface-specific version) 
Requires=sys-subsystem-net-devices-%i.device 
After=sys-subsystem-net-devices-%i.device 
Before=network.target 
Wants=network.target 

# NetworkManager users will probably want the dbus version instead. 

[Service] 
Type=simple 
#ExecStart=/usr/bin/wpa_supplicant -c/etc/wpa_supplicant/wpa_supplicant-%I.conf -i%I 
ExecStart=/usr/sbin/wpa_supplicant -i%i -c /etc/wpa_supplicant/wpa_supplicant.conf -u -t -f /var/log/wpa_supplicant.log 
[Install] 
Alias=multi-user.target.wants/wpa_supplicant@%i.service 
**3400G:~ #**


There is a slight delay, but it’s barely noticable:

**3400G:~ #** systemd-analyze critical-chain display-manager.service 
The time when unit became active or started is printed after the "@" character. 
The time the unit took to start is printed after the "+" character. 

**display-manager.service +597ms**
└─**apache2.service @1.353s +106ms**
  └─time-sync.target @1.347s 
    └─**chronyd.service @1.292s +53ms**
      └─network.target @1.290s 
        └─wpa_supplicant@wlan0.service @1.289s 
          └─sys-subsystem-net-devices-wlan0.device @1.287s 
**3400G:~ #**

Startup log:

**3400G:~ #** journalctl -b -u systemd-networkd.service -u systemd-resolved.service -u wpa* --no-full -o short-monotonic    
-- Logs begin at Thu 2020-11-12 05:16:03 CET, end at Thu 2020-11-12 17:13:10 CET. -- 
    4.873061] 3400G systemd[1]: Starting Network Service... 
    4.910069] 3400G systemd-networkd[531]: Enumeration completed 
    4.951252] 3400G systemd[1]: Started Network Service. 
    4.954716] 3400G systemd[1]: Starting Network Name Resolution... 
    5.076863] 3400G systemd-resolved[586]: Positive Trust Anchors: 
    5.086877] 3400G systemd-resolved[586]: . IN DS 20326 8 2 e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457…7f8ec8d 
    5.086947] 3400G systemd-resolved[586]: Negative trust anchors: 10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.… 
    5.164840] 3400G systemd-resolved[586]: Using system hostname '3400G'. 
    5.165516] 3400G systemd[1]: Started Network Name Resolution. 
    5.646143] 3400G systemd-networkd[531]: **wlan0: found matching network '/etc/systemd/network/wlan.network', b… ifname**
    5.646257] 3400G systemd-networkd[531]: wlan0: IPv6 successfully enabled 
    5.649013] 3400G systemd[1]: Started WPA supplicant daemon (interface-specific version). 
    5.664995] 3400G systemd-networkd[531]: wlan0: Link UP 
    6.793187] 3400G systemd-networkd[531]: wlan0: Gained carrier 
    6.793577] 3400G systemd-networkd[531]: wlan0: Connected WiFi access point: FRITZ!Box Karl Mistelberger (08:…:d6:d3) 
    6.793656] 3400G systemd-networkd[531]: **wlan0: found matching network '/etc/systemd/network/wlan.network', b… ifname**
    7.804713] 3400G systemd-networkd[531]: wlan0: DHCPv4 address 192.168.178.20/24 via 192.168.178.1 
    8.828476] 3400G systemd-networkd[531]: wlan0: Gained IPv6LL 
**3400G:~ #**

Good luck with USB WiFi dongle.

The current system has:

**3400G:~ #** systemctl list-units --type device |tail                                     
  sys-module-fuse.device                                                                                loaded active plugged /sys/module/fuse                                                             
  **sys-subsystem-net-devices-eth0.device**                                                                 loaded active plugged RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (Onboard Ethernet) 
  **sys-subsystem-net-devices-wlan0.device**                                                                loaded active plugged AR9287 Wireless Network Adapter (PCI-Express)                                

LOAD   = Reflects whether the unit definition was properly loaded. 
ACTIVE = The high-level unit activation state, i.e. generalization of SUB. 
SUB    = The low-level unit activation state, values depend on unit type. 

62 loaded units listed. Pass --all to see loaded but inactive units, too. 
To show all installed unit files use 'systemctl list-unit-files'. 
**3400G:~ #**


Adding a USB WiFi dongle will cause udev to load and activate another device. Thus I presume I need not rely on luck.:\

I tried to improve: Network Management With Systemd - openSUSE Wiki

**3400G:~ #** systemd-analyze critical-chain network.target 
The time when unit became active or started is printed after the "@" character. 
The time the unit took to start is printed after the "+" character. 

network.target @749ms 
└─**systemd-resolved.service @578ms +170ms**
  └─**systemd-networkd.service @524ms +51ms**
    └─**systemd-udevd.service @355ms +167ms**
      └─**systemd-tmpfiles-setup-dev.service @330ms +9ms**
        └─**kmod-static-nodes.service @299ms +21ms**
          └─systemd-journald.socket 
            └─-.mount 
              └─system.slice 
                └─-.slice 
**3400G:~ #**