sssd race condition at startup?

I installed openSUSE Leap 42.2 in a virtual machine (VirtualBox).
I have configured sssd and pam using the configuration from my old openSUSE box, which runs 13.1. As far as I can see, the configuration is identical.
I can log in fine on the 13.1 box but, with the same config, I cannot log in on the new virtual machine running 42.2.
This is what I can see in the logs:

Apr 28 13:04:34 linux-rqzp.suse sddm-helper[1467]: [PAM] Starting…
Apr 28 13:04:34 linux-rqzp.suse sddm-helper[1467]: [PAM] Authenticating…
Apr 28 13:04:34 linux-rqzp.suse sddm-helper[1467]: [PAM] Preparing to converse…
Apr 28 13:04:34 linux-rqzp.suse sddm-helper[1467]: [PAM] Conversation with 1 messages
Apr 28 13:04:34 linux-rqzp.suse sddm-helper[1467]: pam_unix(sddm:auth): authentication failure; logname= uid=0 euid=0 tty= ruser= rhost= user=kenneth
Apr 28 13:04:34 linux-rqzp.suse sddm-helper[1467]: pam_sss(sddm:auth): authentication failure; logname= uid=0 euid=0 tty= ruser= rhost= user=kenneth
Apr 28 13:04:34 linux-rqzp.suse sddm-helper[1467]: pam_sss(sddm:auth): received for user kenneth: 9 (Authentication service cannot retrieve authentication info)
Apr 28 13:04:35 linux-rqzp.suse sddm-helper[1467]: [PAM] authenticate: Authentication service cannot retrieve authentication info
Apr 28 13:04:35 linux-rqzp.suse sddm-helper[1467]: [PAM] returning.
Apr 28 13:04:35 linux-rqzp.suse sddm-helper[1467]: [PAM] Ended.
Apr 28 13:04:35 linux-rqzp.suse sddm[1216]: Authentication error: “Authentication service cannot retrieve authentication info”
Apr 28 13:04:35 linux-rqzp.suse sddm[1216]: Auth: sddm-helper exited with 1
Apr 28 13:04:35 linux-rqzp.suse sddm-greeter[1245]: Message received from daemon: LoginFailed

However, if I log in as root (which is a local user) and restart sssd, this is what I see when I try to log in as kenneth again:

Apr 28 13:04:53 linux-rqzp.suse sddm-helper[1497]: [PAM] Starting…
Apr 28 13:04:53 linux-rqzp.suse sddm-helper[1497]: [PAM] Authenticating…
Apr 28 13:04:53 linux-rqzp.suse sddm-helper[1497]: [PAM] Preparing to converse…
Apr 28 13:04:53 linux-rqzp.suse sddm-helper[1497]: [PAM] Conversation with 1 messages
Apr 28 13:04:53 linux-rqzp.suse sddm-helper[1497]: pam_unix(sddm:auth): authentication failure; logname= uid=0 euid=0 tty= ruser= rhost= user=kenneth
Apr 28 13:04:53 linux-rqzp.suse sssd_be[1493]: GSSAPI client step 1
Apr 28 13:04:53 linux-rqzp.suse sssd_be[1493]: GSSAPI client step 1
Apr 28 13:04:53 linux-rqzp.suse sssd_be[1493]: GSSAPI client step 1
Apr 28 13:04:53 linux-rqzp.suse sssd_be[1493]: GSSAPI client step 2
Apr 28 13:04:54 linux-rqzp.suse sddm-helper[1497]: pam_sss(sddm:auth): authentication success; logname= uid=0 euid=0 tty= ruser= rhost= user=kenneth
Apr 28 13:04:54 linux-rqzp.suse sddm-helper[1497]: [PAM] returning.
Apr 28 13:04:54 linux-rqzp.suse sddm[1216]: Authenticated successfully
Apr 28 13:04:54 linux-rqzp.suse sddm-greeter[1245]: Message received from daemon: LoginSucceeded

This is strange, and I suspect some kind of race condition at system startup. Any bright ideas?

Sorry, this is probably not clear in my post, but:

  • trying to log in as kenneth after a fresh boot -> authentication fails
  • logging in as root, restarting sssd and trying to log in as kenneth again -> authentication succeeds

So something goes wrong when sssd is started in the boot process :\

Did some more investigating. These are the sssd processes right after boot:

linux-k9sb:~ # ps -ef | grep sssd
root 808 1 6 09:56 ? 00:00:01 /usr/sbin/sssd -D -f
root 814 808 0 09:56 ? 00:00:00 /usr/lib/sssd/sssd_be --domain default --uid 0 --gid 0 --debug-to-files
root 844 808 0 09:56 ? 00:00:00 /usr/lib/sssd/sssd_nss --uid 0 --gid 0 --debug-to-files
root 845 808 0 09:56 ? 00:00:00 /usr/lib/sssd/sssd_pam --uid 0 --gid 0 --debug-to-files
root 1490 814 0 09:56 ? 00:00:00 /usr/lib/sssd/sssd_be --domain default --uid 0 --gid 0 --debug-to-files

When I kill the last one, i.e. the one whose parent is also sssd_be (PID 814), login works.
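For anyone who wants to spot such a leftover process without reading the ps output by eye, here is a minimal sketch. It assumes the listing comes from `ps -eo pid,ppid,comm` (three columns, with or without the header line); the function name find_stray_sssd_be is made up for this post and is not part of sssd.

```shell
# Sketch: print the PID of every sssd_be process whose parent is also sssd_be.
# Reads "PID PPID COMMAND" lines on stdin (the format of `ps -eo pid,ppid,comm`).
# find_stray_sssd_be is a hypothetical helper name, not an sssd tool.
find_stray_sssd_be() {
    awk '
        # Skip the header line if ps printed one.
        NR == 1 && $1 == "PID" { next }
        # Remember command name and parent PID for every process.
        { comm[$1] = $3; ppid[$1] = $2; order[++n] = $1 }
        END {
            for (i = 1; i <= n; i++) {
                p = order[i]
                # A stray child: sssd_be forked from another sssd_be.
                if (comm[p] == "sssd_be" && comm[ppid[p]] == "sssd_be")
                    print p
            }
        }
    '
}
```

Used as `ps -eo pid,ppid,comm | find_stray_sssd_be`, this should print 1490 for the listing above. It only reports the PID and deliberately kills nothing, since killing the child is a workaround rather than a fix.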

Found a solution. After digging in the log files I came across this thread:
https://lists.fedorahosted.org/archives/list/sssd-users@lists.fedorahosted.org/thread/EBK72BTHVSE3BQSJHHPQ5YONNNL4RGRM/

referring to this:
https://pagure.io/SSSD/sssd/issue/3016

If adcli is not installed on the system, we fork sssd_be during the password renewal task, but then fail to execute adcli (because it’s not there) and the new sssd_be process stays around.

That is exactly what I saw: a forked sssd_be process that kept hanging around. Installing adcli, although it is not readily available, seems to solve my problem. You can get adcli from here:
https://koji.fedoraproject.org/koji/buildinfo?buildID=840336
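A quick way to check whether a system is likely to be affected is to see if adcli is present at all. This is only a rough sketch: have_cmd and check_adcli are made-up helper names, and the check looks at PATH, whereas sssd may well call adcli from a fixed location, so treat the result as a hint.

```shell
# Sketch: report whether a given command can be found on PATH.
# have_cmd is a hypothetical helper name, not part of sssd or adcli.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical wrapper around have_cmd for the adcli case.
# Assumption: a PATH lookup is a good enough hint; sssd itself may
# use a fixed path to the binary instead.
check_adcli() {
    if have_cmd adcli; then
        echo "adcli found at: $(command -v adcli)"
    else
        echo "adcli not found on PATH" >&2
        return 1
    fi
}
```

Run `check_adcli` as root, since /usr/sbin is not always on a regular user's PATH. If installing adcli is not an option, the sssd-ad man page documents ad_maximum_machine_account_password_age, where a value of 0 disables the renewal attempt; whether that applies here depends on the provider configured in sssd.conf, which is not shown in this thread.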

kennywest,
I haven’t encountered this issue on any SLES or openSUSE system yet, but I would like to try to reproduce it. Posting your sssd.conf and daemon version would be most helpful, if you wouldn’t mind …

– lawrence
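
For the sssd.conf and version requested above, a small sketch of how the file could be cleaned up before posting. It assumes the only secret in the file is an ldap_default_authtok line; redact_sssd_conf is a made-up name, and any other sensitive values (server names, bind DNs) would still need to be removed by hand.

```shell
# Sketch: mask the value of ldap_default_authtok before sharing sssd.conf.
# Reads the config on stdin and writes the redacted copy to stdout.
# redact_sssd_conf is a hypothetical helper name, not an sssd tool.
# Assumption: ldap_default_authtok is the only secret in the file.
redact_sssd_conf() {
    sed -E 's/^([[:space:]]*ldap_default_authtok[[:space:]]*=).*/\1 <redacted>/'
}
```

Typical use would be `redact_sssd_conf < /etc/sssd/sssd.conf` as root, with the output of `sssd --version` or `rpm -q sssd` posted alongside it.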