Samba AD DC won't start - Looks like KDC is crashing.

On or about Mar. 1, there appears to have been an update to Samba 4 and/or its required Kerberos packages. Unfortunately, since then my domain controllers won’t come up. It looks like the KDC is choking on something. Below are the log messages I’m getting in /var/log/warn when I run “systemctl restart samba-ad-dc”.

The CONNECTION_REFUSED appears to be related to the DCE endpoint mapper service, but I’d guess if the KDC doesn’t come up properly, a lot of other stuff won’t either. Oddly enough, the named/bind_dlz combination is working. But I can’t authenticate to the domain any more (obviously, with no KDC).

There is also a message about DEPRECATED enctype that appeared starting Mar. 1.

I have no idea myself what’s going on, though.

Anyone have any ideas?

Thanks!

Mar  4 19:08:14 dc1 samba[2273]: [2020/03/04 19:08:14.148563,  0] ../../source4/smbd/server.c:622(binary_smbd_main)
Mar  4 19:08:14 dc1 samba[2273]:   samba version 4.11.5-git.114.5685848b8fcSUSE-oS15.5-x86_64 started.
Mar  4 19:08:14 dc1 samba[2273]:   Copyright Andrew Tridgell and the Samba Team 1992-2019
Mar  4 19:08:14 dc1 samba[2277]: [2020/03/04 19:08:14.245318,  0] ../../source4/smbd/server.c:865(binary_smbd_main)
Mar  4 19:08:14 dc1 samba[2277]:   binary_smbd_main: samba: using 'prefork' process model
Mar  4 19:08:14 dc1 samba[2277]: [2020/03/04 19:08:14.303237,  0] ../../lib/util/become_daemon.c:135(daemon_ready)
Mar  4 19:08:14 dc1 samba[2277]:   daemon_ready: daemon 'samba' finished starting up and ready to serve connections
Mar  4 19:08:14 dc1 mitkdc[2301]: [2020/03/04 19:08:14.431740,  0] ../../lib/util/util_runcmd.c:352(samba_runcmd_io_handler)
Mar  4 19:08:14 dc1 mitkdc[2301]:   /usr/lib/mit/sbin/krb5kdc: Stash file (null) uses DEPRECATED enctype !
Mar  4 19:08:14 dc1 mitkdc[2301]: [2020/03/04 19:08:14.433711,  0] ../../lib/util/util_runcmd.c:352(samba_runcmd_io_handler)
Mar  4 19:08:14 dc1 mitkdc[2301]:   /usr/lib/mit/sbin/krb5kdc: krb5kdc: starting...
Mar  4 19:08:14 dc1 smbd[2285]: [2020/03/04 19:08:14.467339,  0] ../../lib/util/become_daemon.c:135(daemon_ready)
Mar  4 19:08:14 dc1 smbd[2285]:   daemon_ready: daemon 'smbd' finished starting up and ready to serve connections
Mar  4 19:08:14 dc1 winbindd[2315]: [2020/03/04 19:08:14.544070,  0] ../../source3/winbindd/winbindd_cache.c:3164(initialize_winbindd_cache)
Mar  4 19:08:14 dc1 winbindd[2315]:   initialize_winbindd_cache: clearing cache and re-creating with version number 2
Mar  4 19:08:14 dc1 winbindd[2315]: [2020/03/04 19:08:14.546314,  0] ../../lib/util/become_daemon.c:135(daemon_ready)
Mar  4 19:08:14 dc1 winbindd[2315]:   daemon_ready: daemon 'winbindd' finished starting up and ready to serve connections
Mar  4 19:08:19 dc1 samba[2304]: [2020/03/04 19:08:19.305222,  0] ../../source4/librpc/rpc/dcerpc_sock.c:61(continue_socket_connect)
Mar  4 19:08:19 dc1 samba[2304]:   Failed to connect host 10.10.20.10 on port 135 - NT_STATUS_CONNECTION_REFUSED
Mar  4 19:08:19 dc1 samba[2304]: [2020/03/04 19:08:19.305330,  0] ../../source4/librpc/rpc/dcerpc_sock.c:243(continue_ip_open_socket)
Mar  4 19:08:19 dc1 samba[2304]:   Failed to connect host 10.10.20.10 (3335565d-7b53-4023-84ee-9d3cb83dbfce._msdcs.home.vmnet.us) on port 135 - NT_STATUS_CONNECTION_REFUSED.
( Repeated messages omitted... )
Mar  4 19:08:37 dc1 mitkdc[2301]: [2020/03/04 19:08:37.934585,  0] ../../source4/kdc/kdc-service-mit.c:344(mitkdc_server_done)
Mar  4 19:08:37 dc1 mitkdc[2301]:   The MIT KDC daemon died with exit status 6
Mar  4 19:08:37 dc1 mitkdc[2301]: [2020/03/04 19:08:37.934654,  0] ../../source4/smbd/service_task.c:36(task_server_terminate)
Mar  4 19:08:37 dc1 mitkdc[2301]:   task_server_terminate: task_server_terminate: [mitkdc child process exited]
Mar  4 19:08:37 dc1 samba[2277]: [2020/03/04 19:08:37.935377,  0] ../../source4/smbd/server.c:370(samba_terminate)
Mar  4 19:08:37 dc1 samba[2277]:   samba_terminate: samba_terminate of samba 2277: mitkdc child process exited
Mar  4 19:08:37 dc1 systemd[1]: samba-ad-dc.service: Failed with result 'exit-code'.
Mar  4 19:08:38 dc1 systemd-coredump[2386]: Process 2307 (krb5kdc) of user 0 dumped core.


Stack trace of thread 2307:
#0  0x00007fe501730ea1 raise (libc.so.6 + 0x3bea1)
#1  0x00007fe50171a53d abort (libc.so.6 + 0x2553d)
#2  0x00007fe500b93392 n/a (libtalloc.so.2 + 0x3392)
#3  0x00007fe500b93805 n/a (libtalloc.so.2 + 0x3805)
#4  0x00007fe500e8a0ec mit_samba_get_pac (samba.so + 0x60ec)
#5  0x00007fe500e8baa5 kdb_samba_db_sign_auth_data (samba.so + 0x7aa5)
#6  0x000055ef3be7eee2 n/a (krb5kdc + 0x10ee2)
#7  0x000055ef3be89e31 n/a (krb5kdc + 0x1be31)
#8  0x000055ef3be7c9e8 n/a (krb5kdc + 0xe9e8)
#9  0x000055ef3be7bc02 n/a (krb5kdc + 0xdc02)
#10 0x000055ef3be8141d n/a (krb5kdc + 0x1341d)
#11 0x000055ef3be8b385 n/a (krb5kdc + 0x1d385)
#12 0x000055ef3be79ce1 n/a (krb5kdc + 0xbce1)
#13 0x00007fe5015a5338 verto_fire (libverto.so.1 + 0x3338)
#14 0x00007fe4fb66c386 ev_invoke_pending (libev.so.4 + 0x5386)
#15 0x00007fe4fb6718ff ev_run (libev.so.4 + 0xa8ff)
#16 0x000055ef3be77d8b n/a (krb5kdc + 0x9d8b)
#17 0x00007fe50171bceb __libc_start_main (libc.so.6 + 0x26ceb)
#18 0x000055ef3be7826a n/a (krb5kdc + 0xa26a)



When a connection is actively refused, that’s usually an indication of a firewall blocking… If the connection failed simply because a service wasn’t running, you wouldn’t get a “connection refused” error; you’d likely see something along the lines of “no answer” or “connection timed out”.

Test to make sure port 135 in the proper zone is open.

TSU

My firewall is currently wide open on my domain controllers. They were working prior to Mar. 1, and I haven’t touched anything in their configuration in a while.

Looking at my update logs, on Mar. 1, ‘zypper dup’ brought in a new set of both krb5 and samba (incl. samba-ad-dc) packages. Exact same configuration as I had before the update, but now it doesn’t work.

As for ‘Connection Refused’, that is what Unix/Linux gives you when nothing is listening on the destination port. Firewalls normally drop packets silently, causing connections to time out. The ‘Connection Refused’ on port 135 here is most likely because Samba is not getting to the point of bringing up the endpoint mapper before failing. I worked on DCE a LOOOOONG time ago (1994/1995) and IIRC, there are a LOT of circular dependencies. If Kerberos is not available, nothing else usually comes up, including things like the endpoint mapper.
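A minimal Python sketch of that distinction, in case anyone wants to check a port themselves. The helper name probe_tcp and the 3-second timeout are made up for illustration; the host and port in the usage line are simply the ones from the log above. It only classifies a single TCP connection attempt and says nothing about why a service is not listening.

```python
import errno
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify one TCP connection attempt (hypothetical helper, not part of Samba)."""
    try:
        # create_connection() performs the TCP handshake and raises on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer answered with a reset: typically nothing is listening,
        # or a firewall is configured to reject rather than drop.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout: typical of silently dropped packets.
        return "timeout"
    except OSError as exc:
        # Anything else (no route, unreachable network, ...).
        return f"error: {errno.errorcode.get(exc.errno, exc.errno)}"


if __name__ == "__main__":
    # Address and port taken from the log in this thread.
    print(probe_tcp("10.10.20.10", 135))
```

If this prints "refused" while the firewall is open, the endpoint mapper is most likely just not listening yet.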

The stack trace indicates it’s failing in a talloc call, which from what I can tell is some sort of memory allocator. That sort of has the smell of a bug to it.
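For reading a trace like the one in the first post, here is a rough Python sketch that pulls out the first frame that is not in libc or libtalloc, i.e. the caller that led to the abort. The function names and the regular expression are my own and assume the systemd-coredump line format shown in this thread; it is a reading aid, not a diagnosis.

```python
import re

# Matches lines such as: "#4  0x00007fe500e8a0ec mit_samba_get_pac (samba.so + 0x60ec)"
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-f]+\s+(\S+)\s+\((\S+) \+ 0x[0-9a-f]+\)$")


def parse_frames(trace: str):
    """Return (index, symbol, module) tuples for every frame line found in the trace."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((int(match.group(1)), match.group(2), match.group(3)))
    return frames


def first_frame_outside(frames, skip_modules):
    """Return the first frame whose module is not in skip_modules, or None."""
    for index, symbol, module in frames:
        if module not in skip_modules:
            return index, symbol, module
    return None


if __name__ == "__main__":
    # The top five frames copied from the trace in this thread.
    sample = """\
#0  0x00007fe501730ea1 raise (libc.so.6 + 0x3bea1)
#1  0x00007fe50171a53d abort (libc.so.6 + 0x2553d)
#2  0x00007fe500b93392 n/a (libtalloc.so.2 + 0x3392)
#3  0x00007fe500b93805 n/a (libtalloc.so.2 + 0x3805)
#4  0x00007fe500e8a0ec mit_samba_get_pac (samba.so + 0x60ec)
"""
    print(first_frame_outside(parse_frames(sample), {"libc.so.6", "libtalloc.so.2"}))
    # (4, 'mit_samba_get_pac', 'samba.so')
```

In this trace that points at mit_samba_get_pac in samba.so as the caller of the talloc routine that aborted.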

This is what I get for living on the bleeding edge and auto-updating all of my VMs nightly. :slight_smile: Fortunately it’s just my home network, and my main working machine is a Tumbleweed box, so I’m not dead in the water. Just hoping there will be an update that gets things back in order. Bind_dlz and libdb2 have caused me pain in the recent past as well. So many moving parts. :slight_smile:

I submitted a bug: https://bugzilla.opensuse.org/show_bug.cgi?id=1165766