DNS doesn't resolve

Seems like a new thread is better, but I have the same problem as described in the LEAP 15.1 E.o.L DNS doesn’t resolve thread where I first added my experiences.

I upgraded my father’s machine from 15.2 to 15.3 today, and the upgrade failed halfway because it could not find a certain file.

After a reboot a login screen came up, but it was not possible to log in, even as root: the password was accepted, but after one second the login screen came back.
So I went back to the virtual console, where I could log in as root, and saw that the system was already reporting itself as 15.3. I tried zypper up/dup, but those fail on DNS.

Then I did some debugging and ran into the same problem as in the thread mentioned earlier: dig can resolve the address, but ping reports that it cannot resolve any domain, and zypper has the same problem. I checked the DNS servers: they could be pinged, and dig @<server-address> showed that they were working (I even tried 9.9.9.9). On the command line the result stays the same: dig works, but ping does not resolve the host.
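The difference between dig and ping can be reproduced without either tool. dig sends its queries straight to a DNS server, while ping (like most other programs) asks the C library's getaddrinfo(), which goes through /etc/nsswitch.conf and, if it is running, nscd. A minimal sketch in Python, whose socket.getaddrinfo() wraps the same C library call; the helper name resolve_via_libc is made up for this example:

```python
import socket


def resolve_via_libc(name):
    """Resolve a name the way ping does: through the C library's getaddrinfo().

    Returns a sorted list of addresses, or an error string if the lookup fails.
    """
    try:
        # proto is fixed so each address is reported once instead of once
        # per socket type.
        infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror as err:
        return "lookup failed: " + str(err)
    # info[4] is the socket address tuple; its first element is the address.
    return sorted({info[4][0] for info in infos})
```

Calling resolve_via_libc("nu.nl") on the broken machine should fail in the same way as ping, while dig against the same name keeps working.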

One more thing I tried was adding the entries to /etc/hosts, but even that did not work.

I did check /etc/nsswitch.conf and it had (just like my running Tumbleweed system):

hosts: files mdns_minimal [NOTFOUND=return] dns

With files listed first, I would have expected static entries added to /etc/hosts to work, but they did not resolve either.
I also tried “hosts: files dns”, but that did not work either.
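To make the lookup order in that hosts line explicit, here is a small sketch that splits it into the services the C library tries, in order, together with any bracketed action attached to a service. The function name parse_hosts_line is hypothetical, and the parser only handles single-criterion actions such as [NOTFOUND=return]:

```python
def parse_hosts_line(line):
    """Split an nsswitch.conf 'hosts:' line into (service, action) pairs.

    The pairs are returned in lookup order. Only single-criterion actions
    without spaces, such as [NOTFOUND=return], are handled in this sketch.
    """
    database, _, rest = line.partition(":")
    if database.strip() != "hosts":
        raise ValueError("not a hosts line: " + line)
    steps = []
    for token in rest.split():
        if token.startswith("["):
            if not steps:
                raise ValueError("action without a preceding service")
            # An action applies to the service listed just before it.
            service, _ = steps[-1]
            steps[-1] = (service, token.strip("[]"))
        else:
            steps.append((token, None))
    return steps


print(parse_hosts_line("hosts: files mdns_minimal [NOTFOUND=return] dns"))
```

For the line quoted above this gives files first, then mdns_minimal with NOTFOUND=return, then dns, which is why an entry in /etc/hosts should normally win before DNS is ever asked.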

Does anybody have an idea which piece of software is responsible for resolving? I checked systemd, but systemd-resolved is neither running nor installed, and I do not know whether it is needed.

If “nscd” is running, then I think it does the resolving. Otherwise it is handled by libraries.

This weekend I did some further debugging and ran strace. Comparing the output with a good run on a working install, I see that the problem is nscd not returning the information needed.

socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 5
connect(5, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = 0
sendto(5, "\2\0\0\0\16\0\0\0\6\0\0\0nu.nl\0", 18, MSG_NOSIGNAL, NULL, 0) = 18
poll([{fd=5, events=POLLIN|POLLERR|POLLHUP}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}])
read(5, "\2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 24) = 24
close(5)                                = 0
write(2, "ping: nu.nl: Name or service not"..., 39) = 39
exit_group(2)                           = ?
+++ exited with 2 +++
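The bytes in the sendto() and read() calls of the trace can be decoded by hand. The sketch below assumes the nscd wire format as I understand it from glibc's nscd-client.h: a request header of three 32-bit integers (version, type, key length) followed by the key, and for address lookups a reply header of six 32-bit integers. The field names, the little-endian byte order (x86) and the reading of type 14 as the getaddrinfo request are assumptions on my side, not something the trace itself states.

```python
import struct

# Byte strings copied from the sendto() and read() calls in the trace.
REQUEST = b"\2\0\0\0\16\0\0\0\6\0\0\0nu.nl\0"
REPLY = b"\2" + b"\0" * 23  # the 24 bytes returned by read()


def decode_request(data):
    """Decode the assumed request header: version, type, key length, then key."""
    version, req_type, key_len = struct.unpack_from("<3i", data)
    key = data[12:12 + key_len].rstrip(b"\0").decode()
    return {"version": version, "type": req_type, "key_len": key_len, "key": key}


def decode_ai_reply(data):
    """Decode the assumed reply header for an address lookup (six int32 fields)."""
    names = ("version", "found", "naddrs", "addrslen", "canonlen", "error")
    return dict(zip(names, struct.unpack_from("<6i", data)))


print(decode_request(REQUEST))
print(decode_ai_reply(REPLY))
```

Read this way, the client asks nscd for the addresses of nu.nl, and nscd answers with found = 0 and zero addresses. In other words, nscd itself reports "not found", and the lookup ends there instead of moving on to /etc/hosts or DNS.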

Instead of debugging that, I booted the computer from a Leap 15.3 image on a USB stick from some months ago. My plan was to do a new install, but then I saw the option to do an upgrade and tried that. It worked out very well; the only problem after the upgrade was that Thunderbird did not want to work with the old profile data. After running “zypper up” twice, Leap was up to date and Thunderbird was happy again with the old profile data.

So problem solved although I still would have liked to know why nscd was not working.

The Name Service Cache Daemon cache?


 # LANG=C nscd --help
  -i, --invalidate=TABLE     Invalidate the specified cache

Supported tables:
passwd group hosts services netgroup

The nscd statistics may help to indicate what’s going on – “ # nscd --statistics” – the “hosts cache:” is possibly what you could have inspected.

Looking at the output of “sudo nscd --statistics” on my own running Tumbleweed system, I doubt it would have helped to find the root cause, but it is certainly something that is good to check.

The next question would likely have been why nscd could not resolve DNS. I guess my starting point would be to kill the running nscd and start it from the command line with the --debug option, so that it runs in the foreground and displays messages on the TTY. As it was not my own computer and somebody was waiting for it to work again, I was happy that the upgrade succeeded.

Hi
Are you using NIS, NIS+ or LDAP? If not, remove nscd; it is likely a red herring. It’s not installed or used here…

If it is activated, turn off nscd. In my experience the default DNS configuration is the biggest annoyance: https://forums.opensuse.org/showthread.php/565929-Random-DNS-problem-on-Tumbleweed?p=3111967#post3111967

Yes, but this assumes that the “ypbind” package is installed and that the systemd “ypbind” service is enabled and running.

  • Possibly, this Red Hat bug report may shed some light on what’s going on – 449126 – nscd does not follow NIS server binding changes.
    A known “nscd” behaviour is, if the NIS domains are changed without restarting the Name Service Cache Daemon, the new entries will take quite some time – possibly hours – to appear in the cache …

Hi
But is the Original Poster :wink:

@marel:

I meant to point my comment to Malcom’s reply to you and, that’s what I’m now doing … :wink:

Neither my father’s machine nor mine is running anything fancy: only DNS, and certainly not NIS, NIS+ or LDAP. I am running dnscrypt-proxy without problems.

Yes, I could kill nscd, but things have always worked for me, and the problem on my father’s machine was caused by an update going bad, which was solved once the upgrade finished, so I currently see no reason to kill it.

Yesterday I had a look at the Anatomy of a Linux DNS Lookup series and concluded that things I thought were simple are in fact quite complex. I still wonder how much time it would have cost to find the root cause of the problem.