Obtaining IP Addresses with nsswitch

Hi all,

I have set up a KVM host with two guest VMs. I want to connect from the host to the guests by name, but name resolution does not seem to work.

I have followed section 12.1.1.5 Obtaining IP Addresses with nsswitch for NAT Networks (in KVM) in the Virtualization Guide:

  • install package libvirt-nss (with YaST, not with zypper)
  • Add libvirt to the hosts line in /etc/nsswitch.conf
  • restart nscd (I also rebooted, just in case)
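One way to double-check the second step is to print the sources on the hosts line in lookup order. This is only a minimal sketch, assuming bash and awk; the helper names `hosts_sources` and `has_libvirt_source` are made up for illustration, and `libvirt_guest` is included on the assumption that the alternative module name from libvirt-nss may be in use instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the sources on the "hosts:" line of an
# nsswitch.conf-style file, one per line, in lookup order.
hosts_sources() {
    local conf="${1:-/etc/nsswitch.conf}"
    # Take the first uncommented "hosts:" line, drop the key, split on blanks.
    awk '$1 == "hosts:" { for (i = 2; i <= NF; i++) print $i; exit }' "$conf"
}

# Hypothetical helper: succeed if "libvirt" (or "libvirt_guest") is listed.
has_libvirt_source() {
    hosts_sources "$1" | grep -qxE 'libvirt(_guest)?'
}
```

Running `hosts_sources` without arguments reads /etc/nsswitch.conf; a lookup that actually goes through those sources would then be `getent hosts guest1`.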

But that does not seem to work. If I do on the host:

 # ping guest1
ping: guest1: Name or service not known

Both guests can ping each other by name. The host can ping both guests by IP.

nslookup seems to indicate that libvirt is not actually being used for name lookup:


# nslookup guest1
Server:         xxx.xxx.xxx.21
Address:        xxx.xxx.xxx.21#53

** server can't find guest1: NXDOMAIN

xxx.xxx.xxx.21 is the primary DNS configured in Network Settings for the host.

If I instruct nslookup to query the virtual DNS server explicitly, the name lookup succeeds:


# nslookup guest1 192.168.122.1
Server:         192.168.122.1
Address:        192.168.122.1#53

Name:   guest1
Address: 192.168.122.201

This corresponds to the information in the file /var/lib/libvirt/dnsmasq/virbr0.status :
"ip-address": "192.168.122.201",
"hostname": "guest1"

"ip-address": "192.168.122.202",
"hostname": "guest2"
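For a quick check of what the status file records for a given guest, something like the following could be used. It is a rough sketch, not a real JSON parser: it assumes the file is pretty-printed with one "key": "value" pair per line, and the helper name `lease_ip_for` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the IP address recorded for a hostname in a
# libvirt dnsmasq status file. Assumes one "key": "value" pair per line.
lease_ip_for() {
    local name="$1" status="${2:-/var/lib/libvirt/dnsmasq/virbr0.status}"
    awk -v want="$name" '
        # An opening brace starts a new lease object: forget the old values.
        /[{]/ { ip = ""; host = "" }
        # The value is the 4th field when the line is split on double quotes.
        /"ip-address"/ { split($0, f, "\""); ip = f[4] }
        /"hostname"/   { split($0, f, "\""); host = f[4] }
        # A closing brace ends the object: report it if the name matches.
        /[}]/ { if (host == want && ip != "") print ip }
    ' "$status"
}
```

Called as `lease_ip_for guest1`, it prints whatever address the file currently holds for that hostname, or nothing if there is no such entry.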

The virtual network is the default virbr0 NAT-ed network created automatically by YaST.

The host network is configured with wicked, static IP, primary and secondary DNS, and the network interface is a bond of eth0 and eth1.

I’ve played a little with the setting of NETCONFIG_DNS_STATIC_SERVERS in the sysconfig editor, but to no avail.

Any ideas are much appreciated,
Uos

You’ve run into a fundamental concept that applies to all virtualization technologies…
The HostOS and its Guests cannot discover each other’s names without external help.
This applies no matter what kind of network connection your Guest is using (bridging, NAT, Host-only).

As you’ve discovered on your own,
You need to set up a DNS server and populate a zone with names for the HostOS and Guest(s).

If you already have a DNS server set up for AD or LDAP, that’d be good enough.
If you’re setting up a Workgroup instead, you’re probably not used to the idea of running a LAN DNS, but it’s necessary to ensure name resolution.

And an FYI…
Although you can set up nsswitch, it’s not required except in very rare scenarios.
It’s so rare that I can’t even remember the last time I set it up for one of my own networks.
You should be able to do 99% of your virtual switching without nsswitch…

TSU

So show this line.

nslookup seems to indicate that libvirt is not actually being used for name lookup

nslookup does not use nsswitch at all; it is a pure DNS application that only does DNS lookups.
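To see the difference side by side, the same name can be resolved once through NSS and once through plain DNS. A small sketch, assuming bash; `compare_lookup` is a hypothetical wrapper, while `getent hosts` and `nslookup` are the ordinary tools:

```shell
#!/usr/bin/env bash
# Hypothetical helper: resolve a name two ways and print both results.
compare_lookup() {
    local name="$1" nss dns
    # getent goes through the NSS "hosts" sources (files, libvirt, dns, ...).
    nss=$(getent hosts "$name" 2>/dev/null) || nss="(not found)"
    echo "NSS (getent):   $nss"
    # nslookup bypasses NSS and queries the configured DNS servers directly.
    if command -v nslookup >/dev/null 2>&1; then
        dns=$(nslookup "$name" 2>&1 | tail -n 3)
    else
        dns="(nslookup not installed)"
    fi
    echo "DNS (nslookup): $dns"
}
```

If `compare_lookup guest1` shows an address on the first line but NXDOMAIN on the second, the libvirt NSS module is working and only the DNS servers do not know the guest.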

Any ideas are much appreciated

Start with basic troubleshooting steps described in libvirt wiki.

hosts: files libvirt mdns_minimal [NOTFOUND=return] dns

Therefore I describe my problem as “ping guest1 doesn’t work”.
The output of nslookup might be handy information to determine what is going on.

I’ve looked through all the items in

but I fail to see my problem described.

I appreciate the effort, but can you please start with what the manual describes in section [12.1.1.5 Obtaining IP Addresses with nsswitch for NAT Networks (in KVM)](https://doc.opensuse.org/documentation/leap/virtualization/html/book.virt/cha.libvirt.networks.html#libvirt.networks.virtual.vmm.nsswitch).

It describes exactly what I need: “reach the guest system by name from the host.”

Either the instructions in that section work for you, which means the manual is correct and the problem is then my computer.
Or the instructions in that section don’t work for you either, which means the manual is incomplete.

Thanks for the help
Uos

What makes you think I did not? The manual links to a libvirt page that provides hints on how to check this information from the libvirt side, without using the NSS module. You need to be sure libvirt actually sees this information before deciding where to look further.

I see what is happening in your case.
When you perform any name resolution on your HostOS, it uses your normal networking setup, which points to the nameserver listed in /etc/resolv.conf.

But,
For all libvirt operations,
name resolution points to dnsmasq.
That is why your guests can see each other and should be able to see the HostOS (if the appropriate entry exists).

Question,
Can your guests resolve addresses that are not on your Host, e.g. on the Internet?
That will tell you whether dnsmasq is configured to forward requests it can’t resolve to an external DNS server.
If that works, then maybe you can test pointing your HostOS DNS configuration to your dnsmasq as well.

TSU

You didn’t say whether you did or not, and the hints you are giving don’t solve my problem.

When I run virsh net-dhcp-leases default on the host I get:

2018-08-23 12:33:00   54:52:be:13:05   ipv4   192.168.122.200/24   guest1

I’ve turned off guest2.

Uos

ping google.com works from the guest.

Uos

Did you try the next steps I suggested if this works?

TSU

You skipped the second, most interesting command. At this point I would suggest - stop nscd and run “strace -f getent hosts guest1”. This may provide some clues to what happens. Do it without nscd to force getent itself to call NSS modules (otherwise we’ll see just calls to nscd).
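Since the strace output is long, it may help to filter it down to the lines that show which lookup path was taken. A minimal sketch, assuming bash and grep; `nss_trace_summary` is a hypothetical name, and stopping nscd with `systemctl` is an assumption about how the service is managed on the host:

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only the strace lines that show which lookup
# path was taken (nscd socket, /etc/hosts, nsswitch.conf, libvirt lease
# files) plus the final exit status line.
nss_trace_summary() {
    grep -E 'nscd/socket|/etc/hosts|/etc/nsswitch\.conf|libvirt/dnsmasq|exited with'
}

# Intended use on the host, as root (assumes nscd is a systemd service):
#   systemctl stop nscd
#   strace -f getent hosts guest1 2>&1 | nss_trace_summary
```

strace writes to stderr, hence the `2>&1` before the pipe.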

I understood that it was equivalent to the first command. For completeness:

virsh domifaddr --source lease guest1

 Name       MAC address          Protocol     Address
-------------------------------------------------------------------------------
 vnet0      54:52:00:be:13:05    ipv4         192.168.122.200/24

With nscd stopped, the output of “strace -f getent hosts guest1” is quite long, but it ends in:


stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=861, ...}) = 0
open("/etc/hosts", O_RDONLY|O_CLOEXEC)  = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=662, ...}) = 0
read(3, "#\n# hosts         This file desc"..., 4096) = 662
read(3, "", 4096)                       = 0
close(3)                                = 0
open("/var/lib/libvirt/dnsmasq/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=140, ...}) = 0
getdents(3, /* 7 entries */, 32768)     = 232
access("/var/lib/libvirt/dnsmasq//virbr0.macs", F_OK) = 0
openat(AT_FDCWD, "/var/lib/libvirt/dnsmasq//virbr0.macs", O_RDONLY) = 4
read(4, "[\n  {\n    \"domain\": \"llvm\",\n    "..., 8192) = 80
read(4, "", 8112)                       = 0
close(4)                                = 0
openat(AT_FDCWD, "/var/lib/libvirt/dnsmasq//virbr0.status", O_RDONLY) = 4
read(4, "[\n  {\n    \"ip-address\": \"192.168"..., 8192) = 145
read(4, "", 8047)                       = 0
close(4)                                = 0
getdents(3, /* 0 entries */, 32768)     = 0
close(3)                                = 0
fstat(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 2), ...}) = 0
write(1, "192.168.122.200 guest1\n", 23192.168.122.200 guest1
) = 23
exit_group(0)                           = ?
+++ exited with 0 +++

which seems to confirm that after files (/etc/hosts) indeed libvirt (/var/lib/libvirt/dnsmasq/virbr0.macs) is being used, and the correct IP address is extracted.

Uos

I have added 192.168.122.1 as the third DNS server in the Network Settings of YaST, but that doesn’t help.

Uos

If you have multiple DNS servers listed, you may have to put your dnsmasq address first to avoid various undesirable scenarios.

TSU

So is it correct that without nscd the correct IP address is resolved for the guest? Does the same getent command still return the correct address if you start nscd again?

Yes, getent resolves the correct IP address.

No. For reference, with nscd stopped (the connection to its socket is refused), getent shows:


...<approx 100 lines>
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ECONNREFUSED (Connection refused)
close(3)                                = 0
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ECONNREFUSED (Connection refused)
close(3)                                = 0
open("/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 3
...<approx 150 lines>
+++ exited with 0 +++

With nscd running, those first approx 100 lines are nearly identical (different memory addresses), but the end is different:


...<approx 100 lines>
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = 0
sendto(3, "\2\0\0\0\r\0\0\0\6\0\0\0hosts\0", 18, MSG_NOSIGNAL, NULL, 0) = 18
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}])
recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="hosts\0", iov_len=6}, {iov_base="\310O\3\0\0\0\0\0", iov_len=8}], msg_iovlen=2, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[4]}], msg_controllen=20, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 14
mmap(NULL, 217032, PROT_READ, MAP_SHARED, 4, 0) = 0x7fd55dbac000
close(4)                                = 0
close(3)                                = 0
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = 0
sendto(3, "\2\0\0\0\5\0\0\0\7\0\0\0guest1\0", 19, MSG_NOSIGNAL, NULL, 0) = 19
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 5000) = 1 ([{fd=3, revents=POLLIN|POLLHUP}])
read(3, "\2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\377\377\377\377\377\377\377\377\0\0\0\0\1\0\0\0", 32) = 32
close(3)                                = 0
stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=861, ...}) = 0
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = 0
sendto(3, "\2\0\0\0\4\0\0\0\7\0\0\0guest1\0", 19, MSG_NOSIGNAL, NULL, 0) = 19
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 5000) = 1 ([{fd=3, revents=POLLIN|POLLHUP}])
read(3, "\2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\377\377\377\377\377\377\377\377\0\0\0\0\1\0\0\0", 32) = 32
close(3)                                = 0
exit_group(2)                           = ?
+++ exited with 2 +++

Kind regards,
Uos

With 192.168.122.1 as the first DNS server in Network Settings, ping guest1 works and ping google.com works too.

I’m not sure that setting the first DNS server to a virtual DNS server won’t create undesirable scenarios in itself, but for now, this seems to be the simplest workaround.
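To confirm which server ends up first after changing the Network Settings, the generated resolver file can be inspected. A small sketch, assuming bash and awk and that the setting is written to /etc/resolv.conf; `first_nameserver` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first "nameserver" entry of a
# resolv.conf-style file (commented-out entries are ignored).
first_nameserver() {
    local conf="${1:-/etc/resolv.conf}"
    awk '$1 == "nameserver" { print $2; exit }' "$conf"
}

# Example check, with 192.168.122.1 being the virtual DNS server:
#   [ "$(first_nameserver)" = "192.168.122.1" ] && echo "dnsmasq is listed first"
```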

Thanks
Uos

Importance of the first DNS entry:

For any later DNS entry to be tried, the earlier entries have to be tried first, and fail. If the upstream DNS cannot be contacted at all (typically a network problem), then failover can happen quickly. But if the upstream DNS simply doesn’t have a cached result, then that DNS will query its own upstream DNS, and this will be repeated until there is a timeout. With most common settings, this can take 2 minutes.

Of course, you’re not going to wait 2 minutes for each DNS entry to fail before your local dnsmasq entry is tried.

TSU