PXE boot of an ISO live distribution

Hi all,

I have set up all the bits necessary to boot a diskless system using PXE: the DHCP server in the router points to a TFTP server on a Tumbleweed system. I can boot the Tumbleweed installer fine. I can also boot small ISOs for stand-alone CDs like gparted using memdisk. Here is my pxelinux.cfg/default:

default menu.c32
prompt 0
menu title PXE Boot Menu
    label localboot
      menu label Boot Local Disk
      localboot 0


    label install
      menu label Install openSUSE Tumbleweed
      kernel linux64
      append initrd=initrd64 splash=silent vga=0x314 showopts install=http://download.opensuse.org/factory/repo/oss/


    label gparted
      menu label Gparted ISO
      Kernel memdisk
      append iso initrd=gparted.iso raw

However, what I want to be able to do is boot into a live Linux session just as I would if I had booted from a CD or USB stick. Booting it as an ISO (openSUSE-Tumbleweed-XFCE-Live-x86_64-Current.iso) like I can do for gparted works up to a point and then fails and enters Emergency Mode. It’s also not a nice solution.

Clearly I can extract the kernel and initrd from the ISO, but I am not sure what other parameters are needed on the append line.

Has anyone got any experience of this, or any ideas?

kiwi-live needs a real block device, and a syslinux memdisk can only be accessed via BIOS interfaces, which are no longer available once the kernel is loaded.

Looking at the kiwi-live dracut module, it apparently supports AoE (ATA over Ethernet); you would need to export the ISO image via AoE, pass the additional rd.kiwi.live.pxe option and change root=live:CDLABEL=xxx to root=live:AOEINTERFACE=yyy:

# live images are specified with
# root=live:CDLABEL=label
# root=live:AOEINTERFACE=name

where name in this case is how the AoE target is identified under /dev/etherd on the initiator.

… and all of this is actually documented in https://github.com/OSInside/kiwi/blob/master/doc/source/working_with_images/network_live_iso_boot.rst … I’d be interested in your experience :)

OK, surprisingly good progress very quickly.

Installing vblade was a bit of a battle as the one-click install didn’t work, but I have it installed. I did:

ProgressTMX:/srv/tftpboot/tumbleweed # vbladed 0 1 enp0s3 tumbleweed.iso

and can mount it read only on another system using:


Cumulus:~ # modprobe aoe
Cumulus:~ # mount /dev/etherd/e0.1 /mnt/a/
mount: /mnt/a: WARNING: device write-protected, mounted read-only.
Cumulus:~ # ls -al /mnt/a/
total 10
drwxr-xr-x 1 root root 2048 Apr 24 02:27 .
drwxr-xr-x 9 root root 4096 Mar 23 09:14 ..
drwxr-xr-x 1 root root 2048 Apr 24 02:27 LiveOS
drwxr-xr-x 1 root root 2048 Apr 24 01:41 boot
Cumulus:~ # 

So that all seems to be working.
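
For anyone following along, the link between the vbladed arguments and the device name seen on the initiator can be sketched in shell. This is only a minimal sketch, assuming the usual AoE naming of e<shelf>.<slot> (which matches the "0 1" and e0.1 shown above); the helper names aoe_dev_name and aoe_root_param are made up for illustration and are not part of vblade or kiwi.

```sh
#!/bin/sh
# Sketch: map vbladed's shelf/slot arguments to the AoE device name
# that shows up under /dev/etherd on the initiator.

# aoe_dev_name SHELF SLOT -> e<SHELF>.<SLOT>   (hypothetical helper)
aoe_dev_name() {
    printf 'e%s.%s\n' "$1" "$2"
}

# aoe_root_param SHELF SLOT -> value for the kiwi-live root= option
# (hypothetical helper; the option syntax is the one quoted earlier)
aoe_root_param() {
    printf 'root=live:AOEINTERFACE=%s\n' "$(aoe_dev_name "$1" "$2")"
}

# Shelf 0, slot 1, as in the vbladed call above
aoe_dev_name 0 1
aoe_root_param 0 1
```

With shelf 0 and slot 1 this prints e0.1 and root=live:AOEINTERFACE=e0.1, so the block device to look for on the client would be /dev/etherd/e0.1.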

My … pxelinux.cfg/default has:

    label run_tw
      menu label Run openSUSE Tumbleweed
      kernel tumbleweed/linux
      append initrd=tumbleweed/initrd rd.kiwi.live.pxe root=live:AOEINTERFACE=e0.1

However the booting system appears to get off to a good start and then hangs with a line:

[   19.789...] sd 0:0:0:0: [sda] Synchronizing SCSI cache

… I’ll keep going and see where I get to …

OK, this is what I am getting. It looks like it all falls over with a network issue (OCR of a screenshot, so it may not be perfect!)

[   16.045106] e1000 0000:00:03.0 lan0: renamed from eth0
[   16.086860] input: VirtualBox USB Tablet as /devices/pci0000:00/0000:00:06.0/usb2/2-1/2-1:1.0/0003:80EE:0021.0001/input/input?
[   16.094646] 8021q: 802.1Q VLAN Support v1.8
[   16.118332] hid-generic 0003:80EE:0021.0001: input,hidraw0: USB HID v1.10 Mouse [VirtualBox USB Tablet] on usb-0000:00:06.0-1/input0
[  OK  ] Finished Wait for udev To Complete Device Initialization.
[  OK  ] Reached target System Initialization.
[  OK  ] Reached target Basic System.
[   16.362540] dracut-initqueue[611]: /lib/kiwi-live-lib.sh: line 75: /tmp/net.bootdev: No such file or directory
[   16.429139] dracut: FATAL: Network setup failed, see tmp/net.info
[   16.431714] dracut: Refusing to continue
[   16.551777] systemd-shutdown[1]: Syncing filesystems and block devices.
[   16.555219] systemd-shutdown[1]: Sending SIGTERM to remaining processes...
[   16.569168] systemd-journald[152]: Received SIGTERM from PID 1 (systemd-shutdow).
[   16.624959] systemd-shutdown[1]: Sending SIGKILL to remaining processes...
[   16.633942] systemd-shutdown[1]: Unmounting file systems.
[   16.637800] [625]: Remounting '/var/lib/nfs/rpc_pipefs' read-only in with options '(null)'.
[   16.641937] [626]: Unmounting '/var/lib/nfs/rpc_pipefs'.
[   16.645638] [627]: Unmounting '/run/overlay'.
[   16.649304] [628]: Remounting '/' read-only in with options '(null)'.
[   16.652673] systemd-shutdown[1]: All filesystems unmounted.
[   16.655356] systemd-shutdown[1]: Deactivating swaps.
[   16.658105] systemd-shutdown[1]: All swaps deactivated.

Could be

https://bugzilla.opensuse.org/show_bug.cgi?id=1182227

Getting /tmp/net.info would certainly be useful (if you add rd.shell option dracut should spawn interactive shell instead of rebooting).
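
Once the rd.shell prompt appears, something along these lines could be used to print the file. This is only a minimal sketch: dump_file is a made-up helper, not a dracut command, and it assumes nothing more than a POSIX shell and cat being available in the initrd.

```sh
#!/bin/sh
# Sketch for the dracut emergency shell: print a diagnostic file with a
# header, or say clearly that it is missing.

# dump_file PATH   (hypothetical helper, not part of dracut)
dump_file() {
    if [ -r "$1" ]; then
        echo "== $1 =="
        cat "$1"
    else
        echo "== $1: not present =="
    fi
}

# The file named in the FATAL message
dump_file /tmp/net.info
```

The "not present" branch is there because a missing file is itself useful information when reporting the problem.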

Indeed that option worked and I got a minimal shell.

/tmp/net.info is:

/lib/kiwi-lib.sh: line 78: ifup: command not found

and this is the code from /lib/kiwi-lib.sh starting at line 70 (again OCRed, so there may be minor errors):

if getargbool 0 rd.kiwi.live.pxe; then
    local bootdev
    read -r bootdev < /tmp/net.bootdev 2>/dev/null
    if [ -n "${bootdev}" ] && [ -e /tmp/net."${bootdev}".did-setup ]; then
        : # already set up
    elif ! ifup lan0 &>/tmp/net.info; then
        die "Network setup failed, see tmp/net.info"
    fi
    modprobe aoe
    isodev="${1#live:aoe:}"
else
    isodev="$1"
fi
local isodisk

Not sure if it is the same bug but looks like it is related.

Yes, that is what I expected.

“Not sure if it is the same bug”

No, it is different. Today there are three networking backends (modules) in dracut: network-legacy, network-wicked and network-manager. Apparently dracut prefers network-wicked if it detects wicked, so the initrd in the Live image comes with the network-wicked backend. The “ifup” command is provided only by the network-legacy backend.

So either kiwi-live must be modified to support an arbitrary backend, or dracut must be modified to offer common tools to request and check for interface configuration.

Maybe kiwi-live is over-engineered; it could basically just load aoe and check that the requested device is available, leaving network configuration to whatever backend is present.
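
A rough shell sketch of that simpler approach is below. It is hypothetical and not kiwi's actual code: wait_for_path is a made-up helper, and the 30 second timeout and the e0.1 device name are assumptions taken from this thread. It tests with -e so it can be tried on any path; inside the initrd, -b (block device) would be the stricter check.

```sh
#!/bin/sh
# Hypothetical sketch, not kiwi's code: leave network setup to whichever
# dracut backend is present and only wait for the AoE device node.

# wait_for_path PATH TIMEOUT_SECONDS
# Returns 0 as soon as PATH exists, 1 if it is still missing after the timeout.
wait_for_path() {
    path=$1
    remaining=$2
    while [ ! -e "$path" ]; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Intended use inside the initrd (needs root and the aoe module, so it is
# left commented out here):
#   modprobe aoe
#   wait_for_path /dev/etherd/e0.1 30 || echo "AoE device did not appear" >&2
```

Polling like this avoids calling ifup at all, so it would not depend on which network backend ended up in the initrd.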

You should file an issue on github for kiwi.

Hi arvidjaar,

The path you suggested involved ATA Over Ethernet to present a block device to kiwi-live. As we see this fails to bring up the network. Is there a possibility of using NBD instead or is that simply going to substitute a different mechanism for the transport of the block device to then fall over in the same way as using AOE?

Thanks.

Hi arvidjaar,

OK, I have downloaded your ISO but am having a few problems.

From the PXE/TFTP server I have exported the ISO with the following:

vbladed 0 1 enp0s3 /srv/tftpboot/tumbleweed/tumbleweed.iso

I can see it on my main server:

Cumulus:~ # ls -al /dev/etherd/
total 0
drwxr-xr-x  2 root root    200 May 13 07:54 .
drwxr-xr-x 19 root root   4500 May 13 07:52 ..
c-w--w----  1 root disk 152, 3 May 13 07:52 discover
brw-rw----  1 root disk 152, 0 May 13 07:54 e0.1
brw-rw----  1 root disk 152, 1 May 13 07:54 e0.1p1
brw-rw----  1 root disk 152, 2 May 13 07:54 e0.1p2
cr--r-----  1 root disk 152, 2 May 13 07:52 err
c-w--w----  1 root disk 152, 6 May 13 07:52 flush
c-w--w----  1 root disk 152, 4 May 13 07:52 interfaces
c-w--w----  1 root disk 152, 5 May 13 07:52 revalidate

However, my VM, which is set to network boot, falls over reporting that it can’t mount the ISO:

   13.626282] localhost dracut-initqueue[634]: kiwi_type='iso'
   13.626282] localhost dracut-initqueue[634]: kiwi_vga=''
   13.626282] localhost dracut-initqueue[634]: kiwi_wwid_wait_timeout=''
   13.714828] localhost kernel: squashfs: version 4.0 (2009/01/31) Phillip Lougher
   13.732216] localhost kernel: aoe: AoE v85 initialised.
   13.721042] localhost dracut-initqueue[670]: mount: /run/overlay/live: unknown filesystem type ''.
   13.797602] localhost dracut: FATAL: Failed to mount live ISO device
   13.799883] localhost dracut: Refusing to continue
   13.732249] localhost dracut-initqueue[664]: Warning:
   13.753627] localhost systemd[1]: Starting Dracut Emergency Shell...
   13.782340] localhost systemd[1]: Received SIGRTMIN+21 from PID 589 (plymouthd).
   13.797560] localhost systemd[1]: Received SIGRTMIN+21 from PID 589 (n/a).

When I drop into the maintenance shell I don’t see the drive listed in /dev/etherd on the VM. All I see is:

c-w--w----  1 root disk 152, 3 May 13 07:31 discover
cr--r-----  1 root disk 152, 2 May 13 07:31 err
c-w--w----  1 root disk 152, 6 May 13 07:31 flush
c-w--w----  1 root disk 152, 4 May 13 07:31 interfaces
c-w--w----  1 root disk 152, 5 May 13 07:31 revalidate

I took the initrd and linux files from /boot/x86_64/loader in your ISO.

Any suggestions?

Is the network up in dracut?

Sorry, should have thought of that. No!

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: lan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 08:00:27:ff:47:62 brd ff:ff:ff:ff:ff:ff

As I already replied on GH, you are likely hitting bug 1182227 (“combustion [# combustion: network] does not work”). Try adding “ip=lan0:dhcp” to the kernel command line.
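
For illustration, a menu entry with that option added could be generated as below. This is only a sketch: pxe_entry is a made-up helper, and the label, paths and AoE name are simply reused from the entry posted earlier in the thread rather than being a tested configuration.

```sh
#!/bin/sh
# Sketch: print a pxelinux menu entry with extra kernel options appended.

# pxe_entry LABEL TITLE KERNEL INITRD OPTIONS...   (hypothetical helper)
pxe_entry() {
    label=$1
    title=$2
    kernel=$3
    initrd=$4
    shift 4
    printf '    label %s\n' "$label"
    printf '      menu label %s\n' "$title"
    printf '      kernel %s\n' "$kernel"
    # "$*" joins the remaining options with single spaces
    printf '      append initrd=%s %s\n' "$initrd" "$*"
}

# Same entry as before, with ip=lan0:dhcp added at the end
pxe_entry run_tw "Run openSUSE Tumbleweed" tumbleweed/linux tumbleweed/initrd \
    rd.kiwi.live.pxe root=live:AOEINTERFACE=e0.1 ip=lan0:dhcp
```

The output can be pasted into pxelinux.cfg/default; the only change from the earlier entry is the extra ip=lan0:dhcp on the append line.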

There is a long pause in the boot sequence while the wicked services for IPv4 and IPv6 are started. Then it drops into the emergency shell again. The network comes up just as the system stops, and 10 seconds or so later the AoE devices become available. It all feels like a timing issue: the network is still initializing and things are just not ready when the attempt is made to mount the ISO.

Later: scanning the log I noticed lots of lines about the network adapter just before it tried to mount the ISO, so I took a punt and simply changed the virtualized network card in VirtualBox (VirtualBox 6.1.2 running on a Win10 host) from “Intel PRO/1000 MT Desktop (82540EM)” to “PCnet-PCI II (Am79C970A)” and it boots into a full GUI!

Happy to run any further tests that anyone wants.

Tested all the different virtualized network interfaces as follows:

Intel PRO/1000 MT Desktop (82540EM)
Intel PRO/1000 T Server (82543GC)
Intel PRO/1000 MT Server (82545EM)

All fail in the same way, with the network coming up just after the mount fails.

AMD PCNet FAST III (Am79C973)

Also fails, but earlier on in the boot.

Paravirtualized network adapter

Network boot does not even recognise it!

AMD PCNet PCI II (Am79C970A)

Works.

I also have logs of both the working and failing boot sequences if anyone wants them.

Is it worth a bug report (for VirtualBox)?