Issues with bond interfaces after reboot

Hey all, new here!

At work, we have SuperMicro servers running a Ceph cluster that we have been working on getting up and running. This issue isn’t specific to Ceph, though; it’s a more general networking question. I’m just asking whether anyone else has had these issues with bonding and getting the network up after a reboot.

The nodes were all configured with bond interfaces during install with YaST, but after a reboot the nodes have trouble getting the network back up. We have 7 OSD nodes, 3 MON nodes and 4 GW nodes. All of the OSDs get the bond interfaces defined and an IP address set, but never get a default gateway route. The MONs get the bond interfaces defined but never get an IP address, and consequently no default gateway route either. The GWs get the bond interfaces defined and an IP address set, but no default route.
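For anyone comparing notes, the symptoms above can be checked on a node with standard iproute2 commands. This is only a minimal sketch: the helper name has_default_route is made up for illustration, and bond0 is simply the bond name used in this thread.

```shell
# Succeeds if the routing table text on stdin (as printed by `ip route show`)
# contains a default route. The function name is hypothetical.
has_default_route() {
    grep -q '^default '
}

# Usage on a live node (not run automatically here):
#   ip -br addr show bond0        # does the bond have an IP address?
#   ip route show | has_default_route \
#       && echo "default route present" || echo "no default route"
#   cat /proc/net/bonding/bond0   # bonding mode and slave status
```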

Here’s the hardware in question:
MONs
3x SuperServer 1019S-WR
3x Supermicro 2-Port SFP28 Broadcom 10/25GbE LAN card PCI-e LP
bond0 slaves: eth0 & eth3, mode=802.3ad

GWs
4x Supermicro 1018R-WC0R
8x Supermicro 2-Port SFP28 Broadcom 10/25GbE LAN card PCI-e LP
bond0 slaves: eth0 & eth5, mode=802.3ad
bond1 slaves: eth3 & eth4, mode=802.3ad

OSDs
7x Supermicro 6049P-E1CR36L
14x Supermicro 2-Port SFP28 Broadcom 10/25GbE LAN card PCI-e LP
bond0 slaves: eth0 & eth5, mode=802.3ad
bond1 slaves: eth3 & eth4, mode=802.3ad

Thanks in advance!

Just wanted to let you know that I managed to solve the issue! I can’t say I fully understand why, though, but meh!

I had defined tagged VLAN interfaces in files such as ‘ifcfg-vlanXXX’:

BOOTPROTO='static'
BROADCAST=''
ETHERDEVICE='bond0'
ETHTOOL_OPTIONS=''
IPADDR='XXX.XXX.XXX.XXX/XX'
MTU=''
NAME=''
NETWORK=''
REMOTE_IPADDR=''
STARTMODE='off'
VLAN_ID='XXX'

These used the same IP addresses as the bonds that are now configured on access ports. I never thought this would be disruptive because, after we decided to use access ports instead of tagged VLANs, I made sure to mark these interfaces with STARTMODE='off'. But as soon as I removed these interface definitions, the default gateway route got set as intended.
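In case it helps someone else spot the same kind of leftover, here is a rough sketch of how duplicate addresses across ifcfg files could be listed. The function name find_duplicate_ipaddr is invented for this example, and it assumes the usual /etc/sysconfig/network location and plain IPADDR= lines (suffixed variants such as IPADDR_1 are not covered).

```shell
# List IPADDR values that appear in more than one ifcfg-* file.
# The directory defaults to the sysconfig network directory (an assumption);
# pass another directory as the first argument to check elsewhere.
find_duplicate_ipaddr() {
    dir="${1:-/etc/sysconfig/network}"
    # Collect every IPADDR= value, strip the quotes, drop empty values,
    # then print only the values that occur more than once.
    grep -h '^IPADDR=' "$dir"/ifcfg-* 2>/dev/null \
        | sed -e 's/^IPADDR=//' -e "s/['\"]//g" \
        | grep -v '^$' \
        | sort | uniq -d
}
```

Running find_duplicate_ipaddr with no arguments prints each address that is claimed by more than one file, regardless of STARTMODE; `grep -l 'ADDRESS' /etc/sysconfig/network/ifcfg-*` (with the real address substituted) then shows which files those are.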

I don’t know that a VLAN tag should have anything to do with setting a default gateway. I’m guessing that there might be only one other machine with the same VLAN tag with properly configured network settings?

The couple of times I’ve configured bonded interfaces, I didn’t use YaST, so I don’t have direct experience with your situation, but…
I do notice that you’re configuring static IP addresses.
Did you configure the routing tab in YaST to set your default gateway (DG)?
Alternatively, I’d expect that setting your DG as an entry in the interface’s configuration file should also work.
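As a sketch of what such an entry might look like: on openSUSE/SLES the static routes normally live in /etc/sysconfig/network/routes (or a per-interface ifroute-<interface> file), one route per line in the column order destination, gateway, netmask, interface. The helper name default_route_line and the 192.0.2.1 gateway below are placeholders, not values from this thread.

```shell
# Print a default-route line in the sysconfig "routes" column order:
#   destination  gateway  netmask  interface
# A "-" leaves a column unset. Function name and addresses are illustrative.
default_route_line() {
    gateway="$1"
    iface="${2:--}"   # fall back to "-" when no interface is given
    printf 'default %s - %s\n' "$gateway" "$iface"
}

# Usage (not run automatically here; substitute the real gateway):
#   default_route_line 192.0.2.1 bond0 | sudo tee -a /etc/sysconfig/network/routes
#   ip route show    # after reloading the network, check for the default route
```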

But relying on a VLAN tag…
That sounds really weird to me, and likely fragile if you don’t know exactly why it’s working for you.

IMO,
TSU