I have three servers - one server (the head node) has a 2-port InfiniBand card, and the other two slave nodes each have a single-port card - these are directly connected with copper cables (no switch or router). I want to run a single OpenMPI job across all three computers, but the two slave nodes are trying to talk to each other and cannot, because (per a “best NIC practices” document I found) I assigned each NIC on the head node to a separate subnet.
So, would a fix be to use bridge-utils to create a bridge on the head node between the two subnets (NICs)? I’m not sure this is the correct use of a bridge. Does anyone else have any recommendations?
No to your question.
The document you describe properly suggests that you configure a different NetworkID for each connection between your head node and each slave node. If a direct connection is made between your slave nodes for some reason, that link should also have its own unique NetworkID. The reason for this is to ensure directed connections from one machine to another, so there are no mis-directed packets. And it’s probably not enough just to put each connection on a “different subnet”: done incorrectly, one network could be considered a subnet of another, and that would cause problems. So, for instance, the following would not satisfy your objectives…
The apparent problem is that communication has to happen between 192.168.1.1 <–> 192.168.0.2 (two different subnets).
This would mean a compute node talking through the head node to the other slave node, and vice versa.
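To make that concrete (the interface names and exact addresses below are only my guesses for illustration, not something you posted), the kind of layout I mean is one IPoIB subnet per head-node port:

# head node, port 1  <->  slave A  (192.168.0.0/24)
ip addr add 192.168.0.1/24 dev ib0      # on the head node
ip addr add 192.168.0.2/24 dev ib0      # on slave A

# head node, port 2  <->  slave B  (192.168.1.0/24)
ip addr add 192.168.1.1/24 dev ib1      # on the head node
ip addr add 192.168.1.2/24 dev ib0      # on slave B

With that layout, slave A and slave B sit on different subnets, so any slave-to-slave traffic has to be routed through the head node - which is exactly the situation described above.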
*"For InfiniBand adapters with two ports, a second instance of the subnet manager must be
active to enable a subnet on the second port. To begin, enable the subnet manager as above:
/etc/init.d/opensmd start
**Next, discover the GUID of the second port:
ibstat –p
This command will output two numbers, one for each port. Use the second number
to start up a new OpenSM instance in daemon mode:
opensm –g <0xguid number> -B
There may also be an instance where the head node does not have InfiniBand hardware,
but the compute nodes do. In this case, provided a hardware subnet manager is not used,
one of the compute nodes must act as the subnet manager."
The article you reference should provide what you need…
And there are other articles which describe the same.
Basic concepts…
InfiniBand subnets are not the same as Ethernet subnets… The latter is well known and defines logical networks within networks by addressing; an InfiniBand subnet is determined differently.
Because IB subnets are essentially physical fabric segments (not so much logical address ranges), you should understand that you can’t confuse the two.
Note the steps and requirements of your situation…
Although it’s possible to run the OpenSM manager on the compute (slave) nodes, more commonly it runs on the head node, as you have it configured.
Apparently, an OpenSM instance has to run for each port…
So your reference describes how to start up one instance of OpenSM, identify which port it’s managing, and then start up a second instance of OpenSM configured for the second port.
After the above, all three nodes should be able to communicate with each other.
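Putting those steps together, the whole sequence on the head node should look roughly like this (the GUID is only a placeholder for whatever the second number from ibstat -p turns out to be, and the sanity checks assume the infiniband-diags tools are installed):

/etc/init.d/opensmd start         # first OpenSM instance, manages the first port
ibstat -p                         # prints one GUID per port; note the second one
opensm -g <second port GUID> -B   # second OpenSM instance, bound to the second port, run as a daemon

# sanity checks afterwards:
ibstat                            # on every node: each connected port should show State: Active and an SM LID
ibhosts                           # on the head node: lists the hosts the subnet manager has discovered
                                  # (I believe it scans one port/subnet at a time - check its man page)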
IP addresses have little or nothing to do with what you have to configure.
I must admit I know little about InfiniBand (and I’m not in an environment that uses/requires it), but this is an illuminating thread for anyone interested in it, so thanks for starting this thread, PattiMichelle!
Thank you for taking the time to look over the reference. After all these years, I’m still very nervous about playing with TCP/IP. I have read a lot, including the Linux Network Administrator’s Guide, but it’s not sinking in very fast - lots of key points still seem rather abstract. I will try to implement the software from the referenced how-to and see if I can get OpenMPI running over IB on all my nodes. I’ll post updates.
One tidbit… you don’t need a switch or router (maybe that’s why they call it Open “Fabrics”?) - but node boot order matters for NFS (though not for ssh) when a few machines are directly connected. I boot the [compute/peripheral/slave] nodes first, and part-way through their boot I boot the head node. Then, after all are booted, I have to reboot the compute nodes so that the NFS exports get mounted on them (a directory tree containing software/data is NFS-exported from the head node to all compute nodes). I tried restarting NFS on the compute nodes, but the only way I’ve been successful at mounting the exports is with this particular boot order. (Maybe there’s a command that will do this on the compute nodes without a reboot?)
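The next time the mounts fail I plan to try something like this on a compute node instead of rebooting (this assumes the exports are already listed in /etc/fstab there; “headnode” is just a placeholder for my head node’s hostname):

showmount -e headnode     # check whether the head node is actually exporting yet
mount -a -t nfs,nfs4      # (re)try mounting every NFS entry in fstab that isn’t mounted

If that works, it would confirm the only problem is that the compute nodes finish booting before the head node’s NFS server is up.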
Another tidbit - Wicked is fine with IB, but NetworkManager doesn’t appear to recognize IB. I fiddled with the NM settings for the IB port but got nowhere.
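In case it helps anyone else, a minimal static wicked setup for an IPoIB port should just be an ordinary ifcfg file - something like /etc/sysconfig/network/ifcfg-ib0 (the interface name and address here are only examples; adjust for your own layout):

STARTMODE='auto'
BOOTPROTO='static'
IPADDR='192.168.0.2/24'

then bring it up with: wicked ifup ib0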