fastest "clustering" for 2 computers - crossover cable?

I want to double my computing power for some CFD calculations I’m doing which use MPICH2 and/or OpenMPI. I have two Phenom II black x6 computers and can link them by GigE - I was thinking that a crossover cable may be fastest since there’s no router chip to take up message passing time. (I don’t have a lot of money to spend or I might use infiniband.) I think you have to do something with ARP (???) to do a crossover, and since these are not servers, I would have to buy a separate PCIe ethernet card so I could leave the head connected to the internet.

What is the recommendation? Is this more headache than it’s worth? Would a cheap GigE router (or switch???) be simpler and just as fast? Now I see people doing direct PCIe interconnect for clusters, but don’t know if that’s cheaply viable. It would be nice to be able to avoid the TCP/IP stack altogether and have them appear to be one big NUMA machine…

THANK YOU!!!
Patricia :)

In my opinion, a cheap router will be simpler, and probably just as fast.

> I want to double my computing power for some CFD calculations I’m
> doing which use MPICH2 and/or OpenMPI.

Before going into this too much, is your current constraint the network
somehow? Certainly there are calculations which take a lot of bandwidth
between nodes doing calculations, but many distributed computing
projects instead have relatively small bandwidth constraints on
performance. I assume you’re fairly sure that saving (literally)
microseconds is going to improve performance, at least enough to offset
whatever costs are involved for the second NIC.

> I have two Phenom II black x6 computers and can link them by GigE - I
> was thinking that a crossover cable may be fastest since there’s no
> router chip to take up message passing time. (I don’t have a lot of
> money to spend or I might use infiniband.) I think you have to do
> something with ARP (???)

Pretty sure this is not the case. ARP works from one node to another;
hubs and switches relay traffic, but in the end an ARP request from one
machine goes out to the network (broadcast) and then the machine with
the right address replies, giving the requesting machine the MAC address
to use for further communication. On a two-node network, crossover cable
or not, ARP is still necessary and works the same way, with no help
needed from anything outside the two systems involved.

> to do a crossover, and since these are not servers, I would have to
> buy a separate PCIe ethernet card so I could leave the head connected
> to the internet.
>
> What is the recommendation? Is this more headache than it’s worth?
> Would a cheap GigE router (or switch???) be simpler and just as
> fast?

Again, see my response above. Is this even a constraint on your
calculations? If you were, for example, doing distributed computing in
a way like SETI, World Community Grid, or something like that with your
project (bundles of data sent for processing, hours spent processing,
then a response sent back) the benefits of crossover gigabit vs.
gig-switch gigabit are going to be a waste of time/money. If, on the
other hand, your systems are exchanging data at close to the link’s
limit (roughly 100 MB/s on gigabit) or sending many small,
latency-sensitive messages, then a saved microsecond per packet may be
valuable. Of course, so would doing other TCP-level tuning, and probably
to a much greater extent.

Anyway, just be sure you’re going after the current constraint;
otherwise the savings won’t be worth it no matter how many nanoseconds
you shave off.
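
One way to put rough numbers on this is a back-of-the-envelope cost model. The sketch below is only an illustration: the latencies, message size, message count, and compute time are made-up assumptions, not measurements from any real CFD run, and it assumes communication does not overlap with computation.

```python
# Rough cost model for one boundary exchange over a network link.
# All numbers below are illustrative assumptions, not measurements.

def exchange_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Time to send one message: fixed latency plus serialization time."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def network_fraction(compute_s, message_bytes, messages_per_step,
                     latency_s, bandwidth_bytes_per_s):
    """Fraction of each time step spent on the network,
    assuming communication and computation do not overlap."""
    comm = messages_per_step * exchange_time(
        message_bytes, latency_s, bandwidth_bytes_per_s)
    return comm / (comm + compute_s)


if __name__ == "__main__":
    gige = 125e6          # 1 Gbit/s expressed in bytes per second
    via_switch = 60e-6    # assumed one-way latency through a switch
    direct = 50e-6        # assumed one-way latency over a direct cable
    for name, lat in (("switch", via_switch), ("direct", direct)):
        frac = network_fraction(compute_s=0.05, message_bytes=200_000,
                                messages_per_step=4, latency_s=lat,
                                bandwidth_bytes_per_s=gige)
        print(f"{name}: {frac:.1%} of each step spent on the network")
```

With large messages the serialization term dominates and the two cases come out nearly identical; the latency difference only starts to matter when the messages are small and frequent.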

Good luck.

Good points - I didn’t think about that very carefully, I guess. Most of the bandwidth is between CPU and memory; the only data that has to pass from CPU to CPU is the boundary conditions between adjacent computational domains, so most of the data traffic won’t cross the GigE connection. Aside from watching a bandwidth monitor during a calculation, I’m not sure how to tell whether GigE is slowing things down. It might also vary from one computational domain to another, since I don’t know how “smart” MPICH2 is about arranging the calculations on each node to minimize GigE traffic. So the only things I know to do are to ask for expert opinions, and maybe borrow someone’s gaming router and watch the gkrellm eth0 monitor while some calculations are running. Guess that’s the next step! :o
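
For a number instead of a graph, the same byte counters a monitor like gkrellm displays can be read on Linux from `/proc/net/dev`. This is a minimal sketch, assuming the interface is named `eth0` (it may be called something else on a given machine); the function names are just for illustration.

```python
import time


def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev contents."""
    counters = {}
    for line in text.splitlines()[2:]:      # the first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def sample_rate(interface="eth0", interval_s=1.0):
    """Average receive/transmit rate in bytes per second over one interval."""
    with open("/proc/net/dev") as f:
        before = parse_net_dev(f.read())[interface]
    time.sleep(interval_s)
    with open("/proc/net/dev") as f:
        after = parse_net_dev(f.read())[interface]
    return ((after[0] - before[0]) / interval_s,
            (after[1] - before[1]) / interval_s)


if __name__ == "__main__":
    try:
        rx, tx = sample_rate("eth0")   # assumed interface name
        print(f"rx {rx / 1e6:.1f} MB/s, tx {tx / 1e6:.1f} MB/s "
              "(GigE line rate is 125 MB/s)")
    except (KeyError, OSError):
        print("could not read counters for eth0 on this machine")
```

If the rates stay far below the line rate while a calculation runs, bandwidth is probably not the constraint, although this says nothing about per-message latency.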

Extra PCIe ethernet cards are cheap, and I think the various hardware vendors are well-supported by openSUSE. (StarTech, Broadcom, or Intel?)

Thank you for the comments!!!
Patricia

OK, I found a gigabit switch, which I can connect to my router - my understanding is that the switch is “smart enough” to shunt MPI traffic directly between the computers, after the router’s DHCP server assigns IP addresses to each machine. It has been decades since I discussed switches with anyone - is this still the way they work?

Yes, a good switch will learn the MAC addresses on each connected port, and send data directly - with the exception of broadcast packets, which have to be sent everywhere.
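
The learning behaviour can be sketched as a toy model. This is only an illustration of the idea, not how any particular switch is implemented; the class name, port numbers, and MAC addresses are invented for the example.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy model of a MAC-learning Ethernet switch."""

    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.mac_table = {}   # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Handle one frame; return the list of ports it is sent out on."""
        self.mac_table[src_mac] = in_port        # learn where the sender is
        if dst_mac != BROADCAST and dst_mac in self.mac_table:
            out = self.mac_table[dst_mac]
            # A frame whose destination is on the arrival port is dropped.
            return [] if out == in_port else [out]
        # Broadcast or unknown destination: flood all ports but the source.
        return [p for p in range(self.num_ports) if p != in_port]


if __name__ == "__main__":
    sw = LearningSwitch(4)
    # Node A (port 0) broadcasts, e.g. an ARP request: flooded everywhere.
    print(sw.receive(0, "aa:aa:aa:aa:aa:aa", BROADCAST))
    # Node B (port 1) replies: the switch already knows where A is.
    print(sw.receive(1, "bb:bb:bb:bb:bb:bb", "aa:aa:aa:aa:aa:aa"))
    # From now on, traffic from A to B goes out on port 1 only.
    print(sw.receive(0, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb"))
```

After the first exchange, traffic between the two nodes never reaches the router's port, which is why the uplink does not see the node-to-node MPI traffic.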

Thanks - I wasn’t sure if things have changed. I guess I should install a second ethernet port so that internet-bound traffic won’t compete with the MPI traffic on the switch. Now, off to figure out how to get the two boxes to recognize each other via MPICH2. Something about a hosts file, and maybe some firewalling, and maybe an MPI daemon…
Patricia :)
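
Before starting on the MPI configuration itself, a quick check that each box can resolve the other's name and reach it through the firewall may save some time. This is a minimal sketch: the hostnames `node1` and `node2` are hypothetical placeholders for whatever goes in the hosts file, and port 22 assumes the MPI launcher starts remote processes over SSH, which is a common default but not the only option.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address the name resolves to, or None if it fails."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None


def can_connect(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical names; use whatever the hosts file actually contains.
    for peer in ("node1", "node2"):
        addr = resolve(peer)
        if addr is None:
            print(f"{peer}: name does not resolve (check /etc/hosts)")
        elif can_connect(addr, 22):
            print(f"{peer}: {addr}, SSH port reachable")
        else:
            print(f"{peer}: {addr}, SSH port blocked or closed (check firewall)")
```

A reachable SSH port only covers launching; the MPI processes open additional ports between the nodes, so the firewall may still need to allow traffic between the two machines.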