openSUSE 13.1 - mpirun (openmpi) does not run.

Hello,

I have installed 13.1 and tried to run parallel processes with mpirun, and I get the following:

mpirun -n 2 ./john --test

It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here’s some additional information (which may only be relevant to an
Open MPI developer):

opal_shmem_base_select failed
→ Returned value -1 instead of OPAL_SUCCESS


It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here’s some additional information (which may only be relevant to an
Open MPI developer):

opal_init failed
→ Returned value Error (-1) instead of ORTE_SUCCESS
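Before posting I also sanity-checked shared memory, since the failing component is opal's shmem layer (this is just my guess at a possible cause, not a confirmed diagnosis):

```shell
# Check that POSIX shared memory (/dev/shm) is mounted and writable --
# a guess at what the failing opal_shmem component needs, not a confirmed cause.
df -h /dev/shm
touch /dev/shm/opal_test && rm /dev/shm/opal_test && echo "shm OK"
```

If this prints "shm OK", shared memory itself does not seem to be the issue.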

Does anyone have the same problem?

uname -a
Linux wakarako 3.11.6-4-desktop #1 SMP PREEMPT Wed Oct 30 18:04:56 UTC 2013 (e6d4a27) i686 i686 i386 GNU/Linux

zypper se -s -i mpi
Loading repository data…
Reading installed packages…

S | Name | Type | Version | Arch | Repository
--+--------------------+---------+--------------+--------+-------------------
i | libboost_mpi1_53_0 | package | 1.53.0-4.1.2 | i586 | openSUSE-13.1-Oss
i | libboost_mpi1_53_0 | package | 1.53.0-4.1.2 | i586 | openSUSE-13.1-1.10
i | mpi-selector | package | 1.0.3-9.1.2 | noarch | openSUSE-13.1-Oss
i | mpi-selector | package | 1.0.3-9.1.2 | noarch | openSUSE-13.1-1.10
i | openmpi | package | 1.7.2-2.1.3 | i586 | openSUSE-13.1-Oss
i | openmpi | package | 1.7.2-2.1.3 | i586 | openSUSE-13.1-1.10

cat /proc/meminfo
MemTotal: 5031248 kB
MemFree: 330036 kB
Buffers: 93580 kB
Cached: 3250836 kB
SwapCached: 7252 kB
Active: 3139252 kB
Inactive: 1311504 kB
Active(anon): 2539224 kB
Inactive(anon): 713196 kB
Active(file): 600028 kB
Inactive(file): 598308 kB
Unevictable: 0 kB
Mlocked: 0 kB
HighTotal: 4206596 kB
HighFree: 209520 kB
LowTotal: 824652 kB
LowFree: 120516 kB
SwapTotal: 2101240 kB
SwapFree: 2071512 kB
Dirty: 3084 kB
Writeback: 0 kB
AnonPages: 1099252 kB
Mapped: 242608 kB
Shmem: 2146080 kB
Slab: 93676 kB
SReclaimable: 69788 kB
SUnreclaim: 23888 kB
KernelStack: 3240 kB
PageTables: 17364 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 4616864 kB
Committed_AS: 7386268 kB
VmallocTotal: 122880 kB
VmallocUsed: 58332 kB
VmallocChunk: 56820 kB
HardwareCorrupted: 0 kB
AnonHugePages: 202752 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 122872 kB
DirectMap2M: 780288 kB

cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Pentium(R) Dual CPU E2160 @ 1.80GHz
stepping : 13
microcode : 0xa1
cpu MHz : 1200.000
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fdiv_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm lahf_lm dtherm
bogomips : 3590.82
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Pentium(R) Dual CPU E2160 @ 1.80GHz
stepping : 13
microcode : 0xa1
cpu MHz : 1200.000
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fdiv_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm lahf_lm dtherm
bogomips : 3590.82
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual

I am considering filing a bug, but I am not sure whether this is an openSUSE bug or an Open MPI 1.7.2 one.
Could somebody with a 13.1 installation check whether mpirun (Open MPI 1.7.2) works? I think a simple command like

mpirun -np 2 ls

is enough to verify.

thanks.

P.S. As an alternative, I am thinking of compiling Open MPI (version 1.6.5) from source myself …

Well, I found a solution to the above problem, and I am writing it here for anybody who is affected by the same bug.

I uninstalled mpi-selector:

zypper rm mpi-selector
Loading repository data…
Reading installed packages…
Resolving package dependencies…

The following packages are going to be REMOVED:
boost-devel libboost_graph1_53_0 libboost_mpi1_53_0 mpi-selector openmpi

5 packages to remove.
After the operation, 97.2 MiB will be freed.
Continue? [y/n/? shows all options] (y):
(1/5) Removing boost-devel-1.53.0-4.1.2 …[done]
(2/5) Removing libboost_graph1_53_0-1.53.0-4.1.2 …[done]
(3/5) Removing libboost_mpi1_53_0-1.53.0-4.1.2 …[done]
(4/5) Removing openmpi-1.7.2-2.1.3 …[done]
Additional rpm output:
default:openmpi-1.7.2
level:system

(5/5) Removing mpi-selector-1.0.3-9.1.2 …[done]

Then I downloaded openmpi-1.6-3.1.2.i586.rpm and openmpi-devel-1.6-3.1.2.i586.rpm from the 12.3 repos and installed them:

zypper in ./openmpi-1.6-3.1.2.i586.rpm openmpi-devel-1.6-3.1.2.i586.rpm
Loading repository data…
Reading installed packages…
Resolving package dependencies…

The following NEW packages are going to be installed:
mpi-selector openmpi openmpi-devel

3 new packages to install.
Overall download size: 5.5 MiB. After the operation, additional 17.4 MiB will be used.
Continue? [y/n/? shows all options] (y):
Retrieving package mpi-selector-1.0.3-9.1.2.noarch (1/3), 24.4 KiB ( 48.8 KiB unpacked)
Retrieving: mpi-selector-1.0.3-9.1.2.noarch.rpm …[done]
Retrieving package openmpi-devel-1.6-3.1.2.i586 (2/3), 3.7 MiB ( 8.7 MiB unpacked)
Retrieving package openmpi-1.6-3.1.2.i586 (3/3), 1.8 MiB ( 8.7 MiB unpacked)
(1/3) Installing: mpi-selector-1.0.3-9.1.2 …[done]
Retrieving package openmpi-devel-1.6-3.1.2.i586 (1/3), 3.7 MiB ( 8.7 MiB unpacked)
(2/3) Installing: openmpi-devel-1.6-3.1.2 …[done]
Retrieving package openmpi-1.6-3.1.2.i586 (2/3), 1.8 MiB ( 8.7 MiB unpacked)
(3/3) Installing: openmpi-1.6-3.1.2 …[done]

After that I got the following error:

mpirun -n 2 ls

Open RTE was unable to open the hostfile:
/usr/lib/mpi/gcc/openmpi/etc/openmpi-default-hostfile
Check to make sure the path and filename are correct.

[wakarako:10091] [[11317,0],0] ORTE_ERROR_LOG: Not found in file base/ras_base_allocate.c at line 200
[wakarako:10091] [[11317,0],0] ORTE_ERROR_LOG: Not found in file base/plm_base_launch_support.c at line 99
[wakarako:10091] [[11317,0],0] ORTE_ERROR_LOG: Not found in file plm_rsh_module.c at line 1167
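To see where mpirun actually expects the hostfile, you can grep the MCA parameters that ompi_info reports (ompi_info ships with Open MPI; the relevant parameter should be orte_default_hostfile, though names can differ between versions):

```shell
# List every MCA parameter and pick out the hostfile-related ones.
# (ompi_info comes with Open MPI; parameter names may vary by version.)
ompi_info --all | grep -i hostfile
```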

The problem is that Open MPI looks for the file openmpi-default-hostfile not in /etc, where it is located on my system, but in /usr/lib/mpi/gcc/openmpi/etc.
So I went to /usr/lib/mpi/gcc/openmpi, made an etc directory, changed into that etc, and made a symbolic link to the file openmpi-default-hostfile in /etc:

cd /usr/lib/mpi/gcc/openmpi
mkdir etc
cd etc
ln -s /etc/openmpi-default-hostfile
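If you want to rehearse the symlink trick before touching the real prefix, here is a self-contained sketch that reproduces the same layout in a scratch directory (everything under $tmp is a throwaway copy; on the real system the prefix is /usr/lib/mpi/gcc/openmpi and the link target is the real file in /etc):

```shell
# Rebuild the directory layout in a temporary tree and verify that the
# symlink makes the hostfile reachable through the prefix path.
tmp=$(mktemp -d)
mkdir -p "$tmp/etc" "$tmp/usr/lib/mpi/gcc/openmpi"
echo "localhost slots=2" > "$tmp/etc/openmpi-default-hostfile"

cd "$tmp/usr/lib/mpi/gcc/openmpi"
mkdir etc
cd etc
ln -s "$tmp/etc/openmpi-default-hostfile"

# Reading through the prefix path now shows the real file's contents:
cat "$tmp/usr/lib/mpi/gcc/openmpi/etc/openmpi-default-hostfile"   # prints: localhost slots=2
rm -rf "$tmp"
```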

and YEeeeesssss …

mpirun -n 2 ./john --test
Benchmarking: Traditional DES [128/128 BS SSE2]… (2xMPI) DONE
Many salts: 4294M c/s real, 4294M c/s virtual
Only one salt: 4294M c/s real, 4294M c/s virtual

Benchmarking: BSDI DES (x725) [128/128 BS SSE2]… (2xMPI) DONE
Many salts: 4294M c/s real, 4294M c/s virtual
Only one salt: 4294M c/s real, 4294M c/s virtual

.
.
.