Well, this isn’t really Big Iron - it’s a Magny Cours 4x12CPU Opteron box by Supermicro. I had every other memory slot populated with 8 GB Kingston DDR3 ECC registered modules for a total of 128GB (16 x 8GB). Then, just last week, I added 16 more modules to populate all slots (256 GB) and now the KDE desktop has slowed down - almost as if I were using an un-accelerated graphics card. I didn’t change any settings on the OS. I turned off ordinary desktop effects, but it still behaves like a single-core Pentium. I can’t think of any software to test the memory (I think memtest86 doesn’t do more than 32 GB). (I didn’t make any BIOS or openSUSE system changes - I just powered down, added memory, and rebooted.)
My other thought was that some Wizard may say, “We haven’t optimized openSUSE to run efficiently with that much memory on a NUMA system.” I don’t want to install another distro if I don’t have to. Is there a tool built into openSUSE 12.1 x64 for figuring out what’s going on? Also, are there any other tests folks would suggest to debug this?
In addition, I’m having trouble allocating processes among openmpi threads. I know there is enough physical memory, but I get:
patti@OS121-TY3:~/ModelE/modelE_AR5_v2_branch_04-30-2013/decks> ../exec/runE E300PMS -cold-restart -np 12
submitting ./E300PMS -cold-restart -np 12
Operating system error: Cannot allocate memory
Memory allocation failed
Operating system error: Cannot allocate memory
Memory allocation failed
Operating system error: Cannot allocate memory
Memory allocation failed
Operating system error: Cannot allocate memory
Memory allocation failed
Operating system error: Cannot allocate memory
Memory allocation failed
Operating system error: Cannot allocate memory
Memory allocation failed
Operating system error: Cannot allocate memory
Memory allocation failed
--------------------------------------------------------------------------
mpirun has exited due to process rank 4 with PID 25993 on
node OS121-TY3 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
Problem encountered while running
>>> Unknown reason <<<
patti@OS121-TY3:~/ModelE/modelE_AR5_v2_branch_04-30-2013/decks>
Other than the general slowness and the inability to get a large openmpi problem running, the system seems reliable.
On 05/03/2013 01:56 PM, PattiMichelle wrote:
> sysinfo:/
> Total memory (RAM): 252.4 GiB
> Free memory: 228.2 GiB (+ 12.2 GiB Caches)
> Free swap: 0.0 KiB
From what I know, any tuning of openSUSE, and Linux in general, is much more
important at low memory sizes.
Is it possible for you to add fewer than all 16 modules above 128 GB? Perhaps
there is some threshold.
My instincts are that this is a kernel, not a distro, problem. I would try a
newer kernel than the 3.1 version in openSUSE 12.1. If that still yields the
same symptoms, then I would post the problem on linux-kernel@vger.kernel.org.
Finally, are you using a standard kernel, or one you built yourself? If
standard, you are using the SLAB memory allocator. Perhaps SLUB is better for
large memory systems.
I am not sure what the memory limit on the kernel is, but you can artificially lower the amount of memory used to see if it makes it more reliable. Just add this to grub options to set memory limit to 240g (or change to whatever value you want).
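The snippet that accompanied this suggestion is not preserved in the thread. The usual mechanism is the kernel's `mem=` boot parameter (for example `mem=240G` appended to the kernel line in the GRUB configuration). As a rough sketch, the following Python helper checks whether such a limit is active on the running system by parsing `/proc/cmdline`; the function names are made up for illustration.

```python
import re

# Multipliers for the K/M/G suffixes accepted by the kernel's mem= parameter.
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_mem_limit(cmdline):
    """Return the mem= limit in bytes found in a kernel command line, or None."""
    for token in cmdline.split():
        match = re.fullmatch(r"mem=(\d+)([KMG]?)", token, flags=re.IGNORECASE)
        if match:
            return int(match.group(1)) * _UNITS[match.group(2).upper()]
    return None


def current_mem_limit(path="/proc/cmdline"):
    """Read the running kernel's command line and report any mem= limit."""
    with open(path) as handle:
        return parse_mem_limit(handle.read())


# Example call on the affected machine (prints None if no limit is set):
# print(current_mem_limit())
```

If the desktop speeds up again with a limit somewhat below 256 GB, that would point toward a size threshold rather than a faulty module, though it would not by itself say where the threshold comes from.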
Larry
Hi Larry: Thank you for the reply!
I’m just using vanilla openSUSE 12.1. I haven’t seen this problem before and
have been running these models for some time. Is it possible that something
is broken with my installation - or maybe that I have no swapfile? (I didn’t think
it was necessary with 256GB registered ECC server memory.)
On 05/04/2013 01:36 AM, PattiMichelle wrote:
> Is it possible that
> something is broke with my installation - or maybe that I have no swapfile?
Why not just add a small (say 4 GB) swap file and see what happens…
You know the system will ‘manage’ memory and swap space… so maybe it is slowed down looking around for a swap file to manage…
(Probably not likely - but you asked the question, and you have the
machine to TEST it on, so give it a try…)
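Before and after adding a swap file, it may help to record what the kernel itself reports. A minimal sketch, assuming a Linux `/proc/meminfo` with the standard `SwapTotal`, `CommitLimit` and `Committed_AS` fields (the helper names are illustrative only):

```python
def parse_meminfo(text):
    """Map /proc/meminfo field names to their integer values.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts, so only the kB fields are used below.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_and_commit(info):
    """Summarise swap and commit accounting in GiB."""
    kib_per_gib = 2 ** 20
    return {
        "has_swap": info.get("SwapTotal", 0) > 0,
        "swap_total_gib": info.get("SwapTotal", 0) / kib_per_gib,
        "commit_limit_gib": info.get("CommitLimit", 0) / kib_per_gib,
        "committed_as_gib": info.get("Committed_AS", 0) / kib_per_gib,
    }


# Example call on the affected machine:
# with open("/proc/meminfo") as handle:
#     print(swap_and_commit(parse_meminfo(handle.read())))
```

Note that `CommitLimit` is only enforced when `vm.overcommit_memory` is set to 2, so a low value there matters only in strict-overcommit mode.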
--
dd
openSUSE®, the “German Engineered Automobile” of operating systems!
I should stop posting before the first cup of coffee…
By googling I learned that there are systems where the BIOS or the motherboard’s firmware hides the fact that there are multiple CPUs from the operating system. In this case NUMA tools are of no use and can’t help with or explain your problems.
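One way to see whether the firmware is exposing the NUMA topology is to count the node directories the kernel publishes under `/sys/devices/system/node`. The sketch below assumes that standard sysfs layout; the function name is hypothetical.

```python
import os
import re


def visible_numa_nodes(sys_node_dir="/sys/devices/system/node"):
    """Return the sorted NUMA node numbers the kernel exposes in sysfs.

    An empty list means the directory is missing, which usually indicates
    a kernel built without NUMA support.
    """
    try:
        entries = os.listdir(sys_node_dir)
    except OSError:
        return []
    nodes = []
    for entry in entries:
        match = re.fullmatch(r"node(\d+)", entry)
        if match:
            nodes.append(int(match.group(1)))
    return sorted(nodes)


# Example call on the affected machine:
# print(visible_numa_nodes())
```

A single node on a four-socket box would suggest that the BIOS is interleaving memory across nodes or otherwise hiding the topology; several nodes would mean tools such as `numactl` have something to work with.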
Yes, I forgot to mention that I also took out that memory. It’s very odd - the machine was faster again. I think openSUSE is suboptimally tuned for unusually large amounts of memory. It’s weird that the graphics display would slow down at 256GB but be more or less normal at 128GB. I didn’t experience any memory errors (e.g. bad memory sticks), however. I’m willing to test, but memtest86 won’t handle 256GB, or even 128GB.
I think there are some BIOS settings that affect this: node interleave, bank interleave, and memory “swizzle” (which I don’t really understand). I guess maybe I could play around with those. That’s 2^3 reboot+tests? There may be one other setting I’m forgetting, so that would be 2^4 reboots.
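The reboot count above can be written out as an explicit checklist. This small sketch enumerates every on/off combination of the named settings, assuming each one is a simple two-state switch (the actual BIOS may offer more options per setting):

```python
from itertools import product

# Setting names taken from the post; "on"/"off" states are an assumption.
SETTINGS = ["node interleave", "bank interleave", "memory swizzle"]


def reboot_matrix(names):
    """Return one dict per combination of on/off values for the given settings."""
    return [dict(zip(names, combo)) for combo in product(("off", "on"), repeat=len(names))]


# Print a numbered checklist of configurations to test, one reboot each.
for number, config in enumerate(reboot_matrix(SETTINGS), start=1):
    print(number, config)
```

Three settings give 8 combinations and a fourth would give 16, matching the 2^3 and 2^4 estimates. Changing one setting at a time from the current baseline first would need only one reboot per setting and may be enough to find the culprit.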
I’m not sure how to set up the BIOS. The OS seemed pretty smart in handling memory during previous tests (before adding the extra 128GB to take it up to 256GB). And big openmpi calculations usually would tie local memory to each processor, so SMP would not be as useful as NUMA. Still, why the painful slowdown in the KDE desktop? It’s just a cheap accelerated GPU display, but it is accelerated. Why would the memory size so negatively impact KDE?
I remember that in the past some Intel CPUs (or maybe chipsets) would cache memory only up to some size. Adding memory beyond this size would result in a slowdown due to uncached access.
Could you check /proc/mtrr to see whether the “old” and “new” memory ranges differ in any way?
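To make the comparison easier, the contents of `/proc/mtrr` can be turned into address ranges and saved once with 128 GB and once with 256 GB installed. A minimal sketch, assuming the usual line format `regNN: base=0x... (  NMB), size= NMB, count=N: type`; the parser name is made up:

```python
import re

# Matches one register line of /proc/mtrr; size is given in KB or MB.
MTRR_LINE = re.compile(
    r"reg(\d+): base=0x([0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(\d+)([KM])B, count=\d+: (\S+)"
)


def parse_mtrr(text):
    """Return a list of (start, end, cache_type) tuples, addresses in bytes."""
    ranges = []
    for line in text.splitlines():
        match = MTRR_LINE.search(line)
        if not match:
            continue
        base = int(match.group(2), 16)
        unit = 1 << 10 if match.group(4) == "K" else 1 << 20
        size = int(match.group(3)) * unit
        ranges.append((base, base + size, match.group(5)))
    return ranges


# Example call on the affected machine:
# with open("/proc/mtrr") as handle:
#     for start, end, kind in parse_mtrr(handle.read()):
#         print(f"{start:#014x}-{end:#014x} {kind}")
```

Any range marked `uncachable` that overlaps installed RAM, or a difference between the two captures, would be worth posting back to the thread.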
AFAIK these BIOS settings are not about NUMA. Usually there is a switch between “SMP” and “NUMA” or the like, if the manufacturer supports this. openmpi seems to be NUMA-aware, so activating NUMA could help openmpi.
KDE: Do you have any other desktop environments or window managers installed? If yes, do they suffer from the same slowness?
We run 48 core HP servers with 256GB memory and I can assure you this is not an issue with Linux being poorly tuned for large memory configurations, etc.
While there may be something specific to the SUSE Desktop or Default kernels, it would be easy enough to find out by building a kernel.org kernel and testing with that.
However, I would first check the following:
Are there any messages in dmesg that look interesting?
Does mcelog show anything?
Does the Supermicro’s System Event Log show anything?
To access the SEL, you might use IPMIView from Supermicro: ftp://ftp.supermicro.com/utility/IPMIView/IPMIView20.pdf
(Those articles reference Dell hardware, but it should make little difference as IPMI is mostly standard between vendors, and ipmiutil should work fine for you on the Supermicro.)
I would definitely check this and see if the system is logging any memory issues. If so, it might report the specific DIMM, which could then be isolated for testing with memtest86.
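Scanning a long dmesg or mcelog dump by eye is tedious, so a small filter can help. The sketch below keeps only lines mentioning a few memory-error keywords; the keyword list is a starting assumption, not an exhaustive set of what the kernel may log.

```python
import re

# Keywords that commonly appear in ECC / machine-check related kernel messages.
KEYWORDS = re.compile(
    r"\b(EDAC|MCE|machine check|ECC|hardware error)\b", re.IGNORECASE
)


def suspicious_lines(log_text):
    """Return the log lines that mention any memory-error keyword."""
    return [line for line in log_text.splitlines() if KEYWORDS.search(line)]


# Example use with saved output, e.g. after running `dmesg > dmesg.txt`:
# with open("dmesg.txt") as handle:
#     for line in suspicious_lines(handle.read()):
#         print(line)
```

An empty result does not prove the memory is healthy, since corrected errors may only show up in the BMC's event log, which is why the SEL check is still worth doing.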
BIOS memory configuration settings
Some systems will automatically configure memory in different modes depending on whether all slots are populated. HP servers have “redundant”, “Advanced Memory Protection” and “global” memory modes, which can be enabled depending on how the memory is installed. It’s possible Supermicro has similar advanced memory modes, and the system documentation should cover this.
Are you getting the openmpi allocation failures when only running with 128GB?
Do you have support from the vendor / supplier? Depending on who this is, they might be an excellent resource.
I took out the added 128GB of ECC registered server memory and ran my global climate model for a couple of days. Yes, KDE was back to its old fast response, and there were no errors. Now I have put the 128GB back in (256GB total, all ECC registered DDR3) and restarted the GCM. KDE is again ~2x or 3x slower than normal (both in responsiveness and FPS), but the GCM seems to be running fine (it uses ~22GB and hops around the processor/memory space, so it should exercise everything). This should rule out memory hardware problems. I don’t have any way to monitor HyperTransport bottlenecks, do I? I guess I could try to move this over to Win2k3ServerEnterprise x64 and use some AMD tool or other to look for HT bottlenecks.
The failure to run OpenMPI threads I first reported is probably a code bug.
Thank you for the valuable information. More good ideas are welcome.
Patricia
Thanks for the reply - I’m still trying to find why the openmpi is crashing - I don’t
really understand this output - any suggestions where the problem might be?
(I had the same openmpi crashes in 128GB, so it might be a FORTRAN problem,
but someone familiar with mpi might recognize a troublesome setting for either
the KDE or MPI problem):
patti@OS121-TY3:~/ModelE/modelE_AR5_v2_branch_04-30-2013/decks> ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 2067141
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 2067141
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
patti@OS121-TY3:~/ModelE/modelE_AR5_v2_branch_04-30-2013/decks>