Users using up memory => hang

I kind of think this is the right place to ask this. I don't think 'swappiness' is the setting I am looking for - I'd like it if the kernel (openSUSE 12.3 x86_64) could monitor command-line memory/job requests, so that if it sees that a new request would overrun available memory, it refuses the request. This usually happens when someone runs a big OpenMPI job, with maybe 24 MPI processes. So it would have to kill the offending OpenMPI process (which I think automatically kills all the related ranks) and warn that the job exceeds available RAM. I have 128 GB of RAM on the system. I don't think I want to get rid of the swapfile, and I'm not sure what Linux would do even if I did. Alt-SysRq-REISUB doesn't seem to work in these cases because of disk thrash, which, I assume, is a desperate attempt to use swap. The OOM killer seems indiscriminate, even if I did know how to use it.

Thanks in advance, Patricia :-)

Are you looking for something like cgroups?

Wow - that looks over my head. I was hoping there was a kernel switch I could throw. The more I read about the OOM killer, the more it sounds like what I need - but it appears to be buggy and difficult to set up.
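
From what I have read so far, the one knob that does exist is biasing which process the kernel picks first, along the lines of (12345 standing in for the mpirun PID, which I would have to look up):

echo 1000 > /proc/12345/oom_score_adj

A score of 1000 makes that process the preferred victim, so at least the runaway job would get killed rather than something random.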

Hi, I don't believe that the command line will be able to do what you'd like it to do. What you'd need would be some kind of task (or process) supervisor that kills a task when it requests more than X GB of memory. Googling for "linux task supervisor" seems to yield quite a few hits; whether they are what you need, I don't know. HTH, Lenwolf
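
PS: a crude version of such a supervisor could be a shell loop like this (only a sketch - the ~100 GiB threshold is made up, and killing purely by resident size is a blunt instrument):

#!/bin/sh
# kill any process whose resident set exceeds LIMIT_KB (ps reports rss in KiB)
LIMIT_KB=104857600   # ~100 GiB
while true; do
    ps -eo pid,rss --no-headers | while read pid rss; do
        [ "$rss" -gt "$LIMIT_KB" ] && kill -TERM "$pid"
    done
    sleep 10
done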

Maybe I completely misunderstand what this thread is about, but would limiting the users' resources with ulimit not be enough?
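
For example (a sketch - the numbers are only an illustration: about 5 GiB per process, so that 24 MPI ranks together stay under the 128 GB total; ulimit -v counts in KiB):

ulimit -v 5242880

in the shell that launches the job caps the virtual address space of every process started from it, and a line such as

* hard as 5242880

in /etc/security/limits.conf would make that cap permanent for all users. The limit is per process, so it has to be sized against the number of ranks a job may spawn.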


PC: oS 13.1 x86_64 | i7-2600@3.40GHz | 16GB | KDE 4.11 | GTX 650 Ti
ThinkPad E320: oS 13.1 x86_64 | i3@2.30GHz | 8GB | KDE 4.11 | HD 3000
HTPC: oS 13.1 x86_64 | Celeron@1.8GHz | 2GB | Gnome 3.10 | HD 2500

Thanks. I will check that out. I guess I could limit the user to 90% of system memory? I'm just trying to prevent a de facto hang when I (or any other user) spawn an OpenMPI process that takes too much memory.

On 06.01.2014 16:56, PattiMichelle wrote:
> Thanks. I will check that out. I guess I could limit the user to
> 90% of system memory? I'm just trying to prevent a de facto hang
> when I (or any other user) spawn an OpenMPI process that takes too
> much memory.
>
It will of course not prevent 10 different users from each allocating 90% at the same time, summing up to 900%. So it will only help if that is unlikely to happen; the benefit is that it is easy.

Otherwise I am afraid you have to implement one of the more complex solutions already mentioned here, like cgroups (I never did it myself).
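
The basic idea would be something like this, run as root (a sketch using the memory cgroup controller, which systemd mounts under /sys/fs/cgroup/memory; the group name "mpijobs", the 100G cap and "./your_solver" are placeholders):

mkdir /sys/fs/cgroup/memory/mpijobs
echo 100G > /sys/fs/cgroup/memory/mpijobs/memory.limit_in_bytes
echo $$ > /sys/fs/cgroup/memory/mpijobs/tasks
mpirun -np 24 ./your_solver

Every process started from that shell shares the one cap, so when the job hits 100G the OOM killer fires inside the group and takes out the MPI job instead of hanging the whole machine.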

Yet another possibility to mitigate the problem is to add a solid-state drive as swap: responsiveness is significantly better than with swap on a hard disk if the system starts to use it heavily. I use that myself on a system where the appropriate amount of RAM would be too expensive, and the solid-state drive did not degrade over 3 years of use as swap.
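
Setting that up is just the usual swap procedure (assuming the SSD partition shows up as /dev/sdb1 - a hypothetical name):

mkswap /dev/sdb1
swapon -p 10 /dev/sdb1

plus a matching line in /etc/fstab so it survives a reboot:

/dev/sdb1 none swap sw,pri=10 0 0

The pri=10 makes the kernel fill the SSD swap before any existing hard-disk swap.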


PC: oS 13.1 x86_64 | i7-2600@3.40GHz | 16GB | KDE 4.11 | GTX 650 Ti
ThinkPad E320: oS 13.1 x86_64 | i3@2.30GHz | 8GB | KDE 4.11 | HD 3000
HTPC: oS 13.1 x86_64 | Celeron@1.8GHz | 2GB | Gnome 3.10 | HD 2500

I think that's why I was looking for a kernel setting - rather than trying to limit users…
I found this about kernel memory settings - does this sound right? (quoted below)

The Linux kernel: Memory


"Turning off overcommit Going in the wrong direction Since 2.1.27 there are a sysctl VM_OVERCOMMIT_MEMORY and proc file /proc/sys/vm/overcommit_memory with values 1: do overcommit, and 0 (default): don’t. Unfortunately, this does not allow you to tell the kernel to be more careful, it only allows you to tell the kernel to be less careful. With overcommit_memory set to 1 every malloc() will succeed. When set to 0 the old heuristics are used, the kernel still overcommits.

Going in the right direction: Since 2.5.30 the values are: 0 (default): as before: guess about how much overcommitment is reasonable, 1: never refuse any malloc(), 2: be precise about the overcommit - never commit a virtual address space larger than swap space plus a fraction overcommit_ratio of the physical memory. Here /proc/sys/vm/overcommit_ratio (by default 50) is another user-settable parameter. It is possible to set overcommit_ratio to values larger than 100. (See also Documentation/vm/overcommit-accounting.)

After

echo 2 > /proc/sys/vm/overcommit_memory

all three demo programs were able to obtain 498 MiB on this 2.6.8.1 machine (256 MiB, 539 MiB swap, lots of other active processes), very satisfactory. However, without swap, no more processes could be started - already more than half of the memory was committed. After

echo 80 > /proc/sys/vm/overcommit_ratio

all three demo programs were able to obtain 34 MiB. (Exercise: solve two equations with two unknowns and conclude that main memory was 250 MiB, and the other processes took 166 MiB.) One can view the currently committed amount of memory in /proc/meminfo, in the field Committed_AS."
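
Working that exercise: with overcommit_memory=2 the commit limit is swap + (overcommit_ratio/100) x RAM, and just the RAM term once swap is gone, so

498 + other = 539 + 0.50 x RAM (ratio 50, with swap)
34 + other = 0.80 x RAM (ratio 80, no swap)

Subtracting gives 0.30 x RAM = 75, i.e. RAM = 250 MiB and the other processes held 166 MiB, as stated. If I wanted mode 2 on my 128 GB box permanently, I gather it would be something like this (the ratio of 90 is just my guess at a sensible value):

echo "vm.overcommit_memory = 2" >> /etc/sysctl.conf
echo "vm.overcommit_ratio = 90" >> /etc/sysctl.conf
sysctl -p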

Patti

Talk about over my head… ;-) So, are you satisfied with your findings? I still think cgroups is what you want, and I would encourage you to google it to find a friendlier resource than the one I linked to. However, I think that this depends on kernel 2.6.x?