OOM killer active again after update...

jdivm04:~ # uname -a
Linux jdivm04 5.3.18-lp152.66-default #1 SMP Tue Mar 2 13:18:19 UTC 2021 (73933a3) x86_64 x86_64 x86_64 GNU/Linux

jdivm04:~ # lsb_release -a
LSB Version:    n/a
Distributor ID: openSUSE
Description:    openSUSE Leap 15.2
Release:        15.2
Codename:       n/a

Hi,

Using Leap 15.2, and from time to time (usually every 3 months) I run an update on my Linux (zypper ref + zypper up).
This box runs two databases (IBM Informix + MySQL), and in the past I already had a lot of issues with the OOM killer.
So I “disabled” it and had no problems for a couple of years, until now.
The main fix, which I set back then and still have in place, is this sysctl configuration:

jdivm04:/etc/sysctl.d # sysctl -a | grep overcomm
vm.nr_overcommit_hugepages = 0
vm.overcommit_kbytes = 0
vm.overcommit_memory = 2
vm.overcommit_ratio = 95
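
For reference, with vm.overcommit_memory = 2 and vm.overcommit_kbytes = 0, the kernel stops granting new allocations once Committed_AS would exceed CommitLimit, which is roughly swap + (RAM minus huge pages) * overcommit_ratio / 100. Both counters can be read directly:

grep -E 'CommitLimit|Committed_AS' /proc/meminfo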

However, three weeks ago I ran my update and the OOM killer came back into action, killing my databases and other services very frequently.
My memory configuration is all in place and I really don’t understand why it keeps killing them.
The last time it killed mysql, the dmesg message ended with this text:

  +0.000001] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=mysqld,pid=56172,uid=60
  +0.000065] Out of memory: Killed process 56172 (mysqld) total-vm:4673904kB, anon-rss:1218612kB, file-rss:0kB, shmem-rss:8kB
  +0.036022] oom_reaper: reaped process 56172 (mysqld), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

I have already tried alternative ways to deactivate it, like the commands below, with no effect:

echo 0 > /sys/fs/cgroup/memory/memory.use_hierarchy
mkdir /sys/fs/cgroup/memory/0
echo 1 > /sys/fs/cgroup/memory/0/memory.oom_control
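
As far as I understand, that memory.oom_control switch only applies to tasks that are actually placed into that cgroup, so the database PIDs would also need to be added to it, something like (the PID is just a placeholder):

echo <pid-of-mysqld> > /sys/fs/cgroup/memory/0/cgroup.procs

and even then it only suppresses OOM kills triggered by that cgroup’s own limit, not the global OOM seen in the dmesg above.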


As far as I remember, before the update I was running kernel 5.3.18-lp152.60 or .63 (not sure); now it is .66, since I have applied newer updates to see if they would fix this behaviour.

**Any tips on how to deactivate the OOM killer for good?**

@ceinma:

The Oracle article “How to Configure the Linux Out-of-Memory Killer” mentions the following –

The OOM killer can be completely disabled with the following command. This is not recommended for production environments, because if an out-of-memory condition does present itself, there could be unexpected behavior depending on the available system resources and configuration. This unexpected behavior could be anything from a kernel panic to a hang depending on the resources available to the kernel at the time of the OOM condition.


sysctl vm.overcommit_memory=2
echo "vm.overcommit_memory=2" >> /etc/sysctl.conf


Yes, I also find that settings are occasionally overwritten during an update, but what is the problem with increasing the available swap space so that the OOM killer is never invoked?

Hi @dcurtisfra ,
Yes, if you read my message you will notice I already have that configuration set, and it is not working anymore.

I also tried that: I doubled my swap so that the CommitLimit vs Committed_AS calculation would leave considerable free space, and even that didn’t work.
I started monitoring the Commit* values in /proc/meminfo to be sure about my memory usage, and it is nowhere near consuming it all; check the graph below for the last 7 days. The dips in Committed_AS are when the OOM killer killed some of my databases…

(I tried to embed the image, but this editor gives me an error.)
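
For reference, the graph is just periodic samples of those counters; something along these lines is enough to collect them (the log file name is only an example):

while true; do date; grep -E '^Commit' /proc/meminfo; sleep 60; done >> /root/commit.log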

It just killed my IBM Informix database again… errrr, I hate this OOM killer.

It seems that the newest kernel doesn’t allow the OOM killer to be disabled …

In this article I wrote long ago, I described how to shift resources from the system to networking.
You can use the methods I describe to read the various parameters, and possibly do the reverse to shift more resources to your system (I don’t recommend doing this permanently; maybe only for the specific task).

https://sites.google.com/site/4techsecrets/optimize-and-fix-your-network-connection/linux

Also,
in the article I described how to use the free tool to analyze memory.
I also included the command to “clear memory buffers and cache”, which I’m guessing should be very useful in your case. The command should be harmless, although I would of course caution not to run it in the middle of some long calculation. Wait until your system is quiet; the main effect is that any data that was cached in memory will simply have to be retrieved again or re-calculated.

https://en.opensuse.org/User:Tsu2/free_tool
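
If I remember my own write-up correctly, the “clear buffers and cache” command is along these lines (it only discards clean cached pages, so nothing is lost, it just has to be read or recomputed again):

sync
echo 3 > /proc/sys/vm/drop_caches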

Also, beware of configuring too much swap: swap requires some RAM to track the locations on disk, so you can actually make your memory starvation even worse by creating more swap.

Needless to say,
If a system update is causing memory problems, it’s likely a warning that your system has been provisioned to the max, and could use a RAM upgrade.

TSU

@ceinma:

Yes, exactly –

  • The workarounds I pointed to are really only a “piece of plaster on the wound” – to really address the issue, the affected machines urgently need more memory (if physically possible) … If the machines are all already provisioned with the maximum amount of memory physically possible, then you’ll have to look at purchasing more servers to spread the application load …

I didn’t really want to write this because it’s the “normal” method of dealing with such issues …

So, for me the only easy workaround is to increase my swap again.
Before, I had 4 GB of swap and changed it to 8 GB, without success in solving this.
I will try again.
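
Something along these lines should do it (the file name is only an example; on Btrfs the file would first need chattr +C on an empty file, since copy-on-write swap files are not supported):

dd if=/dev/zero of=/swapfile2 bs=1M count=4096
chmod 600 /swapfile2
mkswap /swapfile2
swapon /swapfile2
echo '/swapfile2 swap swap defaults 0 0' >> /etc/fstab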

But I really want to find kernel documentation saying the OOM killer is now active 100% of the time.
I have been looking and found nothing.

I will add another 4 GB to my swap and see what happens; however, I think it will change nothing…

I’m avoiding growing the RAM, since this is a VM sharing a host with other VMs and I don’t want to take resources unnecessarily.

I didn’t mention it before, but I have also had these settings for a long time:

vm.vfs_cache_pressure = 10
vm.dirty_background_ratio = 1
vm.dirty_expire_centisecs = 60000
vm.dirty_ratio = 1
vm.swappiness = 0

Now I have changed the swappiness from 0 to 5; let’s see if it changes the behaviour of this **** OOM killer.

If your VMs don’t have to be accessible 24/7/365, maybe find a time when you can power off some of them?
Also, inspect your settings for minimum reserved RAM and ballooning, so you understand when that might happen.
Consider that unless you’re running some pretty fast RAID, pushing all that data into swap could make your machine run dog-slow, almost as if the system were frozen.

TSU

If the other VMs aren’t running databases, you may well have to accept increasing the RAM usage of the VM running the database application …

  • And, possibly, increase the physical RAM on the machine …

vm.swappiness = 0 means that swapping is avoided as much as possible (effectively disabled).

Changing the value directly influences the performance of the Linux system. These values are defined:

  • 0: swap is disabled
  • 1: minimum amount of swapping without disabling it entirely
  • 10: recommended value to improve performance when sufficient memory exists in a system
  • 100: aggressive swapping

The value of 60 is a compromise that works well for modern desktop systems. A smaller value is a recommended option for a server system, instead. As the Red Hat Performance Tuning manual points out [8], a smaller swappiness value is recommended for database workloads. For example, for Oracle databases, Red Hat recommends a swappiness value of 10. In contrast, for MariaDB databases, it is recommended to set swappiness to a value of 1 [9].

Cloudera recommends that you set vm.swappiness to a value between 1 and 10, preferably 1, for minimum swapping on systems where the RHEL kernel is 2.6.32-642.el6 or higher.

RTFM!

To get more memory you may use zram/zswap/zcache.
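
For example, a minimal zram swap device can be set up roughly like this (the size is just an example; give it a higher priority than the disk swap):

modprobe zram
echo 2G > /sys/block/zram0/disksize
mkswap /dev/zram0
swapon -p 100 /dev/zram0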

Even SSD-based swap space is so much slower than physical RAM…
The old-school formula (swap space = 2x RAM) seems deprecated to me. In my opinion, as soon as swap space is being used frequently, more physical memory is mandatory.
There are no unnecessary resources on Unix-based systems: RAM is also used to cache/buffer frequent operations, and if there is no room left for caching, the system gets much slower. And given today’s CPUs… swap space as a cache is only rarely what you want.

Some of our servers have 768 GB of RAM and only 32 GB of swap, just to illustrate the ratios. A 16 GB VM would get 8-16 GB of swap.

Best regards,

Slow swap throughput is primarily due to the SATA or SAS interface.
I’ve read that if your swap is on NVMe M.2 it’s incredibly fast, almost like regular RAM (but don’t rely on that; get real RAM instead, especially because the prices will likely be similar).

And I wouldn’t recommend configuring more than a tiny amount of swap space unless you really are anticipating the possibility of unusually high load… As I’ve said before, a small amount of RAM is lost to supporting swap, and the more swap you use, the more is taken from your regular RAM.

TSU

RAM is much faster, especially in IOPS. SSD flash is much cheaper than RAM.
Optane sits in between the two.

Hi !

I really thank everyone for all the suggestions; however, my focus here is why this behaviour came back. What changed? Is it the kernel version?

I don’t want to change my RAM sizing; this environment ran in this condition for the last two years without any issues.
I really don’t believe that the OS upgrade made the services consume more memory and triggered all this.

Increasing the swap has reduced the kills for now, giving me more time to work, but it still happens; yesterday it triggered again and killed everything, including my sessions (old ssh sessions).

I’m not sure, but I think this is a kernel feature, right?
I have updated the kernel to the latest minor fix (5.3.18-lp152.69-default) and raised the swappiness again (now 20); if it happens again, I will try to downgrade my kernel to version 4.
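
For the record, openSUSE normally keeps a couple of kernel versions installed in parallel (multiversion), so I can check what is still available to boot from the GRUB “Advanced options” entry, and zypper can pull an older build if it is still in the repositories (the version string below is only an example):

rpm -qa 'kernel-default*'
zypper se -s kernel-default
zypper in --oldpackage kernel-default=5.3.18-lp152.63.1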

As far as I know…
the OOM killer could never be disabled, in any kernel release…
I’ve never heard of that before, and in my opinion it wouldn’t make sense.

I would bet that the extra memory consumption comes from somewhere else, and the sizing of your virtual machine simply doesn’t fit anymore.