Potential memory leak in Slab?

Hi All,

I’ve had a server that’s been running Tumbleweed for a year or two now. I update it every now and then to the latest Tumbleweed. I’ve just recently put Apache, mod_perl, and MySQL on the server to run a particular web application. Ever since then the memory on this server (which I’ve increased from 4 GB to 8 GB) continues to grow until the whole box craps itself due to running out of memory. It normally only takes a week. The uncontrolled growth happens within slab memory.

The btrfs caches grow significantly, and running

sync; echo 3 > /proc/sys/vm/drop_caches

doesn’t do anything at all. kmalloc-4k is also continually growing. See below:


  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
1188396 1187627  99%    1.11K 396132        3   1584528K btrfs_inode
1154738 1154692  99%    0.30K  88826       13    355304K btrfs_delayed_node
1170036 1150806  98%    0.14K  41787       28    167148K btrfs_extent_map
1413951 1063610  75%    0.19K  67331       21    269324K dentry
361920 361837  99%    0.12K  11310       32     45240K kmalloc-128
349248 344170  98%    0.06K   5457       64     21828K kmalloc-64
327872 327859  99%    0.25K  20492       16     81968K kmem_cache
237708 234122  98%    0.03K   1917      124      7668K kmalloc-32
197218 197185  99%    4.00K 197218        1    788872K kmalloc-4k
100124 100077  99%    2.00K  50062        2    200248K kmalloc-2k
 88348  88161  99%    1.00K  22087        4     88348K kmalloc-1k
 63712  63550  99%    0.12K   1991       32      7964K kernfs_node_cache
 56640  56614  99%    0.25K   3540       16     14160K kmalloc-256
 66948  50990  76%    0.56K   9564        7     38256K radix_tree_node
 49824  47266  94%    0.12K   1557       32      6228K kmalloc-96
 29160  26347  90%    0.20K   1458       20      5832K vm_area_struct
 23160  22881  98%    0.50K   2895        8     11580K kmalloc-512
 20032  15633  78%    0.06K    313       64      1252K anon_vma_chain
 12712  12141  95%    0.07K    227       56       908K Acpi-Operand
 12558  11778  93%    0.57K   1794        7      7176K inode_cache
 12250   8992  73%    0.08K    245       50       980K anon_vma
  7425   7173  96%    0.04K     75       99       300K Acpi-Namespace
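
To make listings like the one above easier to compare from day to day, here is a minimal Python sketch that parses slabtop-style rows and ranks the caches by size. The function names (parse_slabtop, top_caches) are made up for illustration, and it assumes each data row has the eight whitespace-separated fields shown above, with the cache size given in K.

```python
def parse_slabtop(text):
    """Parse slabtop-style rows into a list of dicts (header lines are skipped)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: a data row has exactly 8 fields and starts with an integer.
        if len(fields) != 8 or not fields[0].isdigit():
            continue
        rows.append({
            "name": fields[7],
            "objs": int(fields[0]),
            "active": int(fields[1]),
            "cache_kib": int(fields[6].rstrip("K")),
        })
    return rows


def top_caches(rows, n=5):
    """Return the n largest caches by total cache size."""
    return sorted(rows, key=lambda r: r["cache_kib"], reverse=True)[:n]


# Three rows copied from the listing above.
SAMPLE = """\
  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
1188396 1187627  99%    1.11K 396132        3   1584528K btrfs_inode
197218 197185  99%    4.00K 197218        1    788872K kmalloc-4k
1154738 1154692  99%    0.30K  88826       13    355304K btrfs_delayed_node
"""

if __name__ == "__main__":
    for row in top_caches(parse_slabtop(SAMPLE)):
        print(f'{row["name"]:<22} {row["cache_kib"] / 1024:8.1f} MiB')
```

Saving one capture per day and running each through this would show which caches account for most of the slab total.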

What can I do to further troubleshoot this? I’m at a loss.

Btrfs is not always the best for everybody and not for every application.

Are your MySQL database files stored in btrfs subvolumes? According to https://en.opensuse.org/SDB:BTRFS, locations like /var/lib/mariadb or /var/lib/mysql should be excluded from snapshots. Snapshotting them can cause frequent filesystem writes and excessive wear on SSD storage, and possibly also the high btrfs-related counts you pasted and the effects on memory you observe.
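
If it helps to check this from a script, the following is a rough Python sketch that works out which filesystem type a path lives on by matching it against the longest mount point in /proc/mounts-style text. The function name fs_type_for and the sample mount table are hypothetical. It does not resolve symlinks or decode escaped characters in mount points, and it only tells you the filesystem type, not whether a subvolume is included in snapshots.

```python
import os


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`, or None."""
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Longest mount point wins; on ties the later entry shadows the earlier.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


# Made-up mount table for illustration only.
SAMPLE_MOUNTS = """\
/dev/sda2 / btrfs rw,relatime 0 0
/dev/sda2 /var btrfs rw,relatime,subvol=/@/var 0 0
/dev/sdb1 /srv/db ext4 rw,relatime 0 0
"""

if __name__ == "__main__":
    for p in ("/var/lib/mysql", "/srv/db/mysql"):
        print(p, "->", fs_type_for(p, SAMPLE_MOUNTS))
```

On a real system you would pass in the contents of /proc/mounts instead of the sample text.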

If you happen to have a partition large enough for your database, try formatting it with ext4 and running your active database from there.

To understand slab memory and its relationship to kmalloc, I recommend reading

https://en.wikipedia.org/wiki/Slab_allocation
http://www.secretmango.com/jimb/Whitepapers/slabs/slab.html

You’ll find that these two types of memory management can be an indication of your system’s activity, but unless their values approach the limits of your physical memory, you shouldn’t be concerned with their growing sizes…
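
To put a number on "approaching the limits of your physical memory", here is a small Python sketch that reads /proc/meminfo-style text and reports slab usage as a share of MemTotal. The function names are my own, and the sample figures are invented for illustration, not taken from the system in this thread. The Slab, SReclaimable and SUnreclaim fields are what /proc/meminfo reports on Linux; a large SUnreclaim share would be consistent with drop_caches having little effect.

```python
def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict of integers."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def slab_share(info):
    """Return slab and unreclaimable slab as percentages of MemTotal."""
    total = info["MemTotal"]
    return {
        "slab_pct": 100.0 * info["Slab"] / total,
        "unreclaimable_pct": 100.0 * info["SUnreclaim"] / total,
    }


# Invented numbers for illustration only.
SAMPLE_MEMINFO = """\
MemTotal:        8000000 kB
Slab:            3000000 kB
SReclaimable:    2000000 kB
SUnreclaim:      1000000 kB
"""

if __name__ == "__main__":
    print(slab_share(parse_meminfo(SAMPLE_MEMINFO)))
```

On a live machine, read the text with open("/proc/meminfo") and log the result periodically to see the trend.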

Summarizing,
You can think of both as virtual “containers” of physical memory blocks of various sizes; the slabs are larger block sequences and the kmallocs are smaller. Modern systems recognize that it’s more costly to allocate/write/erase/delete data than to simply leave the data in memory, and a significant cost of writing is allocating properly sized contiguous memory blocks. Slab and kmalloc “containers” are pre-allocated mappings of variously sized contiguous blocks that minimize that cost. When a system wants to write something to memory, it will first look for a properly sized mapping to re-use, or erase/delete one and then write.

Bottom line,
You should look in other directions to improve performance. IMO website code is often abominable. In particular, it’s useful to understand how a website performs on a LAMP setup, which will require and set up sufficient persistent database connections, compared to, for example, nginx, which is of a type that minimizes both database and server/client connection persistence, on the idea that it can be more efficient to create and destroy connections as quickly as possible instead of devoting resources to connections which may not be fully used.

In other words, it can be extremely important to do a website performance analysis, and this can require intimate knowledge not only of the code but then how best to deploy code the way it’s written.

HTH and open to any corrections if people feel I’ve described any of this incorrectly…
TSU

Just an additional thought…
A noticeable increase in 4k memory blocks (your kmalloc-4k) might indicate more webpages and/or components being created instead of being re-used.
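
To see how fast a cache like kmalloc-4k is actually growing, one option is to diff two snapshots of /proc/slabinfo taken some time apart. A minimal Python sketch, assuming the usual slabinfo 2.1 column order (name, active_objs, num_objs, objsize, ...); the function names and the sample snapshots are made up for illustration, and reading /proc/slabinfo normally requires root.

```python
def parse_slabinfo(text):
    """Map cache name -> approximate bytes in use (num_objs * objsize)."""
    usage = {}
    for line in text.splitlines():
        # Skip the version line and the column header line.
        if line.startswith("slabinfo") or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        name, num_objs, objsize = fields[0], int(fields[2]), int(fields[3])
        usage[name] = num_objs * objsize
    return usage


def growth(before, after):
    """Return (name, byte delta) pairs, fastest-growing cache first."""
    deltas = {name: after[name] - before.get(name, 0) for name in after}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)


# Invented snapshots for illustration only.
BEFORE = """\
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>
kmalloc-4k          100    100   4096    8    8 : tunables 0 0 0 : slabdata 13 13 0
dentry             1000   1000    192   21    1 : tunables 0 0 0 : slabdata 48 48 0
"""
AFTER = """\
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab>
kmalloc-4k          300    300   4096    8    8 : tunables 0 0 0 : slabdata 38 38 0
dentry             1100   1100    192   21    1 : tunables 0 0 0 : slabdata 53 53 0
"""

if __name__ == "__main__":
    for name, delta in growth(parse_slabinfo(BEFORE), parse_slabinfo(AFTER)):
        print(f"{name:<16} {delta:+d} bytes")
```

Comparing the deltas against what the web application was doing in the same interval may help narrow down what is driving the growth.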

Getting back to whatever your website is serving,
Try to understand which page objects are brand new with every page view, whether pages stay relatively the same and can be served from cache, or whether objects are really changing and, if so, how fast. 4k objects are pretty small, which would be a bit unusual. If, for instance, you’re serving similar pages but with updated data, is that data only a line or two rather than arrays of data?

TSU