RAM full and machine crashing - No idea why....

Hi again!

I have two Leap 15.1 KDE installs (one Dell Precision M6400, one Libretrend box); since last summer/fall both have shown strange behaviour. They normally stream video with VLC (via HTTP, from Raspberry Pi machines) OR run a little VNC to other openSUSE machines or Raspberry Pis. Sometimes a little Firefox/Pale Moon.

After some days, the RAM is apparently full (or at least the memory bar in htop fills up to all available RAM), and then the machine crashes.

Here is an htop screenshot from one of the machines shortly before a crash:

https://paste.opensuse.org/33129629

And here is the journal for a crash event; nothing specific, only spam IMHO:

https://paste.opensuse.org/32957974

I use the taskbar widget that monitors CPU, RAM and network (three vertical bars), and when I see the RAM filling up, I reboot to avoid the crash.

The machines are both up to date. The Libretrend box has the multimedia codecs in addition to the standard repos; the Dell notebook is a plain vanilla install.

I bought an additional 4 GB of RAM for the Dell machine to stop the crashes from happening on a daily basis :-/

Any ideas what might be going on here?

Possibly /tmp is full.

Define “crashes”.

Any ideas what might be going on here?

RAM is full, you have no swap, so what exactly do you expect? Judging by your screenshot, Firefox alone consumes all available memory.

/tmp full?

df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3.9G     0  3.9G   0% /dev
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           3.9G  1.4M  3.9G   1% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda2        67G   11G   53G  17% /
/dev/sda3       153G   20G  126G  14% /home
tmpfs           787M     0  787M   0% /run/user/466
tmpfs           787M   16K  787M   1% /run/user/1000

and

df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3.9G     0  3.9G   0% /dev
tmpfs           3.9G   24M  3.9G   1% /dev/shm
tmpfs           3.9G  1.6M  3.9G   1% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/md126p8     20G   10G  8.7G  54% /
/dev/md126p7     75G   45G   29G  62% /home
tmpfs           796M   16K  796M   1% /run/user/1000

I don’t have swap on many other installs, without problems, and they do much more Firefoxing etc. So I don’t really see the point.

Is there a way to add some swap to test? On each HDD/SSD I have one EXT4 partition for / and one for /home… Shrink /home and create a swap partition in YaST? But how do I tell the system to use it?

You may use a swapfile: https://wiki.archlinux.org/index.php/Swap#Swap_file
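Following that wiki page, here is a minimal sketch. The demo formats a small throwaway file so it can run unprivileged; the commented part shows the real steps, where the `/swapfile` path and 4G size are illustrative choices, not requirements:

```shell
# Demo: create and format a small swap file. Swap files must not be sparse,
# so use dd rather than a sparse fallocate. The size here is tiny on purpose.
dd if=/dev/zero of=./demo.swap bs=1M count=16 status=none
chmod 600 ./demo.swap      # swap must not be world-readable
mkswap ./demo.swap         # write the swap signature

# For real use (as root), e.g. a 4G file at /swapfile:
#   dd if=/dev/zero of=/swapfile bs=1M count=4096
#   chmod 600 /swapfile && mkswap /swapfile
#   swapon /swapfile                                        # enable now
#   echo '/swapfile none swap defaults 0 0' >> /etc/fstab   # enable at boot
```

Afterwards `swapon --show` (or `free -h`) should list the new swap; delete `./demo.swap` when done with the demo.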

This is the Libretrend box after rebooting 12 h ago and sitting with 3 VLC streaming windows open over night:

https://paste.opensuse.org/57025932

Is there some kind of memory leak in VLC? Or is it VLC’s HTTP buffering for the streams? I have other machines with TW doing just fine with a comparable setup…
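One way to test the VLC-leak hypothesis is to sample a process’s resident set size over time: a steadily growing RSS under a constant workload points at a leak. A small sketch (it falls back to the current shell when no vlc process exists, so the command still demonstrates itself):

```shell
# Print the resident set size of the oldest vlc process; if none is
# running, fall back to this shell's PID for demonstration purposes.
pid=$(pgrep -o vlc || echo $$)
grep VmRSS /proc/"$pid"/status
```

Running this every hour or so and comparing the values shows whether VLC’s footprint is actually growing or just the kernel’s caches.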

PS: on the same Libretrend box I have:

free -m
              total        used        free      shared  buff/cache   available
Mem:           7868        1280        2081        2612        4506        3702
Swap:             0           0           0

OK, consensus is, I need a swap, so I created swap on both machines:

free -h
              total        used        free      shared  buff/cache   available
Mem:          7.7Gi       698Mi       6.1Gi       331Mi       919Mi       6.4Gi
Swap:         4.0Gi          0B       4.0Gi


free -h
              total        used        free      shared  buff/cache   available
Mem:          7.8Gi       797Mi       6.0Gi       287Mi       1.0Gi       6.5Gi
Swap:         2.0Gi          0B       2.0Gi


But when I bought another 4 GB of RAM for the Dell, it only took a little longer until the system froze, so I’m not that optimistic…

If I’m not mistaken, I see 5 instances of Xvnc, each using 18% of RAM, for a total of 90.1% of RAM used.
Then there are some 25 instances of Firefox, but that seems to be just the tipping element…

I am not sure where you have seen that. You need to find out what is causing the memory shortage. If you add swap and something continues to allocate (and actually use) more and more memory, you will simply end up out of memory again (though before that you will probably observe significant delays due to swapping).
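One way to confirm (or rule out) that the kernel’s OOM killer is involved is to search the kernel log; the exact wording varies by kernel version, so the pattern below is a rough guess:

```shell
# Search the kernel log for OOM-killer activity. journalctl covers the
# journal; dmesg is the fallback when the journal is unavailable.
{ journalctl -k --no-pager 2>/dev/null || dmesg 2>/dev/null || true; } \
    | grep -iE 'out of memory|oom.kill' \
    || echo "no OOM messages found"
```

If the machine freezes hard instead of the OOM killer firing, nothing will show up here, which is itself useful information.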

But when I bought another 4GB RAM for the Dell it only took a little longer until the system froze

Exactly.

Hi
I wonder if running slabtop may help here…

OK, but what will slabtop do?
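For what it’s worth: slabtop shows how much memory the kernel’s own slab caches use, which top/htop do not attribute to any process, so a kernel-side leak would be invisible in them. Reading /proc/slabinfo needs root on most systems; an unprivileged proxy is the slab accounting in /proc/meminfo:

```shell
# Total kernel slab memory (reclaimable + unreclaimable), no root needed:
grep -E '^(Slab|SReclaimable|SUnreclaim):' /proc/meminfo

# With root, a one-shot listing sorted by cache size:
#   slabtop -o -s c | head -n 15
```

If the Slab total stays in the tens-of-megabytes range while RAM fills up, the leak is in userspace, not the kernel.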

The Libretrend runs 3 VLC players with HTTP streams and starts to fill up RAM:

In the morning:

free -h
              total        used        free      shared  buff/cache   available
Mem:          7.7Gi       698Mi       6.1Gi       331Mi       919Mi       6.4Gi
Swap:         4.0Gi          0B       4.0Gi

Now:

free -h
              total        used        free      shared  buff/cache   available
Mem:          7.7Gi       1.1Gi       5.3Gi       597Mi       1.2Gi       5.7Gi
Swap:         4.0Gi          0B       4.0Gi


The Dell is running three instances of the TigerVNC viewer and starts to fill RAM:

In the morning:

free -h
              total        used        free      shared  buff/cache   available
Mem:          7.8Gi       797Mi       6.0Gi       287Mi       1.0Gi       6.5Gi
Swap:         2.0Gi          0B       2.0Gi

Now:

free -h
              total        used        free      shared  buff/cache   available
Mem:          7.8Gi       917Mi       5.3Gi       819Mi       1.5Gi       5.8Gi
Swap:         2.0Gi          0B       2.0Gi

On the Dell we have:

 Active / Total Objects (% used)    : 297296 / 309360 (96.1%)
 Active / Total Slabs (% used)      : 19497 / 19498 (100.0%)
 Active / Total Caches (% used)     : 70 / 112 (62.5%)
 Active / Total Size (% used)       : 77908.60K / 79689.08K (97.8%)
 Minimum / Average / Maximum Object : 0.02K / 0.26K / 4096.00K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
 10452  10439  99%    1.05K   3484        3     13936K ext4_inode_cache
 17772  17746  99%    0.58K   2962        6     11848K inode_cache
 50442  50160  99%    0.19K   2402       21      9608K dentry
 11648  11633  99%    0.56K   1664        7      6656K radix_tree_node
 31560  30594  96%    0.20K   1578       20      6312K vm_area_struct
 33306  33274  99%    0.10K    854       39      3416K buffer_head
 25888  25735  99%    0.12K    809       32      3236K kernfs_node_cache
 11968  11263  94%    0.25K    748       16      2992K kmalloc-256
   372    372 100%    7.25K    372        1      2976K task_struct
  4356   4270  98%    0.65K    726        6      2904K proc_inode_cache
   244    243  99%    8.00K    244        1      1952K kmalloc-8192
    30     30 100%   64.00K     30        1      1920K kmalloc-65536
  1852   1708  92%    1.00K    463        4      1852K kmalloc-1024
   916    911  99%    2.00K    458        2      1832K kmalloc-2048
   384    384 100%    4.00K    384        1      1536K kmalloc-4096
  1980   1946  98%    0.68K    180       11      1440K shmem_inode_cache
 20672  17224  83%    0.06K    323       64      1292K anon_vma_chain
     8      8 100%  128.00K      8        1      1024K kmalloc-131072
 29016  28835  99%    0.03K    234      124       936K kmalloc-32
 11648   9384  80%    0.07K    208       56       832K anon_vma
  4448   4409  99%    0.12K    139       32       556K kmalloc-96
   207    191  92%    2.06K     69        3       552K sighand_cache
   992    905  91%    0.50K    124        8       496K kmalloc-512
 12078  11962  99%    0.04K    122       99       488K ext4_extent_status
  7616   7353  96%    0.06K    119       64       476K kmalloc-64
    29     29 100%   16.00K     29        1       464K kmalloc-16384
   678    618  91%    0.62K    113        6       452K sock_inode_cache
  2247   2154  95%    0.19K    107       21       428K kmalloc-192
  2848   2232  78%    0.12K     89       32       356K kmalloc-128
    11     11 100%   32.00K     11        1       352K kmalloc-32768
     1      1 100%  256.00K      1        1       256K kmalloc-262144
   248    192  77%    1.00K     62        4       248K signal_cache
  4399   4313  98%    0.05K     53       83       212K ftrace_event_field
   432    410  94%    0.44K     48        9       192K mnt_cache
   903    384  42%    0.19K     43       21       172K cred_jar
  2016   1913  94%    0.07K     36       56       144K Acpi-Operand
   126     95  75%    1.12K     18        7       144K mm_struct
  1380   1353  98%    0.09K     30       46       120K trace_event_file
   154     99  64%    0.69K     14       11       112K files_cache
   784    766  97%    0.14K     28       28       112K ext4_groupinfo_4k
   448    382  85%    0.12K     14       32        56K pid
    42     17  40%    1.06K      6        7        48K dmaengine-unmap-128
   187    159  85%    0.23K     11       17        44K cfq_queue
   408    269  65%    0.08K      8       51        32K Acpi-State
   272    166  61%    0.12K      8       34        32K flow_cache
    28     14  50%    0.81K      7        4        28K bdev_cache
    28     24  85%    0.94K      7        4        28K RAW
     9      5  55%    2.48K      3        3        24K request_queue
    12     10  83%    2.00K      6        2        24K TCP
   195     93  47%    0.10K      5       39        20K Acpi-ParseExt
    64     27  42%    0.25K      4       16        16K pool_workqueue
   252     99  39%    0.06K      4       63        16K fs_cache
    48     16  33%    0.32K      4       12        16K taskstats
     2      2 100%    6.81K      2        1        16K net_namespace
    57     15  26%    0.20K      3       19        12K file_lock_cache
   142     52  36%    0.05K      2       71         8K nsproxy
    50     20  40%    0.16K      2       25         8K sigqueue
   326    228  69%    0.02K      2      163         8K fsnotify_mark_connector
     3      1  33%    2.06K      1        3         8K dmaengine-unmap-256
   198    105  53%    0.04K      2       99         8K khugepaged_mm_slot
    72     16  22%    0.11K      2       36         8K jbd2_journal_head
    30      3  10%    0.13K      1       30         4K numa_policy
     5      1  20%    0.75K      1        5         4K dax_cache
    16      4  25%    0.25K      1       16         4K dquot
     6      2  33%    0.59K      1        6         4K hugetlbfs_inode_cache



On the Libre:

 Active / Total Objects (% used)    : 249932 / 256345 (97.5%)
 Active / Total Slabs (% used)      : 16818 / 16855 (99.8%)
 Active / Total Caches (% used)     : 73 / 116 (62.9%)
 Active / Total Size (% used)       : 72320.87K / 73386.62K (98.5%)
 Minimum / Average / Maximum Object : 0.02K / 0.29K / 4096.00K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
  9240   9237  99%    1.05K   3080        3     12320K ext4_inode_cache
 16032  16012  99%    0.58K   2672        6     10688K inode_cache
 38052  38017  99%    0.19K   1812       21      7248K dentry
  9471   9471 100%    0.56K   1353        7      5412K radix_tree_node
 26580  26580 100%    0.20K   1329       20      5316K vm_area_struct
   281    281 100%   16.00K    281        1      4496K kmalloc-16384
 23648  22692  95%    0.12K    739       32      2956K kernfs_node_cache
   343    343 100%    7.62K    343        1      2744K task_struct
 23556  23554  99%    0.10K    604       39      2416K buffer_head
  9632   9364  97%    0.25K    602       16      2408K kmalloc-256
   239    239 100%    8.00K    239        1      1912K kmalloc-8192
   882    877  99%    2.00K    441        2      1764K kmalloc-2048
   422    385  91%    4.00K    422        1      1688K kmalloc-4096
  2478   2478 100%    0.65K    413        6      1652K proc_inode_cache
  1612   1556  96%    1.00K    403        4      1612K kmalloc-1024
    22     22 100%   64.00K     22        1      1408K kmalloc-65536
  1639   1629  99%    0.68K    149       11      1192K shmem_inode_cache
     8      8 100%  128.00K      8        1      1024K kmalloc-131072
     2      2 100%  512.00K      2        1      1024K kmalloc-524288
 16128  14947  92%    0.06K    252       64      1008K anon_vma_chain
 25544  25490  99%    0.03K    206      124       824K kmalloc-32
  1304   1169  89%    0.50K    163        8       652K kmalloc-512
    20     20 100%   32.00K     20        1       640K kmalloc-32768
  8904   8303  93%    0.07K    159       56       636K anon_vma
   171    168  98%    2.06K     57        3       456K sighand_cache
  6912   6753  97%    0.06K    108       64       432K kmalloc-64
 10395  10347  99%    0.04K    105       99       420K ext4_extent_status
  2142   2142 100%    0.19K    102       21       408K kmalloc-192
   600    600 100%    0.62K    100        6       400K sock_inode_cache
  2816   2696  95%    0.12K     88       32       352K kmalloc-96
  2272   2041  89%    0.12K     71       32       284K kmalloc-128
     1      1 100%  256.00K      1        1       256K kmalloc-262144
  1792   1757  98%    0.14K     64       28       256K ext4_groupinfo_4k
  4565   4362  95%    0.05K     55       83       220K ftrace_event_field
   208    173  83%    1.00K     52        4       208K signal_cache
   987    484  49%    0.19K     47       21       188K cred_jar
   205    171  83%    0.75K     41        5       164K drm_i915_gem_object
  1960   1833  93%    0.07K     35       56       140K Acpi-Operand
   315    312  99%    0.44K     35        9       140K mnt_cache
  1564   1442  92%    0.09K     34       46       136K trace_event_file
   210    142  67%    0.56K     30        7       120K task_group
   105     98  93%    1.12K     15        7       120K mm_struct
   121     94  77%    0.69K     11       11        88K files_cache
   416    365  87%    0.12K     13       32        52K pid
   357    295  82%    0.08K      7       51        28K Acpi-State
    21     14  66%    1.06K      3        7        24K dmaengine-unmap-128
    24     24 100%    0.94K      6        4        24K RAW
   195     93  47%    0.10K      5       39        20K Acpi-ParseExt
    10     10 100%    2.00K      5        2        20K TCP
    35     35 100%    0.56K      5        7        20K i915_request
    64     24  37%    0.25K      4       16        16K pool_workqueue
   252     94  37%    0.06K      4       63        16K fs_cache
     2      2 100%    6.81K      2        1        16K net_namespace


You should know that, although I haven’t heard any recent reports,
Firefox has had a long history of memory leaks causing things like what you are describing (memory completely allocated).
Most of the time the problem wasn’t the Firefox app itself but poorly written plugins, which spread because of the permissive “open” policies that allowed their wide installation and use.
In some ways you can consider it a potential cost of also wanting an extremely large variety of plugins available.
Mozilla has gradually tightened its policies over time, but things happen.

Bottom line…
You may want to shut down your Firefox browser often.
You may want to be careful about installed plugins, particularly those that support deprecated technologies, and about legacy browsers like Pale Moon. When a base technology is no longer supported, the various related technologies may also receive less attention, and the whole ecosystem may then start to fail.

HTH,
TSU

I don’t leave Firefox open on these machines. I use it occasionally, close it, and run BleachBit afterwards.

There are other machines (mostly TW) with Firefox open for weeks, but there are no problems at all.

Next suggestion?

Another consideration is whether your machine is very active.
Nowadays,
all OSes prefer to use all (or nearly all) physical RAM before de-allocating anything. Data in RAM is only marked for de-allocation and returned to a pool for re-use when there is resource pressure.

Typically this works fine for most workloads… The idea is that as long as the data is already in RAM, it’s immediately available if re-accessed; otherwise the computation that produced it would have to happen again.

But,
of course the above assumes a <normal> workload.
If the workload suddenly changes, then all that cached data in RAM is worthless.
Although the OS will “learn” that your new workload has nothing to do with what was done before, that can take time.
In this case you can manually flush your memory buffers; I provide that command at the end of my wiki article on the free tool (and if you’re using free, it might be worth a look to be sure you’re using the tool properly):

https://en.opensuse.org/User:Tsu2/free_tool

If you really think memory exhaustion is crashing your machine, the command I give there should address that as a short-term fix. Of course, your memory may fill up again, and then it will be your decision whether to run the command again or to let the OS resolve the issue on its own.
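Presumably the flush referred to here is the kernel’s drop_caches knob (an assumption; see the linked wiki page for the exact command). A sketch that degrades gracefully when not run as root:

```shell
# Flush reclaimable caches: page cache plus dentries/inodes (value 3).
# Note this cannot reclaim memory that a leaking process actually holds.
sync
if [ -w /proc/sys/vm/drop_caches ]; then
    echo 3 > /proc/sys/vm/drop_caches
else
    echo "need root to write /proc/sys/vm/drop_caches" >&2
fi
grep -E '^(MemFree|Cached):' /proc/meminfo    # observe the effect
```

If MemFree barely changes after the flush, the memory is held by processes, not caches, and dropping caches will not help.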

The other thing to look at is whether your crashes might be related to a particular app…
You mention VLC, which has given me problems intermittently in the past…
From time to time I’ve had to tinker with how much data is buffered; it’s a Goldilocks setting… it can’t be too large or too small, there is an amount that’s just right.
Less frequently, I’ve switched codecs from GStreamer to FFmpeg and vice versa.
VLC has been a temperamental app at times, but not continuously for me.

No, they’re not “very active”. Both just sit there doing the VNC or VLC jobs I described. Nothing more.

Please, something that matches the situation! If you need more details, OK, but I think I provided the relevant info…

I’m not willing to run commands every other day; the machines should take care of things on their own (otherwise I could set up a cron job to reboot every night, but that would be a crude hack, and I want to know what is going on here).

Sorry for my comment #9; being used to “top”, I didn’t immediately realize that the linked image showed “htop” with its (to me) odd way of reporting memory.
Nevertheless, the screenshot clearly shows that RAM is exhausted and the system is about to “crash”, but it doesn’t show why.
Finger-pointing at Firefox doesn’t help (some 1% of reserved memory); Xvnc is much larger (some 1.4 GB reserved + 1.3 GB shared), but apparently still manageable.
The systems (in the last few posts) are currently far from memory starvation, so the simple fact that RAM usage increases with uptime doesn’t tell us much.
Maybe running “top” sorted by “%MEM” (try hitting > or <) can quickly show which processes use the most RAM and whether any of them dramatically increase their use with uptime.
Of course more experienced sysadmins may suggest other tools…
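In the same spirit, a non-interactive snapshot can be logged (e.g. from cron) and diffed across hours to spot a steadily growing process; the column choice below is just one reasonable option:

```shell
# Top 10 processes by resident memory, with a timestamp so successive
# snapshots appended to a log file can be compared over time.
date
ps axo pid,rss,pmem,comm --sort=-rss | head -n 11
```

Appending this output to a file every hour (`crontab -e`, then something like `0 * * * * /path/to/snapshot.sh >> ~/mem.log`) builds a history that makes the culprit obvious.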

Hi Italy!

At the moment neither system is running short on RAM, so I will have to wait and see. Maybe the swap really solved the problem, without getting used at all? Heisenbug? I will report back…

To a point, simply setting up swap improves memory allocation, even if it is never actually used.
But if there is a memory leak somewhere, it will show up again; having swap might allow you to catch the culprit and kill it before the system crashes…
Good Luck!