oom_killer - kernel-desktop-2.6.37.6-0.7.1

Hi,

Not sure where this one belongs; it’s not really an application issue, but it’s also not install/boot/login. Anyway, here goes…

On Sunday, Aug 07, I upgraded to kernel-desktop-2.6.37.6-0.7.1. On Monday, Aug 08, I got my first ever oom_killer, which eventually sent me back to the login prompt. Even after logging in again, KDE was just not right and a reboot was needed.

Now I am not sure whether the oom_killer and the kernel update are linked, but I have not had this issue before, so I am thinking they might be. I have not yet gone back to a previous version, but I have been checking /var/log/messages and am having a hard time fully understanding the oom_killer dump. My system has 6GB RAM and 2GB swap, but for the life of me I can’t get the processes listed in the oom_killer log to add up to 6GB. Also, swap is still not in use when this happens, if I read the logs correctly:

Aug 16 20:21:09 linuxb1 kernel: [13006.654904] 169824 total pagecache pages
Aug 16 20:21:09 linuxb1 kernel: [13006.654906] 0 pages in swap cache
Aug 16 20:21:09 linuxb1 kernel: [13006.654908] Swap cache stats: add 0, delete 0, find 0/0
Aug 16 20:21:09 linuxb1 kernel: [13006.654911] Free swap = 2104476kB
Aug 16 20:21:09 linuxb1 kernel: [13006.654913] Total swap = 2104476kB
Aug 16 20:21:09 linuxb1 kernel: [13006.682135] 1834992 pages RAM
Aug 16 20:21:09 linuxb1 kernel: [13006.682137] 1609218 pages HighMem
Aug 16 20:21:09 linuxb1 kernel: [13006.682138] 306061 pages reserved
Aug 16 20:21:09 linuxb1 kernel: [13006.682139] 307003 pages shared
Aug 16 20:21:09 linuxb1 kernel: [13006.682140] 489941 pages non-shared
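
As a rough aid for reading these summary lines, here is a minimal Python sketch that turns them into kB figures. It assumes 4 kB pages (the usual size on x86, not stated in the log), and the names `DUMP`, `PAGE_KB` and `parse_summary` are made up for this illustration.

```python
import re

# Summary lines copied from the dump above, with the syslog prefix removed.
DUMP = """\
169824 total pagecache pages
0 pages in swap cache
Free swap = 2104476kB
Total swap = 2104476kB
1834992 pages RAM
1609218 pages HighMem
306061 pages reserved
"""

PAGE_KB = 4  # assumption: 4 kB pages; the log itself does not say


def parse_summary(text):
    """Return a dict mapping each counter name to its size in kB."""
    out = {}
    for line in text.splitlines():
        # Swap lines are already reported in kB.
        m = re.match(r"(Free swap|Total swap)\s*=\s*(\d+)kB", line)
        if m:
            out[m.group(1)] = int(m.group(2))
            continue
        # Everything else is a page count followed by a label.
        m = re.match(r"(\d+) (.+)", line)
        if m:
            out[m.group(2)] = int(m.group(1)) * PAGE_KB
    return out


summary = parse_summary(DUMP)
swap_used_kb = summary["Total swap"] - summary["Free swap"]
usable_kb = summary["pages RAM"] - summary["pages reserved"]

print("swap in use:", swap_used_kb, "kB")
print("pages RAM: %.2f GiB" % (summary["pages RAM"] / 1024**2))
print("RAM minus reserved: %.2f GiB" % (usable_kb / 1024**2))
```

With these numbers swap in use comes out as 0 kB, and "pages RAM" minus "pages reserved" is roughly 5.8 GiB, which is at least consistent with a 6GB machine.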

I had to cut last night’s log in two:

oom_killer part 1 - Pastebin.com
oom_killer part 2 - Pastebin.com

Any help, such as how to read the oom_killer dump, is appreciated. I would love to be able to reproduce it, but it seems random. If I can’t solve it, I will switch back a kernel version to see whether that helps.

-Xil

I know it’s a relatively old post, but I finally figured out the trigger of this: it’s Thunderbird running. It can happen right after starting Thunderbird or after it’s been running/sitting idle for hours, but no Thunderbird means no oom_killer, and Thunderbird running will eventually make it happen. Newer versions like 6.0 and 7.0 make it happen more often than 3.x.

Still not sure why, since there is plenty of memory, but at least finding the trigger is a small step forward.

On 2011-11-23 18:06, Xilanaz wrote:
>
> I know it’s a relatively old post, but I finally figured out the trigger
> of this and it’s Thunderbird running, it can happen right after starting
> Thunderbird or after it’s been running/sitting idle for hours, but no
> Thunderbird means no oom_killer and Thunderbird running will eventually
> make it happen. Newer versions like 6.0 and 7.0 make it happen more
> often than 3.x.

Didn’t see it at the time. OOM means out of memory, and in that
situation the kernel kills processes until it gets memory back.

I suppose the dump you posted should give info on the culprit, but I
haven’t seen one of those in years.

Thunderbird is listed there, but it is not using that much memory.


> Aug 16 20:21:09 linuxb1 kernel: [13006.682142] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
....
> Aug 16 20:21:09 linuxb1 kernel: [13006.682477] [18013]  1000 18013     1129      335   0       0             0 thunderbird

In fact it says it is going to kill amarok:


> Aug 16 20:21:09 linuxb1 kernel: [13006.682355] [ 2634]  1000  2634   159597    65700   1       0             0 amarok
....
> Aug 16 20:21:09 linuxb1 kernel: [13006.682490] Out of memory: Kill process 2634 (amarok) score 32 or sacrifice child
> Aug 16 20:21:09 linuxb1 kernel: [13006.682495] Killed process 2634 (amarok) total-vm:638388kB, anon-rss:209092kB, file-rss:53708kB


Amarok uses more memory. It also kills more processes. Search for the word
“Killed” and you will find them.
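
The search for “Killed” can also be scripted. The following is only a sketch in Python: the regular expression is written against the one "Killed process" line quoted above, so other kernel versions may format that line differently, and the names `KILLED` and `killed_processes` are invented for the example.

```python
import re

# Pattern modelled on the "Killed process" line quoted above.
KILLED = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, "
    r"anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB"
)


def killed_processes(lines):
    """Yield (name, pid, resident kB) for every process the OOM killer killed."""
    for line in lines:
        m = KILLED.search(line)
        if m:
            rss_kb = int(m.group("anon_rss")) + int(m.group("file_rss"))
            yield m.group("name"), int(m.group("pid")), rss_kb


# The two lines from the dump quoted above.
sample = [
    "Aug 16 20:21:09 linuxb1 kernel: [13006.682490] Out of memory: "
    "Kill process 2634 (amarok) score 32 or sacrifice child",
    "Aug 16 20:21:09 linuxb1 kernel: [13006.682495] Killed process 2634 "
    "(amarok) total-vm:638388kB, anon-rss:209092kB, file-rss:53708kB",
]

victims = list(killed_processes(sample))
for name, pid, rss_kb in victims:
    print(name, pid, rss_kb, "kB resident")
```

To run it over the real log, pass an open file instead of `sample`, for example `killed_processes(open("/var/log/messages"))`. Note that anon-rss plus file-rss here is 262800 kB, which matches the 65700 pages in the rss column of the amarok row if pages are 4 kB.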

> Still not sure why, there is plenty of memory but at least finding the
> trigger is a small step forward.

Thunderbird can use a lot of memory. You can keep “top” running in a
terminal, sorted by 'M’emory, and find the runaway process. For example,
MailWasher.exe is using “total-vm:1620852kB”, which is a lot.
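
The same "sort by memory" view can be rebuilt after the fact from the process table in the dump. This Python sketch assumes the column layout shown in the header line quoted earlier (pid, uid, tgid, total_vm, rss, cpu, oom_adj, oom_score_adj, name) and 4 kB pages; `ROW` and `rank_by_rss` are hypothetical names used only here.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages; total_vm and rss are page counts

# One process-table row; the opening "[" is optional in case it was lost.
ROW = re.compile(
    r"\[?\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+"
    r"(?P<total_vm>\d+)\s+(?P<rss>\d+)\s+(?P<cpu>\d+)\s+"
    r"(?P<oom_adj>-?\d+)\s+(?P<oom_score_adj>-?\d+)\s+(?P<name>\S+)"
)


def rank_by_rss(lines):
    """Return (name, resident kB) pairs, largest resident set first."""
    rows = []
    for line in lines:
        m = ROW.search(line)
        if m:
            rows.append((m.group("name"), int(m.group("rss")) * PAGE_KB))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Header plus the two rows quoted from the dump above.
sample = [
    "kernel: [13006.682142] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name",
    "kernel: [13006.682477] [18013]  1000 18013     1129      335   0       0             0 thunderbird",
    "kernel: [13006.682355] [ 2634]  1000  2634   159597    65700   1       0             0 amarok",
]

ranked = rank_by_rss(sample)
for name, rss_kb in ranked:
    print("%-12s %8d kB" % (name, rss_kb))
```

Summing the second column over the whole table gives a rough idea of how much memory the listed processes account for, although shared pages are counted more than once that way.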


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)

How much RAM do you have? 6 GB?

Thanks for the answers.

Yes 6GB

Yes, I know what OOM means, and I also noticed that amarok uses more at that stage, but after all this time it’s clear it only happens when Thunderbird runs. The OOM killer just picks the highest target at that time (which makes sense), but this can be amarok, opera, rift, etc. It simply keeps killing until everything is gone and you’re back at the KDE login screen; it’s just an avalanche of kills and none seem to free up enough. And that does not really solve it: even an init 3, letting the system come to rest, and then an init 5 does not fully fix things. The KDE desktop is a mess and things are horribly slow. Only init 6 fixes it for sure :)

I think Mailwasher is so big because of its wine dependency (can’t find a good alternative for Mailwasher yet).

From what I can gather from the start of the dump, swap (2GB) is free and the 6GB main memory is really not full.

Every log I checked after an OOM has Thunderbird running; there is no log that does not show it running. These days I just open it, check mail and close it :) It does not happen daily; sometimes weeks go by, sometimes 2 days. So right now I am just homing in on the plugins in Thunderbird: I turned those all off to see if there is a difference… and I am looking into kmail or other alternative mail clients :)

I read info about OOM here and there, and maybe I got it all wrong, but it feels to me that there is a certain part of base memory that suddenly gets filled up and, despite enough memory being free higher up, the whole thing panics and the OOM killer gets started. But it’s hard to find really good info on the OOM logs; they are probably bursting with info, but more for a techy developer than a user :)

The lowmem_reserve feels like the key, but I could be totally wrong.
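
To put a number on that "base memory" idea, the two page counts from the first dump can be subtracted. This is only a back-of-the-envelope sketch in Python, assuming 4 kB pages; whether low memory was actually exhausted cannot be read from these totals alone and would need the per-zone free counts earlier in the dump.

```python
PAGE_KB = 4  # assumption: 4 kB pages

# Figures from the "pages RAM" and "pages HighMem" lines of the first dump.
pages_ram = 1834992
pages_highmem = 1609218

# Whatever is not HighMem is low memory.
lowmem_pages = pages_ram - pages_highmem
lowmem_mib = lowmem_pages * PAGE_KB / 1024
highmem_gib = pages_highmem * PAGE_KB / 1024**2

print("low memory : %d pages, about %.0f MiB" % (lowmem_pages, lowmem_mib))
print("high memory: about %.2f GiB" % highmem_gib)
```

So only about 882 MiB of the total is low memory, with the remaining 6.1 GiB or so in HighMem. The presence of a HighMem line at all suggests a 32-bit kernel, where some kernel allocations can only come from low memory, so it is at least plausible for that small region to run out while HighMem and swap still look free.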

On 2011-11-24 20:36, Xilanaz wrote:

> Mailwasher is I think so big because of its wine dependency (can’t
> find a good alternative yet for mailwasher)

Typically on Linux you would use spamassassin. It is quite good, although
you have to download the spam.

> From what I can gather from the start of the dump is that swap (2GB) is
> free and the 6GB main memory is really not full.

Swap is not used in your system, even when loaded? Perhaps you should
investigate why.

> Every log I checked after an OOM has Thunderbird running, no log does
> not show it running, these days I just open, check mail and close it :)
> it does not happen daily, sometimes weeks go by, sometimes 2 days. So
> right now am just homing in on the plugins in Thunderbird, turned those
> all off and see if there is a difference…

Dunno. It works fine for many people. But it is a memory hog.


>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  1766 cer       20   0 1927m 1.1g  20m S    1 13.5  10:03.62 firefox-bin
>  2336 cer       20   0  958m 189m  19m S    0  2.4   6:07.91 thunderbird-bin

> and looking into kmail or
> other alternative mail clients :)

Kmail is not in the best of shape right now; it has big problems, at
least when updating.

> I read here and there info about oom and maybe I got it all wrong but
> feels to me that there is a certain part of base memory that suddenly
> gets filled up and despite enough memory being free higher up the whole
> thing panics and oom gets started. But it’s hard to find really good
> info on the oom logs, they are probably bursting with info but more for a
> techy developer than a user :)

I guess so.


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)