Machine Check on boot

Can anyone explain this? The MCE shows up near the start of every boot. What bothers me is that the address fef87380 is in the MMIO region. I wonder if the kernel is probing this as DRAM, causing the MCE.


Hardware event. This is not a software error.
MCE 1
CPU 0 BANK 6 
MISC 278a0000086 ADDR fef87380 
TIME 1506325119 Mon Sep 25 00:38:39 2017
MCG status:
MCi status:
Uncorrected error
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: corrected filtering (some unreported errors in same region)
Generic CACHE Level-2 Generic Error
STATUS ae0000000040110a MCGSTATUS 0
MCGCAP c07 APICID 0 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 69
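For anyone who wants to cross-check mcelog's decoding against the raw STATUS value, here is a minimal sketch. It assumes the architectural IA32_MCi_STATUS bit layout from the Intel SDM (VAL=63, OVER=62, UC=61, EN=60, MISCV=59, ADDRV=58, PCC=57) and the cache-hierarchy compound error code format `000F 0001 RRRR TTLL`. The function and table names are made up for illustration.

```python
# Sketch: decode the architectural fields of an IA32_MCi_STATUS value.
# Bit positions are assumed from the Intel SDM; names are illustrative only.
STATUS_FLAGS = [
    (63, "VAL"),    # register contents valid
    (62, "OVER"),   # error overflow
    (61, "UC"),     # uncorrected error
    (60, "EN"),     # error reporting enabled
    (59, "MISCV"),  # MCi_MISC register valid
    (58, "ADDRV"),  # MCi_ADDR register valid
    (57, "PCC"),    # processor context corrupt
]

CACHE_LEVEL = {0: "Level-0", 1: "Level-1", 2: "Level-2", 3: "Generic"}
CACHE_TYPE = {0: "Instruction", 1: "Data", 2: "Generic", 3: "Reserved"}


def decode_mci_status(status):
    """Return the set flags and the error-code fields of a STATUS value."""
    code = status & 0xFFFF
    result = {
        "flags": [name for bit, name in STATUS_FLAGS if (status >> bit) & 1],
        "mca_code": code,
        "model_specific_code": (status >> 16) & 0xFFFF,
    }
    # Cache hierarchy errors match the pattern 000F 0001 RRRR TTLL.
    if code & 0xEF00 == 0x0100:
        result["corrected_filtering"] = bool(code & 0x1000)  # the F bit
        result["cache_type"] = CACHE_TYPE[(code >> 2) & 0x3]  # TT
        result["cache_level"] = CACHE_LEVEL[code & 0x3]       # LL
    return result


decoded = decode_mci_status(0xAE0000000040110A)
print(decoded)
```

With the STATUS value from the log above, the set flags are VAL, UC, MISCV, ADDRV and PCC, and the low 16 bits (0x110a) decode as a generic Level-2 cache error with the filtering bit set, which lines up with what mcelog printed.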

Like the message says, it's a hardware event. Given the "Processor context corrupt" flag, I'd run memtest.

I started memtest86+ last night; it's made 2+ passes in single-core mode so far with no errors. I just now restarted it in multi-core mode.

Eeerh, what you didn’t tell us is whether the machine behaves ‘normally’, i.e. whether it boots to the desired target.

I get similar mce: logs on every boot on my new laptop (Dell Inspiron 15, i5-7200U CPU) with both TW and Leap. Googling ‘mce: TSC linux’ shows it’s a common problem, but not very serious. I’ve never had any other faults after booting, and memtest running overnight shows nothing.

One thing to check is whether you have the latest BIOS/UEFI firmware. I updated mine (from 1.1.3 to 1.1.5) and it made no difference, but other comments have said there are fixes available for some machines.

In my case the MCE seemed to have no effect for a while (I had been seeing it on every boot for several months at least), but then yesterday I saw that the machine wasn’t able to start X. Instead it would boot to the console login prompt and sit there blinking rapidly. If I reboot to runlevel 4 or lower the machine works OK, but it cannot get to X (runlevel 5). One thing I noticed is that the memory address mentioned in the MCE, fef87380, is not covered by the BIOS e820 info.


[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d3ff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009d400-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000a56affff] usable
[    0.000000] BIOS-e820: [mem 0x00000000a56b0000-0x00000000a5eaffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000a5eb0000-0x00000000aaabefff] usable
[    0.000000] BIOS-e820: [mem 0x00000000aaabf000-0x00000000aaebefff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000aaebf000-0x00000000aafbefff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000aafbf000-0x00000000aaffefff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000aafff000-0x00000000aaffffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000ab000000-0x00000000af9fffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fe101000-0x00000000fe112fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000feb00000-0x00000000feb0ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fee00fff] reserved
    <-- (my note, not kernel output) I would expect a 'reserved' entry covering fef87380 here
[    0.000000] BIOS-e820: [mem 0x00000000ffc00000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000024f5fffff] usable
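The same check can be done mechanically rather than by eye. Below is a rough sketch that parses `BIOS-e820` lines from dmesg output and reports which entry, if any, covers a given physical address. The helper names (`parse_e820`, `find_range`) are hypothetical, and the sample text is just a few lines copied from the map above; in practice you would feed it the full `dmesg` output.

```python
import re

# Matches lines such as:
# "BIOS-e820: [mem 0x00000000fed00000-0x00000000fee00fff] reserved"
E820_LINE = re.compile(
    r"BIOS-e820: \[mem 0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+)\]\s+(.+?)\s*$"
)


def parse_e820(text):
    """Return a list of (start, end, kind) tuples; end is inclusive."""
    ranges = []
    for line in text.splitlines():
        match = E820_LINE.search(line)
        if match:
            start = int(match.group(1), 16)
            end = int(match.group(2), 16)
            ranges.append((start, end, match.group(3)))
    return ranges


def find_range(ranges, addr):
    """Return the e820 entry containing addr, or None if it is in a gap."""
    for start, end, kind in ranges:
        if start <= addr <= end:
            return (start, end, kind)
    return None


# A few lines copied from the map above, enough to bracket the MCE address.
sample = """\
[    0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fee00fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ffc00000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000024f5fffff] usable
"""

ranges = parse_e820(sample)
mce_addr = 0xFEF87380
hit = find_range(ranges, mce_addr)
if hit is None:
    print(f"{mce_addr:#x} is not covered by any e820 entry")
else:
    print(f"{mce_addr:#x} falls in {hit[0]:#x}-{hit[1]:#x} ({hit[2]})")
```

For fef87380 this reports no covering entry: the address sits in the gap between the entry ending at fee00fff and the one starting at ffc00000.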

I ran the PassMark MemTest86 last night, two full runs of 4 passes each, and no issues were found. I then ran a live CD and that was fine, so I reformatted and installed Leap 42.3 from scratch. It has worked fine other than a few glitches that I fixed. So my conclusion is that the MCE was unrelated to my issue; I guess it was something in the software that got buggered.