Thread: memtest+ questions

  1. #1

    memtest+ questions

    I installed memtest+ to the bootloader. When I run the test, it stops after about 5 seconds and restarts the computer. I assume this is not good and that it means it is failing big time? I would love a confirmation here.
    I want to rule out the possibility that it is simply just the program not working correctly.


    History:
    I went down this path because I have been trying to debug why my computer freezes when playing games. I have a Ryzen 1700 and a Vega 56. I believe I have ruled out the Ryzen bug (related to C-states and idle times) because it never freezes when idle, only when playing games. I have not found a way to monitor the GPU temperature, but I can see the "tachometer" on the card and it never gets above 3 out of 10. The computer also seems to run pretty cool in general, so I don't think heat is the cause.

    I then saw people saying that RAM could cause this, hence the memtest. I also found out that there are apparently specific lists of RAM that have been tested to work with each motherboard (I did not know this was a thing, sadface), and mine is not on the list. There does appear to be a way in the BIOS to adjust the voltage going to the RAM, but I am a little leery of mucking around with that since I don't know what the heck I am doing (people have suggested bumping it up to 1.4 V; it is currently set to 1.2 V).

    Any help is appreciated.

    BTW: I realize this has nothing to do with openSUSE (this is a fresh install, and I saw the same behavior on Manjaro). But I would appreciate any help anyway!

  2. #2
    Join Date
    Jun 2008
    Location
    Podunk
    Posts
    26,497
    Blog Entries
    15

    Re: memtest+ questions

    Hi
    Have you tried pulling the RAM sticks and re-inserting? Are they all matched and in the correct slots?

    After re-seating, run memtest again, all ok?

    If you boot into a Linux operating system, you can run dmidecode to see the exact part number(s) of the RAM modules.

    Code:
    dmidecode -t memory
    From the part numbers you can then check the manufacturer's specs to see what they need to be set at, both timing and voltage. Then you can look at the BIOS settings and adjust as required.

    Normally the system should detect, but maybe if the RAM is not of the specified ones, then the above tweaking may be required.

    Again run the memtest checks after tweaking, if all ok, then install prime95 and run the torture test to be sure....
    Cheers Malcolm °¿° SUSE Knowledge Partner (Linux Counter #276890)
    SUSE SLE, openSUSE Leap/Tumbleweed (x86_64) | GNOME DE
    If you find this post helpful and are logged into the web interface,
    please show your appreciation and click on the star below... Thanks!

  3. #3

    Re: memtest+ questions

    Quote: Originally posted by malcolmlewis
    Hi
    Have you tried pulling the RAM sticks and re-inserting? Are they all matched and in the correct slots?

    After re-seating, run memtest again, all ok?

    If you boot into a Linux operating system, you can run dmidecode to see the exact part number(s) of the RAM modules.

    Code:
    dmidecode -t memory
    From the part numbers you can then check the manufacturer's specs to see what they need to be set at, both timing and voltage. Then you can look at the BIOS settings and adjust as required.

    Normally the system should detect, but maybe if the RAM is not of the specified ones, then the above tweaking may be required.

    Again run the memtest checks after tweaking, if all ok, then install prime95 and run the torture test to be sure....
    Thanks for the help and suggestions. I have not tried re-seating the RAM yet. It was on the to-do list, but you are right that I should do it immediately, so I will do that now. Who knows, maybe I'll get lucky. I also figured out within the last hour how to get journalctl to look at past boots rather than just the current one, so that is another thing I will check if/when it crashes again. In the meantime, for all four memory sticks, "dmidecode -t memory" gave output like this:

    Code:
    Handle 0x0037, DMI type 17, 40 bytes
    Memory Device
            Array Handle: 0x0027
            Error Information Handle: 0x0036
            Total Width: 64 bits
            Data Width: 64 bits
            Size: 8192 MB
            Form Factor: DIMM
            Set: None
            Locator: DIMM_B2
            Bank Locator: BANK 3
            Type: DDR4
            Type Detail: Synchronous Unbuffered (Unregistered)
            Speed: 2133 MT/s
            Manufacturer: G-Skill
            Serial Number: 00000000
            Asset Tag: Not Specified
            Part Number: F4-3200C16-8GTZSW
            Rank: 1
            Configured Memory Speed: 1067 MT/s
            Minimum Voltage: 1.2 V
            Maximum Voltage: 1.2 V
            Configured Voltage: 1.2 V
    The Error Information Handle caught my eye. Is the voltage listed here what it SHOULD be running at, or what it IS running at? I ask because 1.2 V is the voltage listed on the spec sheet from G.Skill. Also, forgive my ignorance, but what do you mean by timing for the RAM? What should I be looking for here (or is that the speed, 2133 MT/s)?

    EDIT: actually the spec sheet has two listings for voltage: the SPD voltage is 1.2 V and the tested voltage is 1.35 V.
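
    To make the dmidecode output above easier to compare across sticks, a small filter can pull out just the fields that matter for RAM tuning. This is only a minimal sketch: `dimm_summary` is a hypothetical helper name (not part of dmidecode), and the field names it matches are the ones in the output posted above; other dmidecode versions may label them differently.

```shell
#!/bin/sh
# dimm_summary: hypothetical helper, not part of dmidecode.
# Reads `dmidecode -t memory` text on stdin and prints one line per module.
dimm_summary() {
    awk -F': *' '
        # Print the collected fields for one module, then reset them.
        function emit() {
            if (loc != "")
                printf "%s  %s  speed=%s  configured=%s  voltage=%s\n", loc, part, speed, conf, volt
            loc = part = speed = conf = volt = ""
        }
        # Trim indentation and trailing whitespace so $1 is the bare field name.
        { sub(/^[ \t]+/, ""); sub(/[ \t\r]+$/, "") }
        $1 == "Memory Device"           { emit() }   # start of a new module record
        $1 == "Locator"                 { loc = $2 }
        $1 == "Part Number"             { part = $2 }
        $1 == "Speed"                   { speed = $2 }
        $1 == "Configured Memory Speed" { conf = $2 }
        $1 == "Configured Voltage"      { volt = $2 }
        END { emit() }
    '
}

# Usage (dmidecode normally needs root):
#   sudo dmidecode -t memory | dimm_summary
```

    For the module posted above this would print `DIMM_B2  F4-3200C16-8GTZSW  speed=2133 MT/s  configured=1067 MT/s  voltage=1.2 V`. The part number suggests a 3200 MT/s CL16 kit, so the stick appears to be running at the SPD default of 1.2 V and 2133 MT/s rather than at the tested 1.35 V profile.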

  4. #4

    Re: memtest+ questions

    Hi
    See the "Tested Latency" on the spec sheet, so the timings in the BIOS should be set to 16-16-16-36?

    I would check the seating of the RAM first and retest, then look at moving the voltage a little, say to 1.25v and test again.

    What is the spec for the CPU with regard to memory speed? On my system (Intel) I see:

    Code:
    Memory Device
        Array Handle: 0x003F
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 4096 MB
        Form Factor: DIMM
        Set: None
        Locator: DIMM2
        Bank Locator: CHANNEL B SLOT1
        Type: DDR3
        Type Detail: Synchronous
        Speed: 1600 MT/s <==
        Manufacturer: Samsung
        Serial Number: 13B2DCED
        Asset Tag: 9876543210
        Part Number: M378B5173QH0-CK0  
        Rank: 1
        Configured Memory Speed: 1600 MT/s <==
    You also might want to check overclocking forums, as you may pick up some snippets about your RAM and setup...

    It's all about little tweaks and lots of testing....
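
    On the question of what "timing" means: the first number in a set like 16-16-16-36 is the CAS latency in memory clock cycles, and for DDR memory the clock runs at half the MT/s data rate. A quick way to turn that into nanoseconds is sketched below; `cas_ns` is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# cas_ns: hypothetical helper. Converts CAS latency (clock cycles) to nanoseconds.
# $1 = CL in cycles, $2 = data rate in MT/s. For DDR the clock is rate/2 MHz,
# so latency in ns = CL / (rate / 2) * 1000 = CL * 2000 / rate.
cas_ns() {
    awk -v cl="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", cl * 2000 / rate }'
}

cas_ns 16 3200   # the rated profile implied by part number F4-3200C16; prints 10.00
```

    Lower is faster, but tighter timings or a higher data rate than the board and CPU can hold are a common source of instability, which is why each change is followed by another round of testing.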

  5. #5
    Join Date
    Dec 2008
    Location
    FL, USA
    Posts
    1,570

    Re: memtest+ questions

    I suggest you run MemTest86 free version instead of memtest86+. The two are not the same thing. I haven't observed reliable operation from memtest86+ in many moons, except on DDR2 and older hardware.
    Reg. Linux User #211409 *** multibooting since 1992
    Primary: 42.3,TW,15.0 & 13.1 on Haswell w/ RAID
    Secondary: eComStation (OS/2)&42.3 on 965P/Radeon
    Tertiary: TW,15.0,42.3,Fedora,Debian,more on Kaby Lake,Q45,Q43,G41,G3X,965G,Cedar,Caicos,Oland,GT218&&&

  6. #6

    Re: memtest+ questions

    Quote: Originally posted by mrmazda
    I suggest you run MemTest86 free version instead of memtest86+. The two are not the same thing. I haven't observed reliable operation from memtest86+ in many moons, except on DDR2 and older hardware.
    Okay, thanks. I took your advice and ran MemTest86 overnight: 48 tests, no errors. So now I am not sure where to go, except maybe to keep fiddling with minor voltage increments. One of the games that most reliably triggers the freeze is Hearts of Iron 4. It is not a graphically taxing game, but it does eat up a lot of memory, which still makes me think it might have something to do with the RAM, but of course I don't really know. There is that bug I mentioned before about C-states at idle. Again, I do not THINK it is that, but I still took precautions and disabled C-state management in the BIOS.

    Since this ram is not listed on the approved list for the motherboard, would I have better luck buying ram that IS on it, or would that be a complete waste of money? Again, any help or suggestions are greatly appreciated.

  7. #7

    Re: memtest+ questions

    So, these are the last few lines from the previous boot, where I was playing Hearts of Iron 4 to trigger the crash:
    Code:
    Jun 30 08:04:47 linux-i7cx org_kde_powerdevil[8622]: powerdevil: Scheduling inhibition from ":1.82" "My SDL application" with cookie 29 and reason "Playing a game"
    Jun 30 08:04:47 linux-i7cx org_kde_powerdevil[8622]: powerdevil: Releasing inhibition with cookie  29
    Jun 30 08:04:47 linux-i7cx org_kde_powerdevil[8622]: powerdevil: It was only scheduled for inhibition but not enforced yet, just discarding it
    Jun 30 08:04:52 linux-i7cx org_kde_powerdevil[8622]: powerdevil: Enforcing inhibition from ":1.82" "My SDL application" with cookie 29 and reason "Playing a game"
    Jun 30 08:04:52 linux-i7cx org_kde_powerdevil[8622]: powerdevil: By the time we wanted to enforce the inhibition it was already gone; discarding it
    Jun 30 08:05:07 linux-i7cx org_kde_powerdevil[8622]: powerdevil: Scheduling inhibition from ":1.82" "My SDL application" with cookie 30 and reason "Playing a game"
    Jun 30 08:05:07 linux-i7cx org_kde_powerdevil[8622]: powerdevil: Releasing inhibition with cookie  30
    Jun 30 08:05:07 linux-i7cx org_kde_powerdevil[8622]: powerdevil: It was only scheduled for inhibition but not enforced yet, just discarding it
    Jun 30 08:05:12 linux-i7cx org_kde_powerdevil[8622]: powerdevil: Enforcing inhibition from ":1.82" "My SDL application" with cookie 30 and reason "Playing a game"
    Jun 30 08:05:12 linux-i7cx org_kde_powerdevil[8622]: powerdevil: By the time we wanted to enforce the inhibition it was already gone; discarding it
    I used the "journalctl -b -1 -n" command for this. Let me know if there is some other command to get more info or whatever. I don't really know what the heck I am doing lol. Anyway, plz let me know if this is relevant.


    EDIT: I looked back at the entire journalctl and this is the stuff that happened right before the "playing a game" messages:
    Code:
    Jun 30 07:56:17 linux-i7cx kwin_x11[8574]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 14566, resource id: 85983252, major code: 19 (DeleteProperty), minor code: 0
    Jun 30 07:56:17 linux-i7cx kwin_x11[8574]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 14570, resource id: 85983252, major code: 18 (ChangeProperty), minor code: 0
    Jun 30 07:56:17 linux-i7cx kwin_x11[8574]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 14576, resource id: 85983252, major code: 19 (DeleteProperty), minor code: 0
    Jun 30 07:56:17 linux-i7cx kwin_x11[8574]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 14577, resource id: 85983252, major code: 18 (ChangeProperty), minor code: 0
    Jun 30 07:56:17 linux-i7cx kwin_x11[8574]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 14578, resource id: 85983252, major code: 19 (DeleteProperty), minor code: 0
    Jun 30 07:56:17 linux-i7cx kwin_x11[8574]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 14579, resource id: 85983252, major code: 19 (DeleteProperty), minor code: 0
    Jun 30 07:56:17 linux-i7cx kwin_x11[8574]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 14580, resource id: 85983252, major code: 19 (DeleteProperty), minor code: 0
    Jun 30 07:56:17 linux-i7cx kwin_x11[8574]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 14581, resource id: 85983252, major code: 7 (ReparentWindow), minor code: 0
    Jun 30 07:56:17 linux-i7cx kwin_x11[8574]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 14582, resource id: 85983252, major code: 6 (ChangeSaveSet), minor code: 0
    Jun 30 07:56:17 linux-i7cx kwin_x11[8574]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 14583, resource id: 85983252, major code: 2 (ChangeWindowAttributes), minor code: 0
    Jun 30 07:56:17 linux-i7cx kwin_x11[8574]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 14584, resource id: 85983252, major code: 10 (UnmapWindow), minor code: 0

  8. #8
    Join Date
    Aug 2010
    Location
    Chicago suburbs
    Posts
    12,352
    Blog Entries
    3

    Re: memtest+ questions

    Quote: Originally posted by shawnsterp
    I looked back at the entire journalctl and this is the stuff that happened right before the "playing a game" messages:
    Oh, something like:
    Code:
    2019-06-30T07:38:50.279355-05:00 nwr2 kwin_x11[2735]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 10587, resource id: 117440516, major code: 18 (ChangeProperty), minor code: 0
    My logs have many messages like that. I am ignoring them, because they do not seem to indicate any problem (other than flooding logs with unimportant messages).
    openSUSE Leap 15.1; KDE Plasma 5;

  9. #9
    Join Date
    Sep 2013
    Location
    Norfolk, UK
    Posts
    1,162

    Re: memtest+ questions

    Quote: Originally posted by nrickert
    My logs have many messages like that. I am ignoring them, because they do not seem to indicate any problem (other than flooding logs with unimportant messages).
    That's "sort of normal" ...

    That error message is rather a red herring; it's generally issued when a window disappears unexpectedly as a result of something else going belly up or otherwise nuking itself into oblivion.
    Regards, Paul

    Tumbleweed (Snapshot: 20190814) KDE Plasma 5 ~~~
    Non-Tumbling Tumblweed (20150508) KDE 4 - Resurrected
    Leap 15.0 KDE Plasma 5 ~~~ Leap 15.1 KDE Plasma 5 (Work in progress...)

  10. #10

    Re: memtest+ questions

    I failed to realize till just now that "journalctl -b -1 -n" is only showing logs for BOOT. Since the system is booting just fine, I assume this is not helpful. Using yast, I looked at the systemd journal. Here is an entry that I believe may be relevant, as it surfaced the last time my system crashed:
    Code:
    BUG: unable to handle kernel paging request at fffff7ffb4e3dd80

    However, over the last few days there have been a few other messages that are similar yet different:
    Code:
    BUG: Bad rss-counter state mm:00000000fc5397a5 idx:2 val:-4
    BUG: Bad page map in process X  pte:80000007d1b82876 pmd:7b171c067
    No idea what this means.

    BTW, I have reseated the ram at this point. I never got around to mentioning that.
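
    A note on the journalctl options used earlier in the thread: `-b -1` selects the whole journal of the previous boot (everything logged from startup until that session ended, not only the startup phase), and `-n` on its own limits the output to the last 10 lines. Adding `-k` restricts the output to kernel messages, which is where lines like the BUG entries above come from. The filter below is only a sketch: `kernel_faults` is a hypothetical helper name, and the pattern list is an assumption that is far from exhaustive.

```shell
#!/bin/sh
# kernel_faults: hypothetical helper. Keeps only log lines that look like kernel fault reports.
# The pattern list is an illustrative assumption, not a complete catalogue.
kernel_faults() {
    grep -E 'BUG: |Oops|general protection fault|Machine check'
}

# Usage (reading the system journal may need root or membership of a journal group):
#   journalctl --list-boots                              # show which boots are stored
#   journalctl -b -1 -k --no-pager | kernel_faults       # kernel faults from the previous boot
```

    Run against the previous boot after a freeze, this skips the kwin and powerdevil chatter and leaves only entries of the kind quoted in this post.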
