Thread: Diagnosing system hangs

  1. #1
    Join Date
    Dec 2009
    Location
    Prague
    Posts
    96

    Diagnosing system hangs

    I've built a new computer and installed openSUSE 13.1 on it. It has an Intel DH87MC motherboard and an Intel Core i7-4770K. The problem is that every so often (after 10-30 hours) it hangs: it still responds to ping, but that's all. ssh doesn't answer, apache doesn't answer, keyboard input is ignored, and the power button only wakes it up from the screensaver. The only thing that can be done is a reset. There are no messages in the log files. Sometimes it hangs just for a while and shows messages like: kernel:[ 1594.865752] BUG: soft lockup - CPU#1 stuck for 22s! [btrfs:3134].
    I've tried switching off everything unnecessary in the BIOS, and I've tried a newer kernel from kernel/stable, but nothing helped. I suspect that something is wrong with the motherboard, but I'd like to be more certain before buying a new one. So my question is: are there ways to diagnose what's causing the hangs?

  2. #2
    Join Date
    Dec 2009
    Location
    Prague
    Posts
    96

    Re: Diagnosing system hangs

    One other thing: it has 16GB of RAM and I haven't created a swap partition for it. Could the absence of swap be the problem?

  3. #3
    Join Date
    Jun 2008
    Location
    Kansas City Area, Missouri, USA
    Posts
    7,236

    Re: Diagnosing system hangs

    On 03/02/2014 05:06 AM, mjakl wrote:
    >
    > And one other thing, it has 16GB RAM and I haven't created any swap
    > partition for it, can the absence of swap be the problem?


    The problem is not likely due to the absence of swap. That said, I never think
    it is a good idea to forgo swap, no matter how much RAM you have. Without swap,
    if you ever run out of RAM, the out-of-memory killer will just quietly discard
    some running program. If some user program has a runaway leak, it will just
    disappear, and you will never know why. The second factor is that the scheduler
    works better if there is swap available. Create a swap file - you do not need a
    swap partition.
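
    If you want to check whether the out-of-memory killer has been involved, its messages in the kernel log are one place to look. Below is a minimal Python sketch of that idea. The message wording in the regular expression is an assumption (it differs between kernel versions), and find_oom_kills is a made-up helper name, not part of any tool.

    ```python
    import re

    # Assumed message wording; it varies between kernel versions, e.g.:
    #   Out of memory: Kill process 1234 (php) score 900 or sacrifice child
    #   Out of memory: Killed process 1234 (php) total-vm:...
    OOM_KILL = re.compile(
        r"Out of memory: Kill(?:ed)? process (?P<pid>\d+) \((?P<comm>[^)]*)\)"
    )


    def find_oom_kills(lines):
        """Return (pid, process name) for every OOM-killer line found."""
        kills = []
        for line in lines:
            match = OOM_KILL.search(line)
            if match:
                kills.append((int(match.group("pid")), match.group("comm")))
        return kills
    ```

    Feed it the lines of whatever file or command output holds your kernel messages; an empty result only means nothing matched the assumed wording.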

    Before you condemn the MB, run a long memory test (~24 hours).

    When the lockups occur, is the offending component always btrfs as in the
    example you quoted? Not very many of us use that fs, and I am not sure how much
    testing it gets.



  4. #4
    Join Date
    Dec 2009
    Location
    Prague
    Posts
    96

    Re: Diagnosing system hangs

    Quote Originally Posted by lwfinger
    On 03/02/2014 05:06 AM, mjakl wrote:
    >
    > And one other thing, it has 16GB RAM and I haven't created any swap
    > partition for it, can the absence of swap be the problem?


    The problem is not likely due to the absence of swap. That said, I never think
    it is a good idea to forgo swap, no matter how much RAM you have. Without swap,
    if you ever run out of RAM, the out-of-memory killer will just quietly discard
    some running program. If some user program has a runaway leak, it will just
    disappear, and you will never know why. The second factor is that the scheduler
    works better if there is swap available. Create a swap file - you do not need a
    swap partition.

    Before you condemn the MB, run a long memory test (~24 hours).

    When the lockups occur, is the offending component always btrfs as in the
    example you quoted? Not very many of us use that fs, and I am not sure how much
    testing it gets.
    As you noticed, I'm using btrfs, so creating a swap file is a bit problematic: btrfs doesn't support swap files, even when mounted with nodatacow. But I'll be adding another disk soon, so I'll make a swap partition there.
    It's not always btrfs; I've also seen tvheadend and php there. It was mostly btrfs, but that could be just because it was working heavily: I was copying about 2TB of data.
    Good idea with the memtest, I'll give it a try.
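
    To put numbers on which processes show up in the soft lockup messages, the lines can be tallied by process name. This is a rough Python sketch, assuming the messages all look like the one quoted in the first post; tally_soft_lockups is a made-up name, and the log file path is whatever you pass on the command line.

    ```python
    import re
    import sys
    from collections import Counter

    # Matches lines such as:
    #   kernel:[ 1594.865752] BUG: soft lockup - CPU#1 stuck for 22s! [btrfs:3134]
    SOFT_LOCKUP = re.compile(
        r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
        r"\[(?P<comm>[^\]]*):(?P<pid>\d+)\]"
    )


    def tally_soft_lockups(lines):
        """Count soft lockup reports per process name."""
        counts = Counter()
        for line in lines:
            match = SOFT_LOCKUP.search(line)
            if match:
                counts[match.group("comm")] += 1
        return counts


    if __name__ == "__main__" and len(sys.argv) > 1:
        # The log path is supplied by the user; no default location is assumed.
        with open(sys.argv[1], errors="replace") as log:
            for comm, count in tally_soft_lockups(log).most_common():
                print(f"{count:5d}  {comm}")
    ```

    If one name dominates only while that process is under heavy load, that points at the workload rather than proving a fault in it.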

  5. #5
    Join Date
    Dec 2009
    Location
    Prague
    Posts
    96

    Re: Diagnosing system hangs

    So it looks like it's a btrfs problem. This bug entry describes the same symptoms: https://bugzilla.novell.com/show_bug.cgi?id=864430, so it'll be fixed sooner or later. I'm glad that it's not a motherboard issue.
