Thread: SSD write failures

  1. #1

    Question SSD write failures

    Ok, bit of a story here but I'll try to keep it brief.

    A few days ago one of my desktops (openSUSE Leap 42.1) started to behave erratically: several disk errors were displayed at boot, and I usually ended up in grub rescue mode.
    After a few failed rescue attempts the whole thing just crashed, so I decided to wipe the disk and start fresh (my data is on another drive anyway).
    The OS disk is a Samsung 850 EVO SSD (250 GB).

    So here's the problem: I can't install anything on that disk anymore. Here's what happens every time:
    1. I boot openSUSE 42.2 Leap live from USB flash drive (or DVD, same thing further on)
    2. Start install procedure and fill in all the data
    3. During "copying system files" and "installing packages" it always hangs, at random percentages


    I've run a few tests on this disk from GParted (badblocks, blkid...) and no errors were found. I haven't completed a full read-write test with badblocks yet because it seems to run for days on end: only 10% after 7 hours or so.
    Deleting and creating partitions isn't a problem, but every time I try to install an OS on it (I tried 42.1, 42.2 and GeckoLinux) it just 'hangs'.

    What on Earth could be causing this ?

    TIA
    Bart.

  2. #2

    Re: SSD write failures

    I saw something similar on my server's SSD around the time 13.2 was released. I tried everything: new partition tables, MBR, GPT. In the end it turned out that the controller inside the SSD had failed.

  3. #3

    Hi

    Quote (originally posted by bartvaes):
    Ok, bit of a story here but I'll try to keep it brief.

    A few days ago one of my desktops (openSUSE Leap 42.1) started to behave erratically: several disk errors were displayed at boot, and I usually ended up in grub rescue mode.
    After a few failed rescue attempts the whole thing just crashed, so I decided to wipe the disk and start fresh (my data is on another drive anyway).
    The OS disk is a Samsung 850 EVO SSD (250 GB).

    This is quite new hardware.
    I have a Samsung 850 Pro SSD, which seems to have a longer lifetime, but anyway.

    Quote (originally posted by bartvaes):
    So here's the problem: I can't install anything on that disk anymore. Here's what happens every time:
    1. I boot openSUSE 42.2 Leap live from USB flash drive (or DVD, same thing further on)
    2. Start install procedure and fill in all the data
    3. During "copying system files" and "installing packages" it always hangs, at random percentages

    I've run a few tests on this disk from GParted (badblocks, blkid...) and no errors were found. I haven't completed a full read-write test with badblocks yet because it seems to run for days on end: only 10% after 7 hours or so.

    I don't think it is a good idea to run badblocks on that disk, at least not in write mode, because the blocks may (and usually will) not be physically located one after another, even if it appears that way to the file system.
    The reason is that the SSD's controller usually tries to spread writes so that wear is more or less the same across all of the SSD's storage (wear leveling).

    Running badblocks on an SSD in write mode will just cause further wear without providing many clues.
    I think this is very different from running badblocks on a conventional hard disk drive.

    Quote (originally posted by bartvaes):
    Deleting and creating partitions isn't a problem, but every time I try to install an OS on it (I tried 42.1, 42.2 and GeckoLinux) it just 'hangs'.

    When the SSD 'thinks' that almost all of its flash is allocated, the write rate collapses.

    You could try two things:

    Firstly, you could create one large partition spanning the whole disk (using e.g. gparted) and put a file system on it.
    If the disk is /dev/sdX, then that partition is /dev/sdX1.
    Mount that partition, for example on /mnt.
    Then (as root) run

    Code:
    fstrim -v /mnt
    to tell the SSD's controller that the unused space on that file system is no longer allocated. Note that fstrim takes the mount point, not the device node.

    Perhaps this already helps.
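
    Before trimming, it can be worth checking that the kernel actually sees discard (TRIM) support on the drive. A minimal sketch, assuming the usual Linux sysfs layout; the function name supports_discard and its optional second argument are made up for illustration:

```shell
# Succeeds if block device $1 (e.g. "sdX", without /dev/) advertises
# discard support. $2 is the sysfs root, /sys by default; it exists only
# so the check can be tried against a fake directory tree.
supports_discard() {
    f="${2:-/sys}/block/$1/queue/discard_max_bytes"
    [ -r "$f" ] && [ "$(cat "$f")" -gt 0 ]
}

# Usage (sdX is a placeholder for the real device name):
#   supports_discard sdX && echo "TRIM should work"
#   lsblk --discard /dev/sdX    # same information in table form
```

    If the check fails, fstrim will most likely report that the discard operation is not supported, which would itself be a hint that the drive or the controller in between is misbehaving.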

    Secondly, (as root) run

    Code:
    smartctl -a /dev/sdX
    where again /dev/sdX points to your SSD.
    This gives you more information about the health of your SSD.
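
    The full smartctl output is long, so a small filter can help. This is only a sketch: the helper name smart_summary is made up, and the attribute names are the ones Samsung SATA SSDs typically report, which is an assumption here since names differ between vendors and models.

```shell
# Keep only the SMART attribute rows that most often point at worn or
# failing flash (reallocations, reserve usage, uncorrectable errors) or
# at a bad SATA link (CRC errors). The name list is an assumption.
smart_summary() {
    grep -E 'Reallocated_Sector_Ct|Wear_Leveling_Count|Used_Rsvd_Blk_Cnt_Tot|Runtime_Bad_Block|Uncorrectable_Error_Cnt|CRC_Error_Count'
}

# Usage (as root, /dev/sdX is a placeholder):
#   smartctl -H /dev/sdX                   # overall health verdict
#   smartctl -A /dev/sdX | smart_summary   # attribute table, filtered
#   smartctl -t short /dev/sdX             # start a short self-test
#   smartctl -l selftest /dev/sdX          # read the result afterwards
```

    Non-zero raw values in the reallocation or uncorrectable-error rows would suggest the flash itself, while a growing CRC error count would point more towards the cable or port.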

    Good luck
    Mike

  4. #4

    And formatting the ssd with ext4 instead of btrfs may be a good idea.

  5. #5

    Whenever anyone deploys a new SSD, one of the first things to do is read the Arch Linux wiki page on SSDs.
    It's kept very current on developments.

    One of the things that caught my eye is that Samsung has a long history of issues with various SSD models.
    Currently there is a recommended firmware upgrade to support booting from USB drives, which sounds like it applies to you.

    https://wiki.archlinux.org/index.php/Solid_State_Drives

    TSU

  6. #6

    You could update the BIOS and the SSD's firmware, if new versions are available.

    Running badblocks in write mode is quite safe. I have run it on many SSDs and USB flash drives. Always got the right answer in terms of "broken or not".

    Getting through only 10% of a 250 GB drive in 7 hours indicates an error. That is enough of an answer.
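
    As a rough sanity check on that figure, the implied throughput can be worked out with integer shell arithmetic. The function name throughput_kb_s is made up for this sketch, and it assumes decimal units (1 GB = 1000 MB) and a single pass over the data:

```shell
# Average throughput in kB/s, given capacity in GB ($1), percent
# completed ($2) and elapsed hours ($3). Integer arithmetic, rounded down.
throughput_kb_s() {
    done_mb=$(( $1 * 1000 * $2 / 100 ))
    seconds=$(( $3 * 3600 ))
    echo $(( done_mb * 1000 / seconds ))
}

throughput_kb_s 250 10 7    # prints 992
```

    That is about 1 MB/s. badblocks makes more than one pass over each block in its write modes, so the real rate is somewhat higher, but a healthy SATA SSD would normally manage hundreds of MB/s.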

    Most likely, the SSD is defective. There is some chance that the motherboard and the SSD don't understand each other, or that the motherboard's controller is malfunctioning. You can check this by attempting to install 64-bit Windows, by running Samsung's Magician, by using the SSD on another motherboard, or by installing another similar Samsung drive on the same motherboard.

    The chance that something else is broken, like a cable or power supply, is very small.
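
    To get a hint about whether the drive or the link is at fault, the kernel log from the live system can be filtered right after a hang. This is a sketch only: the helper name ata_errors is made up, and the patterns are message fragments the Linux libata and block layers commonly print, not an exhaustive list.

```shell
# Pick out kernel log lines that usually accompany a failing drive or
# SATA link: I/O errors, failed ATA commands, and link resets.
ata_errors() {
    grep -iE 'I/O error|failed command|hard resetting link|exception Emask'
}

# Usage (as root on the live system):
#   dmesg | ata_errors
#   journalctl -k | ata_errors
```

    Repeated link resets without any write errors would make a cable or port problem somewhat more plausible; failed write commands against the same device point back at the SSD.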

    I wouldn't trust that SSD any more, even if a firmware update or some sort of reset brings the functionality back. Return if you can.

  7. #7

    Thanks for the replies and insights everyone.

    The Arch wiki is indeed a very good resource and often ranks highly in Google searches. I'm not sure it fully applies to my case, as this is not a USB drive.
    Also, I just went through my purchases and this drive is slightly less than a year old, so I might as well try to return it...
    For the moment I have replaced it with another (Samsung) SSD, and the install went without any problems. I hope this one lasts longer.


    Bart.
