Harddrive issues with OpenSUSE: disk contents gone after reboot

A couple of weeks ago, I received a new HP Workstation Z820. The plan was to switch to Linux — as a long-time Windows user, I’ve been waiting for an opportunity to make the switch :slight_smile: I’ve been playing with OpenSUSE, and although I really like what I’ve seen thus far, I can’t get it to work properly on the machine. Please bear with me, as I’m a rather novice Linux user…

The machine has two harddrives: a 128 GB SSD, plus a 1 TB disk for data storage.

I successfully installed OpenSUSE on the SSD. However, I’ve run into recurrent issues with mounting the 1 TB disk. I can repartition it and mount it, put some files on it, and everything seems fine. However, when I unmount the disk (or reboot the PC), and re-mount it, its contents seem scrambled: I get errors that “no partition table can be found”, or that the partition is damaged. I can then repartition the drive, reformat it, and work with it, until the next remount/reboot.

The disk itself seems fine: I put it in a different PC (running Windows 7), where it worked fine. I also tried installing RedHat on the workstation, which didn’t seem to encounter any issues with the disk.

Am I missing any drivers? Any configuration options I could change somewhere? Does this problem sound familiar to anyone at all?

I’ve tried to gather some extra information about the hardware, btw, using some utilities I googled. I put the resulting information online in my Dropbox: https://www.dropbox.com/sh/b64smcgaxthnaml/QswH5sCHWv . The most relevant are probably the “hdparm” and “smartctl” outputs for the offending harddrive, which I’ll paste below:



/dev/sdb:


ATA device, with non-removable media
    Model Number:       ST1000DM003-1CH162                      
    Serial Number:      Z1D4C96K            
    Firmware Revision:  HP33    
    Transport:          Serial, SATA Rev 3.0
Standards:
    Used: unknown (minor revision code 0x0029) 
    Supported: 9 8 7 6 5 
    Likely used: 9
Configuration:
    Logical        max    current
    cylinders    16383    16383
    heads        16    16
    sectors/track    63    63
    --
    CHS current addressable sectors:   16514064
    LBA    user addressable sectors:  268435455
    LBA48  user addressable sectors: 1953525168
    Logical  Sector size:                   512 bytes
    Physical Sector size:                  4096 bytes
    Logical Sector-0 offset:                  0 bytes
    device size with M = 1024*1024:      953869 MBytes
    device size with M = 1000*1000:     1000204 MBytes (1000 GB)
    cache/buffer size  = unknown
    Form Factor: 3.5 inch
    Nominal Media Rotation Rate: 7200
Capabilities:
    LBA, IORDY(can be disabled)
    Queue depth: 32
    Standby timer values: spec'd by Standard, no device specific minimum
    R/W multiple sector transfer: Max = 16    Current = ?
    Advanced power management level: 128
    Recommended acoustic management value: 208, current value: 0
    DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5 
         Cycle time: min=120ns recommended=120ns
    PIO: pio0 pio1 pio2 pio3 pio4 
         Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
    Enabled    Supported:
       *    SMART feature set
            Security Mode feature set
       *    Power Management feature set
       *    Write cache
       *    Look-ahead
       *    WRITE_BUFFER command
       *    READ_BUFFER command
       *    DOWNLOAD_MICROCODE
       *    Advanced Power Management feature set
       *    48-bit Address feature set
       *    Device Configuration Overlay feature set
       *    Mandatory FLUSH_CACHE
       *    FLUSH_CACHE_EXT
       *    SMART error logging
       *    SMART self-test
       *    General Purpose Logging feature set
       *    64-bit World wide name
       *    WRITE_UNCORRECTABLE_EXT command
       *    {READ,WRITE}_DMA_EXT_GPL commands
       *    Segmented DOWNLOAD_MICROCODE
       *    Gen1 signaling speed (1.5Gb/s)
       *    Gen2 signaling speed (3.0Gb/s)
       *    unknown 76[3]
       *    Native Command Queueing (NCQ)
       *    Phy event counters
       *    unknown 76[15]
       *    DMA Setup Auto-Activate optimization
            Device-initiated interface power management
       *    Software settings preservation
       *    SMART Command Transport (SCT) feature set
       *    SCT Long Sector Access (AC1)
       *    SCT Error Recovery Control (AC3)
       *    SCT Features Control (AC4)
       *    SCT Data Tables (AC5)
            unknown 206[12] (vendor specific)
            unknown 206[13] (vendor specific)
Security: 
    Master password revision code = 65534
        supported
    not    enabled
    not    locked
    not    frozen
    not    expired: security count
        supported: enhanced erase
    110min for SECURITY ERASE UNIT. 110min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 5000c5005097dd32
    NAA        : 5
    IEEE OUI    : 000c50
    Unique ID    : 05097dd32
Checksum: correct



smartctl 5.43 2012-06-30 r3573 [x86_64-linux-2.6.32-343.el6.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net


=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda (SATA 3Gb/s, 4K Sectors)
Device Model:     ST1000DM003-1CH162
Serial Number:    Z1D4C96K
LU WWN Device Id: 5 000c50 05097dd32
Firmware Version: HP33
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Thu Jun  6 11:51:16 2013 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled


=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED


General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:         (  584) seconds.
Offline data collection
capabilities:              (0x53) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      ( 113) minutes.
SCT capabilities:            (0x303b)    SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.


SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   117   100   006    Pre-fail  Always       -       154364464
  3 Spin_Up_Time            0x0023   098   098   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       61
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002f   100   253   030    Pre-fail  Always       -       227045
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       43
 10 Spin_Retry_Count        0x0033   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       61
180 Unused_Rsvd_Blk_Cnt_Tot 0x002a   100   100   000    Old_age   Always       -       327024441
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0033   100   100   097    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   069   066   045    Old_age   Always       -       31 (Min/Max 26/31)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       60
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       302
194 Temperature_Celsius     0x0022   031   040   000    Old_age   Always       -       31 (0 20 0 0 0)
196 Reallocated_Event_Count 0x0032   100   100   036    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0


SMART Error Log Version: 1
No Errors Logged


SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%        43         -
# 2  Extended offline    Interrupted (host reset)      00%         2         -


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.



Any help would be very much appreciated. Thanks in advance!

On 2013-06-06 12:26, onnodb wrote:
> Any help would be very much appreciated. Thanks in advance!

Run the smartctl long test. There is an interrupted test there.
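
For reference, a minimal sketch of how the long test could be started and its log checked afterwards. It assumes smartmontools is installed and that the drive is /dev/sdb, as in the output above. The helper name `unfinished_selftests` is made up for illustration; it only filters the self-test log for entries that did not complete cleanly.

```shell
#!/bin/sh
# Start the extended (long) self-test. It runs inside the drive itself
# (roughly 113 minutes here, going by the smartctl output above). Needs root:
#   smartctl -t long /dev/sdb
# Later, list any self-test log entries that did not finish cleanly:
#   smartctl -l selftest /dev/sdb | unfinished_selftests

# Hypothetical helper: reads smartctl self-test log text on stdin and prints
# every numbered entry whose status is not "Completed without error".
unfinished_selftests() {
    grep '^# *[0-9]' | grep -v 'Completed without error'
}
```

An "Interrupted (host reset)" entry usually just means the machine was rebooted or the link was reset while the test was running, so on its own it is not evidence of a failing disk.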


Cheers / Saludos,

Carlos E. R.
(from oS 12.3 “Dartmouth” GM (rescate 1))

onnodb wrote:
> A couple of weeks ago, I received a new HP Workstation Z820. The plan
> was to switch to Linux — as a long-time Windows user, I’ve been
> waiting for an opportunity to make the switch :slight_smile: I’ve been playing with
> OpenSUSE, and although I really like what I’ve seen thus far, I can’t
> get it to work properly on the machine. Please bear with me, as I’m a
> rather novice Linux user…
>
> The machine has two harddrives: a 128 GB SSD, plus a 1 TB disk for data
> storage.
>
> I successfully installed OpenSUSE on the SSD. However, I’ve run into
> recurrent issues with mounting the 1 TB disk. I can repartition it and
> mount it, put some files on it, and everything seems fine. However, when
> I unmount the disk (or reboot the PC), and re-mount it, its contents
> seem scrambled: I get errors that “no partition table can be found”, or
> that the partition is damaged. I can then repartition the drive,
> reformat it, and work with it, until the next remount/reboot.

Please post the entire command sequence you use, together with error
messages, prompts etc. Specifically including the umount and subsequent
mount, and use the -v option with those commands.
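
One possible way to capture such a sequence is a tiny wrapper that prints each command, its output, and its exit status, so the whole session can be pasted in one go. This is only a sketch; the name `run` is made up for illustration, and /data is the mount point used elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper: echo a command, run it with stderr merged into stdout,
# and record its exit status, so nothing is lost when copying the session.
run() {
    printf '+ %s\n' "$*"
    "$@" 2>&1
    status=$?
    printf '[exit status: %s]\n' "$status"
    return "$status"
}

# Example session, to be run as root:
#   run mount -v /data
#   run umount -v /data
#   run mount -v /data
#   dmesg | tail -n 20
```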

> The disk itself seems fine: I put it in a different PC (running
> Windows 7), where it worked fine. I also tried installing RedHat on the
> workstation, which didn’t seem to encounter any issues with the disk.
>
> Am I missing any drivers? Any configuration options I could change
> somewhere? Does this problem sound familiar to anyone at all?
>
> I’ve tried to gather some extra information about the hardware, btw,
> using some utilities I googled. I put the resulting information online
> in my Dropbox: https://www.dropbox.com/sh/b64smcgaxthnaml/QswH5sCHWv .
> The most relevant are probably the “hdparm” and “smartctl” outputs for
> the offending harddrive, which I’ll paste below:

Sorry, dropbox requires javascript, so I’ve no idea what you posted
there. Please post the new output here.

Hi Onno, welcome!!

I’d like to see some output, since I have some doubts about how you mount the disk.
Open a terminal window and do:


cat /etc/fstab
su -c 'fdisk -l'

I don’t know if you have your setup ready, but I’d do something like this:
SSD:
4 GB swap
30 GB for /
rest for /home
HDD
Folders for Music, Video, Photos etc, symlinks to those in your user’s homedir instead of the default folders.

But first the disk has to mount properly.

Thanks for the replies!

> Run the smartctl long test. There is an interrupted test there.

Will do so! I’ll try to post back later today with the results.

$ cat /etc/fstab
/dev/disk/by-id/ata-MTFDDAK128MAM-1J1_1310092E9F7A-part2 /                    ext4       acl,user_xattr        1 1
/dev/disk/by-id/ata-ST1000DM003-1CH162_Z1D4C96K-part1 /data                ext4       noauto,acl,user_xattr 0 0
/dev/disk/by-id/ata-MTFDDAK128MAM-1J1_1310092E9F7A-part1 swap                 swap       defaults              0 0
/dev/disk/by-id/ata-MTFDDAK128MAM-1J1_1310092E9F7A-part3 /home                ext4       acl,user_xattr        1 2
proc                 /proc                proc       defaults              0 0
sysfs                /sys                 sysfs      noauto                0 0
debugfs              /sys/kernel/debug    debugfs    noauto                0 0
usbfs                /proc/bus/usb        usbfs      noauto                0 0
devpts               /dev/pts             devpts     mode=0620,gid=5       0 0
$ fdisk -l

Disk /dev/sda: 128.0 GB, 128035676160 bytes
255 heads, 63 sectors/track, 15566 cylinders, total 250069680 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0009c836

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048     8386559     4192256   82  Linux swap / Solaris
/dev/sda2   *     8386560    92276735    41945088   83  Linux
/dev/sda3        92276736   176168959    41946112   83  Linux

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

And then some more details about what I’ve done just now:

1. I reinstalled OpenSUSE from scratch. I created the following partitioning scheme:
  • /dev/sda (SSD)
      • /dev/sda1 Swap
      • /dev/sda2 /
      • /dev/sda3 /home
  • /dev/sdb (1 TB platter)
      • /dev/sdb1 /data, set to not mount automatically on system startup (*)
2. After rebooting, I entered a terminal, used su to get root, and tried mount /data. This gave the error that there was no such partition. fdisk indeed showed that /dev/sdb was completely empty!
3. I then went into the partitioning tool, created a new partition table on /dev/sdb (type: MSDOS/MBR), and created one partition /dev/sdb1, formatted as ext4.
4. I could then successfully mount /data (mount /data), put a file on it, etc.
5. I then executed umount /data, and then remounted using mount /data. I got the error “mount: wrong fs type, bad option, bad superblock on /dev/sdb1, missing codepage or helper program, or other error”. The dmesg log showed two errors: “EXT4-fs (sdb1): ext4_check_descriptors: Checksum for group 128 failed (18031!=0)” and “EXT4-fs (sdb1): group descriptors corrupted!”.

(*) The reason for not mounting automatically on startup, is that if I do this, OpenSUSE refuses to boot, and kicks me into rescue mode. The reason for this is the failure to mount /data.
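
Before turning automatic mounting back on, it may help to check whether every device named in fstab actually exists at that moment. A rough sketch follows; the function name `missing_fstab_devices` is made up for illustration, and it only looks at entries whose first field is a /dev/... path, which covers the by-id entries shown above.

```shell
#!/bin/sh
# Hypothetical helper: list fstab entries whose device node is absent.
# $1: path to an fstab-style file (defaults to /etc/fstab).
missing_fstab_devices() {
    # Comment lines and pseudo filesystems (proc, sysfs, ...) do not start
    # with /dev/, so they are skipped.
    awk '$1 ~ /^\/dev\// { print $1, $2 }' "${1:-/etc/fstab}" |
    while read -r dev mnt; do
        [ -e "$dev" ] || printf '%s (for %s) does not exist\n' "$dev" "$mnt"
    done
}

# Usage: missing_fstab_devices
```

If the -part1 link for the 1 TB disk is reported missing after a reboot, that would match fdisk showing the disk as empty: the kernel found no partition table, so no partition device was created.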

Is the above information any help?

Well thought. Best is to first get it working by calling the mount manually.

What we see, is that fdisk doesn’t show any partitions on the disk. My 2 cents are that the disk uses a GPT instead of MBR. To validate this, run


su -c 'zypper in gdisk'

and next


su -c 'gdisk /dev/sdb'

Post the first ten lines of the output here. The problem would be that YaST uses fdisk, which cannot handle GPT disks.
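
As a cross-check that does not depend on any partitioning tool, the on-disk signatures can also be read directly. The sketch below assumes 512-byte logical sectors, which is what the hdparm output above reports for this drive; `table_type` is a hypothetical helper name. It only reads from the device and writes nothing.

```shell
#!/bin/sh
# Hypothetical helper: report which partition-table signature is present.
# $1: a block device (needs root) or a disk image file.
table_type() (
    dev=$1
    # Print $2 bytes starting at byte offset $1 as a lowercase hex string.
    hex_at() {
        dd if="$dev" bs=1 skip="$1" count="$2" 2>/dev/null |
            od -An -tx1 | tr -d ' \n'
    }
    # A GPT header starts at LBA 1 with the ASCII text "EFI PART".
    # Check it first, because GPT disks also carry a protective MBR.
    if [ "$(hex_at 512 8)" = "4546492050415254" ]; then
        echo GPT
    # An MBR ends sector 0 with the boot signature 55 aa. Note that some
    # filesystem boot sectors carry the same two bytes.
    elif [ "$(hex_at 510 2)" = "55aa" ]; then
        echo MBR
    else
        echo none
    fi
)

# Usage (as root): table_type /dev/sdb
```

Running it right after partitioning and again after a reboot would show whether the signature itself disappears from the disk, or whether only the tools fail to see it.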

On 2013-06-06 14:36, onnodb wrote:

>
> (*) The reason for not mounting automatically on startup, is that if
> I do this, OpenSUSE refuses to boot, and kicks me into rescue mode. The
> reason for this is the failure to mount /data.

Good enough.

> Is the above information any help?

It seems that stuff is not being written to that hard disk. Perhaps you could write a large block to
it, raw, with dd to see if there are errors in the log.

Run the long smart test.


Cheers / Saludos,

Carlos E. R.
(from oS 12.3 “Dartmouth” GM (rescate 1))

A


dd if=/dev/zero of=/dev/sdb count=100000

would destroy any GPT at the start of the disk and at the same time confirm the disk works.

First of all, the output of gdisk:

$ gdisk /dev/sdb
GPT fdisk (gdisk) version 0.8.5

Partition table scan:
  MBR: not present
  BSD: not present
  APM: not present
  GPT: not present

Creating new GPT entries.

Command (? for help): 


Then some experiments with using dd (after getting the above output from gdisk):


$ dd if=/dev/zero of=/dev/sdb count=100000
100000+0 records in
100000+0 records out
51200000 bytes (51 MB) copied, 1.08377 s, 47.2 MB/s

$ dd if=/dev/zero of=/home/onno/test.bin count=100000
100000+0 records in
100000+0 records out
51200000 bytes (51 MB) copied, 0.185385 s, 276 MB/s

$ md5sum /home/onno/test.bin 
fcea9ecefdc5e4b1f028fc282ad71de2  /home/onno/test.bin

$ dd if=/dev/sdb of=/home/onno/test.sdb.bin count=100000
100000+0 records in
100000+0 records out
51200000 bytes (51 MB) copied, 0.336165 s, 152 MB/s

$ md5sum /home/onno/test.sdb.bin 
fcea9ecefdc5e4b1f028fc282ad71de2  /home/onno/test.sdb.bin

Waiting for the long test to finish, will post back here when it’s done.

OK, the long smartctl test is done. Results:


$ smartctl -a /dev/sdb

smartctl 6.0 2012-10-10 r3643 [x86_64-linux-3.7.10-1.1-desktop] (SUSE RPM)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST1000DM003-1CH162
Serial Number:    Z1D4C96K
LU WWN Device Id: 5 000c50 05097dd32
Firmware Version: HP33
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Jun  6 16:00:09 2013 CEST

==> WARNING: A firmware update for this drive may be available,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/223651en

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:         (  584) seconds.
Offline data collection
capabilities:              (0x53) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      ( 113) minutes.
SCT capabilities:            (0x303b)    SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   117   100   006    Pre-fail  Always       -       154879496
  3 Spin_Up_Time            0x0023   098   098   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       62
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002f   100   253   030    Pre-fail  Always       -       229023
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       46
 10 Spin_Retry_Count        0x0033   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       62
180 Unknown_HDD_Attribute   0x002a   100   100   000    Old_age   Always       -       331085418
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0033   100   100   097    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   067   065   045    Old_age   Always       -       33 (Min/Max 24/35)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       61
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       313
194 Temperature_Celsius     0x0022   033   040   000    Old_age   Always       -       33 (0 20 0 0 0)
196 Reallocated_Event_Count 0x0032   100   100   036    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%        46         -
# 2  Extended offline    Interrupted (host reset)      00%        44         -
# 3  Short offline       Completed without error       00%        43         -
# 4  Extended offline    Interrupted (host reset)      00%         2         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


Mmm, don’t see anything suspicious.

Use YaST to create a partition on the disk, mountpoint /data, ext4 for the file system, leave defaults for the rest. After doing so, it should be mounted, and set to mount at boot time.
Now reboot, and check whether the partition is mounted. Yes, I know you’ve already tried that, but things have been done to the disk. Report what happens.

Thanks for staying around, I really appreciate the effort!

> Use YaST to create a partition on the disk, mountpoint /data, ext4 for the file system, leave defaults for the rest. After doing so, it should be mounted, and set to mount at boot time.
> Now reboot, and check whether the partition is mounted. Yes, I know you’ve already tried that, but things have been done to the disk. Report what happens.

OK, tried that, and the same result as before: after creating the partition (first had to create a new MBR partition table, though), everything seems OK. Then after a reboot, I end up in an emergency terminal, because the system cannot mount the partition. After setting the ‘noauto’ flag for /dev/sdb in fstab, I can reboot, but the Yast partitioning tool reports that the entire disk is unpartitioned.

It almost seems like changes to the disk somehow end up in some sort of cache, but don’t properly get written to disk; and then unmounting the disk doesn’t flush that cache??
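
One way to probe this caching suspicion is to force the data out to the device and then read it back while bypassing the kernel's page cache. A plain dd read straight after a write can be answered from memory, so matching checksums alone do not prove the data reached the platters. The sketch below is a rough idea only: it assumes GNU dd (for `conv=fsync` and `iflag=direct`), the name `roundtrip_check` is made up, and it OVERWRITES the first 1 MiB of whatever target it is given.

```shell
#!/bin/sh
# Hypothetical helper: write 1 MiB of random data to a target, flush it,
# read it back, and compare. DESTRUCTIVE: overwrites the start of the target.
# $1: block device (needs root) or a scratch file.
roundtrip_check() (
    target=$1
    tmp=$(mktemp -d) || exit 1
    read_flags=
    # Direct I/O bypasses the page cache; it is only requested for block
    # devices here, since not every filesystem supports it for plain files.
    if [ -b "$target" ]; then read_flags='iflag=direct'; fi
    if head -c 1048576 /dev/urandom > "$tmp/sample" &&
       dd if="$tmp/sample" of="$target" bs=1M count=1 conv=fsync,notrunc 2>/dev/null &&
       dd if="$target" of="$tmp/readback" bs=1M count=1 $read_flags 2>/dev/null &&
       cmp -s "$tmp/sample" "$tmp/readback"
    then rc=0; else rc=1; fi
    rm -rf "$tmp"
    exit "$rc"
)

# Usage (as root, on a disk with nothing worth keeping):
#   roundtrip_check /dev/sdb && echo "read-back matches"
```

Even a direct read can still be served from the drive's own cache, so repeating the read-back part after a full power cycle would be the stronger test.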

I’d send the disk back to the IT department for replacement if I had enough evidence that something is physically wrong with it, but the fact that a RedHat or Windows 7 installation on the same disk seems OK makes me wonder whether I’m missing something else.

onnodb wrote:
> I’d send the disk back to the IT department for replacement if I’d have
> enough pointers that something is physically wrong with the disk, but
> the fact that a RedHat or Windows 7 installation on the same disk seems
> OK makes me wonder if I’m perhaps missing out on something else?

If you partition the disk with RedHat and then get YaST to use those
partitions, does that work?

(Not a complete solution, but might give some more information)

On 2013-06-06 15:56, onnodb wrote:

> Then some experiments with using dd (after getting the above output
> from gdisk):
>
>


> --------------------
>
>   $ dd if=/dev/zero of=/dev/sdb count=100000
>   100000+0 records in
>   100000+0 records out
>   51200000 bytes (51 MB) copied, 1.08377 s, 47.2 MB/s
>
>   $ dd if=/dev/zero of=/home/onno/test.bin count=100000
>   100000+0 records in
>   100000+0 records out
>   51200000 bytes (51 MB) copied, 0.185385 s, 276 MB/s

Wait.

If /home is where that disk is mounted, notice that the previous command would have destroyed it. It
is impossible you can write a file to /home after that dd operation, without formatting again.

Some laptops have this combo of small flash, large hard disk, where the flash acts as a cache for the
larger, mechanical disk. Maybe the writes are simply not going to the hard disk. This type of setup
is not supported by Linux, only by Windows. It has a name which I don’t remember.

There are even some HDs that contain this combo transparently to the OS, with only one SATA interface.


Cheers / Saludos,

Carlos E. R.
(from oS 12.3 “Dartmouth” GM (rescate 1))

I was thinking about this too, even called a friend to boot his laptop (with one interface for a caching SSD + a HDD) from a LiveCD, he stated that it only shows /dev/sda. Not the case here. But it could be that you’re touching the origin of the issue.
Onno, is there any special reference in the BIOS, or does the BIOS report two “ordinary” disks?

BTW Carlos, I checked the entire thread, to find that the disk wasn’t mounted on /home, plus that the OP erased and repartitioned the disk already, so there wouldn’t be any data on it. I know the power of “dd”, by bad experience :D.

On 2013-06-06 23:36, Knurpht wrote:
>
> robin_listas;2563123 Wrote:

> I was thinking about this too, even called a friend to boot his laptop
> (with one interface for a caching SSD + a HDD) from a LiveCD, he stated
> that it only shows /dev/sda. Not the case here. But it could be that
> you’re touching the origin of the issue.
> Onno, is there any special reference in the BIOS, or does the BIOS
> report two “ordinary” disks?

I think there are 3 possibilities.

  • 2 completely separate devices.
  • 2 devices, seen as one, transparently, by any OS.
  • 2 separate devices handled by the BIOS or the OS as a joined combo. Linux, AFAIK, does not
    support this kind.

> BTW Carlos, I checked the entire thread, to find that the disk wasn’t
> mounted on /home, plus that the OP erased and repartitioned the disk
> already, so there wouldn’t be any data on it. I know the power of “dd”,
> by bad experience :D.

Oh, right, he was manually mounting somewhere else. Yep, on “/data”. In that case, I don’t see the
purpose of “dd if=/dev/zero of=/home/onno/test.bin count=100000”.

Yes, I also have met the power of dd :wink: but in this case I knew that there would be no
consequences (lost data), the system is in testing phase post install. What I meant is that after
erasing the partition table and the start of the disk, it should be impossible to write to a file.


Cheers / Saludos,

Carlos E. R.
(from oS 12.3 “Dartmouth” GM (rescate 1))

OK, let me clear up some of the confusion — sorry 'bout that.

  • The disks are two separate entities, and are also recognized by the BIOS as such. The workstation does have some sort of hardware RAID and disk spanning capabilities, but I’ve not enabled those. And, puzzlingly, RedHat and Windows seem to have no problems with the disks at all.
  • With the dd tests I ran in post #06-Jun-2013, 13:55, I wrote to the offending disk directly (first command), and to a file in my /home mounted on the SSD, just to check & compare. But the gist of the test was that writing to the offending disk with dd seemed to work fine: I could write and read back without problems.

The overall behavior tends to be that I can use the platter disk successfully until an unmount; and after rebooting/remounting, everything’s scrambled & gone.

OK, and to add to the mystery: Debian 7.0.0 installs fine, and has no problems with the harddisk either. I’m now reinstalling OpenSUSE, to see if it can use the harddisk as partitioned by the Debian installer…

OK, so I think I’ve worked around the problem now, by partitioning the second harddisk using the Debian installer. Still no clue why this is the case, so if anyone can help me understand what’s going on, I’d be very happy :slight_smile:

Thanks for helping out with all the troubleshooting thus far anyway!

Hi Onnodb and others.

Amazingly I have the exact same problem as you. But I was hoping for another solution…

In short I have a brand new HP Z820 workstation with 3 disks: one 128 GB SSD on which I installed openSUSE 12.3, and two 3 TB disks.

Exactly like you, when I reboot after partitioning my 3 TB disk (I tried different recipes, including gparted) my partitions are gone. If I try to automatically mount my partition through /etc/fstab then, exactly like you, openSUSE fails to start. My warning message is:
Time out waiting for device dev-sdb1
Dependency failed for /mnt

I thought this was because my disks were 3 TB, but I found help on how to partition them with gparted. It seemed to work fine until the computer was restarted. This is driving me nuts. I am not a Linux expert. I am a researcher in neuroscience, and after struggling a full day on this, it is becoming a serious issue.

If somebody was kind enough to help I would really appreciate it.
I can output error messages but I am not sure what will be helpful. Thanks a lot.