
oldcpu's meandering thoughts on Computers, GNU/Linux and openSUSE

Recovering a /home partition with fsck 7000km away

I had an interesting experience today with my +88-year-old mother's computer (running openSUSE-13.1), which is a continent away. Having returned late the previous evening from an out-of-country business trip, I woke up this morning to find an SMS and an email from my sister, who lives in the same city as my mother, saying that my mother's GNU/Linux computer had major problems and was refusing to boot. My sister sent me this picture of the boot screen:

[image: photo of the boot screen]

where I noted this:

Code:
 ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
 … other errors …
 ata1.00: failed command: READ DMA EXT

 … other errors …
 ata1.00: status: { DRDY ERR }
 ata1.00: error: { UNC }
 end_request: I/O error, dev sda, sector 1102971143
 Buffer I/O error on device sda7, logical block 40902688
I was told that the computer would simply repeat that error sequence over and over.
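
A minimal sketch, not part of the original session: on a system that does boot, the failing sector numbers can be pulled out of the kernel log instead of being read off a photo. The helper name bad_sectors is hypothetical, and the pattern assumes the "I/O error, dev …, sector …" wording shown above.

```shell
#!/bin/sh
# bad_sectors is a hypothetical helper: it reads kernel log text on stdin
# and prints each distinct sector number reported in a block I/O error.
bad_sectors() {
    # Matches lines such as:
    #   end_request: I/O error, dev sda, sector 1102971143
    # "Buffer I/O error on device ..." lines are deliberately not matched.
    sed -n 's/.*I\/O error, dev [a-z0-9]*, sector \([0-9][0-9]*\).*/\1/p' | sort -n -u
}

# Example usage (assumes the kernel log is readable by the current user):
#   dmesg | bad_sectors
```

If the same few sectors keep reappearing, that points toward a physical defect on the disk rather than a one-off filesystem inconsistency.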

So I called my sister today and together we worked on my mother's computer with me on the phone.

The first thing I did was confirm that FAILSAFE boot did not work. I was 99% positive that it would not, but I was curious whether it would yield any extra information. FAILSAFE gave me this screen (these are pictures my sister took with her smartphone and immediately emailed to me):

[image: photo of the first FAILSAFE boot screen]

and this screen:

[image: photo of the second FAILSAFE boot screen]

The interesting things I picked up were: confirmation that partition /dev/sda7 was corrupted; a recommendation that fsck be run manually, WITHOUT the -a or -p options; a note that the 'directory block' was corrupted (which sounded REALLY bad to me); and some recommendations regarding journalctl and systemctl, should I ever reach a command prompt.

The fsck never got past 70%, and even after an extended wait no command prompt was ever reached. With no command prompt, the commands recommended for emergency mode could not be run.

So clearly, without further input, the openSUSE GNU/Linux v.13.1 would not boot.

But there was some good news.

I happened to know that the /dev/sda7 was the 'home' partition, so that gave me confidence that the underlying openSUSE-13.1 was good.

Hence my main goals, with me > 7,000km away from this computer, were to restore the openSUSE-13.1 boot functionality (so my mother had a PC to use, even if data was lost) and to recover as much of her data as possible, as it was unlikely my +88-year-old mother was doing regular backups of her data. Fortunately my mother is a 'hotmail' user, so with her email in the cloud, it was safe from a PC crash.

At my request, my sister booted the PC to a knoppix liveCD (which was the only liveCD she had handy) and opened a terminal. I had her type:
Code:
 su
 /etc/init.d/ssh start
 passwd
and create a password for root. She did that, and passed me the root password over the phone. The 'ssh start' was to start the 'ssh' service so I could remotely access the computer (in North America) from here in Europe.

Knoppix does not block root ssh access by default, nor does it block port 22. Its ssh security comes from not running the ssh daemon by default. In a previous trip to Canada I had set up my mother's router to allow ssh access and to forward ssh connections to her PC. I also had my mother's dynamic IP address mapped via dyndns so that I knew her router's IP address and I could then connect to her computer.

Hence, with the ssh daemon now running under a knoppix boot, I ssh'd from Europe into my mother's PC in North America, as user root.
Code:
ssh -x root@oldcpu-mom-ipaddress
This was NOT the safest way to do this … it would make more sense to do this as a regular user and then switch to root, but I struggled trying to do this on knoppix as a regular user.

Once inside my mother's computer as user root (with me sitting 7,200km away to be more exact), I felt some relief, as it was now in my hands to continue the recovery. As user root I checked that the partition /dev/sda7 could be seen with 'fdisk -l' (which it could).

I created an 'oldcputest' directory and mounted the partition with the command:
Code:
mount /dev/sda7 oldcputest
to confirm it could be mounted.

I then wanted to see what file system the /dev/sda7 was using, so I then typed:
Code:
mount
and observed it was ext4. Then I unmounted the partition with:
Code:
umount /dev/sda7
and then I scratched my head a bit. Originally I had planned, if this was an ext4 partition, to run 'fsck' with the -p or the -a option, but here the FAILSAFE error messages when booting were telling me NOT to do this.
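
A minimal sketch of the filesystem-type check described above, assuming the usual 'DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)' layout of mount output. The helper name fstype_of is hypothetical, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# fstype_of is a hypothetical helper: it reads `mount` output on stdin and
# prints the filesystem type recorded for the device given as $1.
fstype_of() {
    # Expected line layout: DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)
    awk -v dev="$1" '$1 == dev && $2 == "on" && $4 == "type" { print $5; exit }'
}

# Example usage while the partition is still mounted:
#   mount | fstype_of /dev/sda7
```

Running 'blkid /dev/sda7' as root should also report the type without mounting the partition at all, which may be preferable when the filesystem is known to be damaged.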

I did not have much time to research, so in the end I simply ran 'fsck' with no options, i.e.:
Code:
fsck /dev/sda7
and then was presented with a number of repair options. I selected 'y' (yes) for all the repairs, and after what must have been 3 or 4 dozen repair requests, I received an indication that the partition was finally clean.
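
A minimal sketch, not something run during the recovery: fsck reports its result as an exit status built by adding together the bit values listed in the fsck(8) man page, so the outcome can be checked rather than read off the scrolling output. The helper name describe_fsck_status is hypothetical.

```shell
#!/bin/sh
# describe_fsck_status is a hypothetical helper: it decodes an fsck exit
# status (a sum of the bit values documented in fsck(8)) into one line per
# condition.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    for pair in \
        "1:filesystem errors corrected" \
        "2:system should be rebooted" \
        "4:filesystem errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:checking canceled by user request" \
        "128:shared-library error"
    do
        bit=${pair%%:*}    # number before the first colon
        msg=${pair#*:}     # text after the first colon
        if [ $((status & bit)) -ne 0 ]; then
            echo "$msg"
        fi
    done
}

# Example usage on the unmounted partition:
#   fsck /dev/sda7
#   describe_fsck_status $?
```

For an ext4 partition, 'fsck -n /dev/sda7' should give a read-only preview that answers 'no' to every repair question, which is one way to see how much damage there is before committing to any changes.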

The above took about 15 to 20 minutes … which was my vulnerability time with root access open (albeit password needed). Of course a hacker would need to guess the password my sister created to gain access as root, but still my approach was not the smartest way to do this.

With the file system clean, I then had my sister reboot the PC.

My mother's PC booted to openSUSE-13.1 OK!! But the desktop came up as if it was being booted for the very first time. I was able to see this by using 'vnc' to remotely take over my mother's desktop once my sister advised me it had booted (in fact my sister was the one to suggest I take over the desktop at that point).

All my mother's data was gone from /home/mothercpu. Instead there was a massive “lost+found” directory, which I could only enter with root permissions, containing many directories of recovered data from my mother's /home/mothercpu user directory (albeit the directory names were all lost).

I was then able to restore her desktop to how it was before, by gradually copying the files, in some cases directory by directory, back to my mother's /home/mothercpu directory.
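
A minimal sketch of one way to get an overview of what fsck recovered before copying anything back. The helper name summarize_recovered is hypothetical, and the path /home/lost+found in the usage comment is an assumption about where the directory sat on this system.

```shell
#!/bin/sh
# summarize_recovered is a hypothetical helper: for each top-level entry in
# the directory given as $1 it prints the number of regular files beneath
# it, a tab, and the entry name. This helps when guessing which original
# directory each recovered entry used to be.
summarize_recovered() {
    for entry in "$1"/*; do
        [ -e "$entry" ] || continue    # skip the literal glob if the directory is empty
        count=$(find "$entry" -type f | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$count" "${entry##*/}"
    done
}

# Example usage, as root (the path is an assumption):
#   summarize_recovered /home/lost+found
```

Once an entry has been identified, copying it back with 'cp -a' and then fixing ownership with 'chown -R' is one plausible way to restore it, since recovered files are normally readable only by root.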

I was very happy with this success, as was my mother and sister.

Ok … not that dramatic … likely there was an easier and much safer way to do this without losing all her data (like I came close to doing) … but I had no time to research this nor seek help on the forum. I had only a few hours today to do this, and I will be at the office all day working tomorrow preparing for a business trip and some major technical reviews (on my paid job), and then I depart for a week out of country on business the next day - with unclear internet access (ie possibly no chance to work on my mother's PC).

Not exactly a Saturday of relaxing, shopping and watching movies, but rewarding nevertheless.

Updated 20-Feb-2015 at 06:42 by oldcpu


Comments

  1. brunomcl
    Hi Oldcpu,

    Very nice writeup, thanks.

    Did the filesystem develop subsequent errors?

    Or perhaps there was a power outage right before (or forced shutdown)?

    My first guess would be a failing HDD, with a speedy substitution in order.

    Or does the HDD still work after all these months?

    Just curious, thanks again.
  2. oldcpu
    Quote Originally Posted by brunomcl
    Did the filesystem develop subsequent errors?
    Everything worked great until 3 days ago (~7-July), when a very similar problem occurred. The recovery by my sister and myself was pretty much identical. This time there was far, far less directory structure damage, and I did not have to spend any time recovering directories from "lost+found".

    Quote Originally Posted by brunomcl
    Or perhaps there was a power outage right before (or forced shutdown)?
    There was a power outage, but it was a week before and not right before. I can't tell if there were forced shutdowns. I asked my mother how she shuts down the computer, and she does shut it down via the shutdown menu, but I also get the impression she switches off the power (via a power bar) immediately after the GUI disappears, which is too soon if that is what she does. But she claims she waits until it is fully shut down, so I will need to wait and watch her shut down a few times. She is 89 years old now, so I can't complain about her computer knowledge. She pretty much amazes everyone in the retirement complex where she lives.

    Quote Originally Posted by brunomcl
    My first guess would be a failing HDD, with a speedy substitution in order.
    This is my fear - a failing HDD. But it will be approximately 2 months before I make the long > 7000 km voyage to see my mother, and have physical access to her computer.
  3. oldcpu
    I have physical access to my mother's PC (with me now visiting her in North America), and my approach was to keep her old openSUSE-13.1, and to also install a new hard drive with two openSUSE-13.2 installs on that hard drive.

    I indirectly described my approach here in this separate blog entry https://forums.opensuse.org/entry.ph...tions-in-Grub2
    Updated 31-Aug-2015 at 08:30 by oldcpu