Volume Group disappeared

The problem: the LVM volume group on my main drive appears to have disappeared. /boot is a primary partition, with everything else inside the LVM (I’ve discovered from googling this problem, and from where I’m at now because of it, that setting things up this way probably wasn’t such a bright idea)

Circumstances:

Noticed mails had stopped working. On trying to restart dovecot & postfix, systemctl just kept freezing. To cut a long story short, I ended up trying a reboot; the system wouldn’t boot back up, and on investigating why I came across an ata.1 error, which I’d seen before and knew to be the SATA controller the drive is plugged into failing

Not wanting to just plug into another controller and put my trust in a board that’s just started showing signs that it might be dying, I decided to plug the drive into a spare machine I had here with no hdd, use a live OS disk to make some necessary edits (fstab, menu.lst etc.), then try booting off the drive in the ‘new’ machine

Got a grub error about being unable to find the root partition. Booted back into the live cd and, being pretty sure I’d made my grub edits correctly, decided to have a look at the drive layout in Yast’s Partition Manager. Everything seemed to be right, and the group and its volumes were there plain as day, so I tried another reboot and got the same thing again. Took the drive out and put it in a running machine for another look

This is where the volume group had disappeared: nowhere to be found in yast’s partitioner or kde’s partitioner. Ran lvdisplay and lvmdiskscan, both report no volume groups found

Some web searching suggests it’s quite a common issue after resizing logical volumes (which hasn’t been done in this case) and people have successfully restored the group using the group’s metadata files in /etc/lvm but of course in this case the /etc/lvm directory is inside the group I can’t access (boy am I kicking myself now!)

The timing couldn’t have been worse: it happened whilst I was installing a new drive in the machine I usually have the backups stored on, which, to cut a really long story short, meant it happened exactly at a time I was without backups of some really important files. A gazillion to one chance, but it happened to me

Any suggestions on whether there’s anything I can try to get the group back up so I can get at the files? Or even something I can use to recover the files if I can’t restore access to the group itself?

On 08/22/2012 03:16 AM, Ecky wrote:
> Any suggestions

i am no help at all on LVM (to me, it is just another layer of
complications that can go wrong)…so, this is just something to tide
you over until a real LVM guru in disaster mitigation comes along:

  1. mount the drive READ ONLY until such time as you have given up all
    hope of recovery (every write you make could be covering up/destroying
    valuable data!!)

  2. this page is one place to begin learning how to recover:
    http://en.opensuse.org/Portal:Digital_Forensics_/_Incident_Response

  3. after you have given up: still don’t WRITE mount it, and if you
    still want data recovery send the drive to an expert in the field (the
    net is loaded with ads offering data recovery services–the folks who
    know what they are doing are very expensive; if you go for the cheap
    guy in your neighborhood you might get lucky, you might not.)


dd http://tinyurl.com/DD-Caveat

Ecky wrote:
> The problem: Logical volume group on my main drive appears to have
> disappeared, /boot is a primary partition with everything else inside
> the LVM (Discovered from googling this problem and where I’m at now
> because of it that setting things up this way probably wasn’t such a
> bright idea)

Please show the output of three commands run one after the other:

pvscan
vgscan
lvscan

pvscan
  No matching physical volumes found
vgscan
  Reading all physical volumes.  This may take a while...
  No volume groups found
lvscan
  No volume groups found

Sorry for the delay replying, I’d posted the question just before going to bed

I’m about to plug another drive in and give ddrescue a try, not a tool I’ve ever had cause to use before so will post back if I have any luck, cheers guys

Ecky wrote:
> Code:
> --------------------
> pvscan
> No matching physical volumes found
> --------------------
>
>
>
> Code:
> --------------------
> vgscan
> Reading all physical volumes. This may take a while…
> No volume groups found
> --------------------
>
>
>
> Code:
> --------------------
> lvscan
> No volume groups found
> --------------------

Hmm, doesn’t seem like any LVM problem I’ve ever seen. More of a
hardware fault/configuration issue. Others will be better able to help
you find the problem, I think.

All I can think, mate, is that when I checked the drive layout in Yast’s partitioner, the partitioner did something when I closed it. Think I closed it with the Finish button; perhaps I should’ve used Abort instead

When I look at the drive in any partitioner now, it’s showing as unallocated space where the volume group ‘should’ be. It’s looking like more of a data recovery job than ‘fixing’ lvm … but I don’t know if there’s any data recovery software that will recognise a volume group that ‘used to be there’

ddrescue is running on the drive now, but I expect it will give me an image of the drive in its current state. Not at all confident it’s gonna allow me to access any of the files that were on any of the volumes

Ecky wrote:

> When I look at the drive in any partitioner now it’s showing as
> unallocated space now where the volume group ‘should’ be,

You’re not helping by not providing any evidence to work with! Please
show the fdisk listing of the drive.

Note that it is PVs you are looking for, not VGs. You might try

http://www.centos.org/docs/5/html/Cluster_Logical_Volume_Manager/mdatarecover.html
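
For anyone wanting to look for the PV by hand: as far as I know, an LVM2 physical volume carries a label in one of its first four sectors (normally sector 1) that starts with the bytes LABELONE and has the type string "LVM2 001" at byte offset 24. The Python sketch below scans an image file (for instance one made with ddrescue) read-only for such sectors and guesses where the PV began. The name find_pv_labels is made up for illustration, and a hit only says that a label survives at that spot, not that the metadata behind it is intact.

```python
import struct

SECTOR = 512
LABEL_ID = b"LABELONE"      # first 8 bytes of an LVM2 label sector
LABEL_TYPE = b"LVM2 001"    # label type string at byte offset 24
CHUNK = 2048 * SECTOR       # read 1 MiB at a time


def find_pv_labels(path):
    """Return [(label_sector, guessed_pv_start_sector), ...] for an image file.

    The file is opened read-only, so nothing is written to it.
    """
    hits = []
    with open(path, "rb") as dev:
        base = 0  # byte offset of the current chunk within the file
        while True:
            chunk = dev.read(CHUNK)
            if not chunk:
                break
            pos = chunk.find(LABEL_ID)
            while pos != -1:
                # Only sector-aligned matches with the right type string count.
                if pos % SECTOR == 0 and chunk[pos + 24:pos + 32] == LABEL_TYPE:
                    # Bytes 8..15: the label's own sector number inside the PV
                    # (little-endian), normally 1 and never more than 3.
                    (sector_in_pv,) = struct.unpack_from("<Q", chunk, pos + 8)
                    label_sector = (base + pos) // SECTOR
                    if sector_in_pv <= min(3, label_sector):
                        hits.append((label_sector, label_sector - sector_in_pv))
                pos = chunk.find(LABEL_ID, pos + 1)
            base += len(chunk)
    return hits
```

Calling find_pv_labels on the rescued image returns a list of sector pairs; the second number in each pair is a candidate start sector for the lost LVM partition. An empty list would suggest the label itself has been overwritten.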

My apologies djh-novell, didn’t realise I should’ve posted the fdisk info but now you’ve asked me for it, here it is

Disk /dev/sda: 1000.2 GB, 1000203804160 bytes
254 heads, 62 sectors/track, 124048 cylinders, total 1953523055 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00023b12


   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048     1028095      513024   83  Linux

Gotta rush out for a dentist appointment, will check out the link you gave me when I get back, cheers

That isn’t much. Where is the /boot partition you were talking about?

And of course when you click Finish instead of Abort, something is going to happen.

That is my idea too: when no PV is found there isn’t much to do.

On 2012-08-22 17:06, hcvv wrote:
>
> That isn’t much. Where is the /boot partition you were talking about?
>
> And of course when you click Finish instead of Abort, something is
> going to happen.
>
> Also my idea, when no PV is found there isn’t much to do.

I don’t know if gpart could find out where the partitions should be.

I’m afraid that things like this are what keep me from using LVM. I don’t know how to
recover from its problems, so I don’t use LVM despite its advantages…

What about a backup?


Cheers / Saludos,

Carlos E. R.
(from 12.1 x86_64 “Asparagus” at Telcontar)

hcvv, /dev/sda1 is the /boot partition

robin, I wish I hadn’t used it now mate. As for backups, a lot of stuff I have backed up, but some wasn’t at the time this happened, for reasons too long-winded to go into here

I’m trying out some partition recovery software at the moment but until it finishes scanning I don’t know how much success I’ll have

And is the size there correct?

If you only borked the partition table, you can restore it. Once all the pointers there are correct again, you will find the contents of the partitions untouched (and thus your PVs back). But to restore it you must know what it was, e.g. by having an fdisk -l listing of it (but you do not seem too good at making backups of important information :frowning: ).
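
Without a saved listing, the free ranges on the disk at least narrow down where the missing partition can have been. The Python sketch below takes the total sector count and the Start/End values from the fdisk output for /dev/sda posted above and lists the ranges no partition covers. That the lost LVM partition began directly after sda1 is only an assumption, and the function name unallocated_gaps is invented for this example.

```python
def unallocated_gaps(total_sectors, partitions, first_usable=1):
    """Return (start, end) sector ranges not covered by any partition.

    partitions: (start, end) sector pairs, end inclusive, as in the
    Start/End columns of fdisk -l. Sector 0 holds the MBR itself,
    so the search begins at first_usable.
    """
    gaps = []
    cursor = first_usable
    for start, end in sorted(partitions):
        if start > cursor:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= total_sectors - 1:
        gaps.append((cursor, total_sectors - 1))
    return gaps


# Values copied from the fdisk listing of /dev/sda in this thread.
SDA_TOTAL_SECTORS = 1953523055
SDA_PARTITIONS = [(2048, 1028095)]  # /dev/sda1, the /boot partition

for start, end in unallocated_gaps(SDA_TOTAL_SECTORS, SDA_PARTITIONS):
    size_gib = (end - start + 1) * 512 / 2**30
    print(f"free: sectors {start}-{end} ({size_gib:.1f} GiB)")
```

The large range starting at sector 1028096 is the obvious candidate, but recreating a partition entry there is a write to the disk, so it is something to try on a copy first.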

I am not sure if a partition recovery program will help you. IMHO it must rely on heuristic methods in searching for known patterns that point to e.g. a file system start. For a Linux disk to be of value it must have knowledge of Reiserfs, ext2/3/4, btrfs, etc. And in this case it should be able to recognise a PV administrative block. Thus choose your tool carefully. One that is good on disks used by Windows will not do I guess. But I have no experience.
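
To give an idea of what such a pattern search looks like, here is a rough Python sketch for one file system family only. It relies on the ext2/3/4 superblock sitting 1024 bytes into the file system, with the magic number 0xEF53 at byte 56 of the superblock and a few other fields that can be sanity-checked. The function names are made up for this example, and it is not how any particular recovery tool works.

```python
import struct

SECTOR = 512
SB_OFFSET = 1024       # superblock starts 1024 bytes into the file system
EXT_MAGIC = 0xEF53     # s_magic, little-endian u16 at byte 56 of the superblock
CHUNK = 2048 * SECTOR  # scan 1 MiB of sectors per read


def looks_like_ext_superblock(sb):
    """Cheap plausibility check on the first bytes of a candidate superblock."""
    if len(sb) < 58:
        return False
    inodes, blocks = struct.unpack_from("<II", sb, 0)     # s_inodes_count, s_blocks_count
    (log_block_size,) = struct.unpack_from("<I", sb, 24)  # block size = 1024 << value
    (magic,) = struct.unpack_from("<H", sb, 56)
    return magic == EXT_MAGIC and inodes > 0 and blocks > 0 and log_block_size <= 6


def find_ext_candidates(path, start_sector=0, max_hits=20):
    """Return sector numbers at which an ext file system might begin."""
    hits = []
    with open(path, "rb") as dev:  # read-only
        base = start_sector * SECTOR
        while len(hits) < max_hits:
            dev.seek(base)
            # Read a little past the chunk so the last sector's superblock fits.
            buf = dev.read(CHUNK + SB_OFFSET + 64)
            if len(buf) < SB_OFFSET + 58:
                break
            stop = min(CHUNK, len(buf) - SB_OFFSET - 58 + 1)
            for off in range(0, stop, SECTOR):
                sb = buf[off + SB_OFFSET:off + SB_OFFSET + 64]
                if looks_like_ext_superblock(sb):
                    hits.append((base + off) // SECTOR)
            base += CHUNK
    return hits[:max_hits]
```

Backup superblocks and leftovers from an earlier layout of the disk match as well, so each hit is only a candidate to inspect further, not proof of where a volume started.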

BTW, when you post requested information (like the fdisk -l above), please post it complete with the prompt, the command and the new prompt. Only a few mm of mouse sweep more and you convey much more information in the post (like working directory, root or not, exact command used, completeness of output).

On 2012-08-22 20:16, hcvv wrote:
> I am not sure if a partition recovery program will help you. IMHO it
> must rely on heuristic methods in searching for known patterns that
> point to e.g. a file system start. For a Linux disk to be of value it
> must have knowledge of Reiserfs, ext2/3/4, btrfs, etc. And in this case
> it should be able to recognise a PV administrative block. Thus choose
> your tool carefully. One that is good on disks used by Windows will not
> do I guess. But I have no experience.

If the partition table is bad, the first recovery tool to try is gpart.


Cheers / Saludos,

Carlos E. R.
(from 12.1 x86_64 “Asparagus” at Telcontar)

Do you mean the size of the drive itself mate or the size of the boot partition that’s on it?

If I’d realised I’d ever need a copy of the output of fdisk -l I would have it saved somewhere (I do know though that it had the existing boot partition and the rest of the space taken up by the partition containing the volume group). Some of my file/folder backups were temporarily moved into an nfs share on the drive that I’m having the problems with while I was putting a new drive in and reinstalling the OS on the machine they are usually stored on … I thought it safer to move them off the machine I was working on until I’d finished, then I was going to move them back. Halfway through doing this the problem with the motherboard happened; like I said earlier, a million to one chance of it happening just at that time, but it did

I didn’t think of that mate, thought it would have been less info for anyone helping to sift through if I just posted the part relevant to the disk in question, here’s the complete output of fdisk -l:

linux:~ # fdisk -l

Disk /dev/sda: 1000.2 GB, 1000203804160 bytes
254 heads, 62 sectors/track, 124048 cylinders, total 1953523055 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00023b12

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048     1028095      513024   83  Linux

Disk /dev/sdb: 60.0 GB, 60022480896 bytes
255 heads, 63 sectors/track, 7297 cylinders, total 117231408 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00046ae7

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1              63     4209029     2104483+  82  Linux swap / Solaris
/dev/sdb2         4209030    25189919    10490445   83  Linux
/dev/sdb3   *    25189920    52468289    13639185   83  Linux
/dev/sdb4        52468290   117226304    32379007+   5  Extended
/dev/sdb5        52468353    83939624    15735636   83  Linux
/dev/sdb6        83939688   117226304    16643308+  83  Linux

Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x7e527e52

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1              63    83875364    41937651    7  HPFS/NTFS/exFAT
/dev/sdc2        83875365  1869644699   892884667+   7  HPFS/NTFS/exFAT
/dev/sdc3      1869644700  1911590414    20972857+  83  Linux
/dev/sdc4      1911590415  1953520064    20964825   83  Linux
linux:~ #

Sorry robin I misread the first time and thought you’d said gparted, never knew about gpart so I ran it, here’s what I got

linux:~ # gpart /dev/sda

Begin scan...
Floating point exception
linux:~ #  

On 2012-08-22 23:46, Ecky wrote:

> Sorry robin I misread the first time and thought you’d said gparted,
> never knew about gpart so I ran it, here’s what I got
>
>
> Code:
> --------------------
> linux:~ # gpart /dev/sda
>
> Begin scan…
> Floating point exception
> linux:~ #
> --------------------

Wow!

You would need another version…

And you can also report that in bugzilla. Not that it will help you directly, but…


Cheers / Saludos,

Carlos E. R.
(from 12.1 x86_64 “Asparagus” at Telcontar)

I’ve since run it again mate and got this

linux:~# gpart /dev/sda

Begin scan...
Possible partition(Linux ext2), size(953867mb), offset(0mb)
End scan.

Checking partitions...
Partition(Linux ext2 filesystem): primary 
Ok.

Guessed primary partition table:
Primary partition(1)
   type: 131(0x83)(Linux ext2 filesystem)
   size: 953867mb #s(1953520000) s(63-1953520062)
   chs:  (0/1/1)-(1023/254/63)d (0/1/1)-(121600/254/61)r

Primary partition(2)
   type: 000(0x00)(unused)
   size: 0mb #s(0) s(0-0)
   chs:  (0/0/0)-(0/0/0)d (0/0/0)-(0/0/0)r

Primary partition(3)
   type: 000(0x00)(unused)
   size: 0mb #s(0) s(0-0)
   chs:  (0/0/0)-(0/0/0)d (0/0/0)-(0/0/0)r

Primary partition(4)
   type: 000(0x00)(unused)
   size: 0mb #s(0) s(0-0)
   chs:  (0/0/0)-(0/0/0)d (0/0/0)-(0/0/0)r

linux:~# 

Don’t know enough about this kind of thing to tell whether that’s gonna be useful information or not

On 2012-08-23 01:36, Ecky wrote:

> Don’t know enough about this kind of thing to tell whether that’s gonna
> be useful information or not

It does not find the LVM partition :frowning:


Cheers / Saludos,

Carlos E. R.
(from 12.1 x86_64 “Asparagus” at Telcontar)

Looks that way mate

Looking like it’s one for chalking up to experience, reinstalling the system on another drive as we speak … and I am NOT using lvm this time

Thanks for the complete output. It is always better to provide complete output (except when very large). After all, you have the problem and ask for help. Let the helpers decide what is relevant.

You seem to have another Linux on sdb. Just curious, is it still possible to boot it?

Back to sda. Yes, I wanted to know if sda1 is of the correct size to be the /boot partition.
What I also find strange is that it starts at 2048 and not at 63 like the others. What was before it?
I guess these are rather rhetorical questions to which you do not know the answer.

I do not know the tool gpart. The output is more or less a dump of the partition table. The entries of the primary partitions 2, 3 and 4 are filled with 0s. But I cannot see whether that tool did anything more than checking the partition table, like searching the disk for relevant information. If it does not, it is of no more value than fdisk (in this case).

As it is, I join the others and see no solution.

BTW, the contents of /etc are backed up in my normal back-up routine. I also regularly run some statements (fdisk -l amongst them) whose output I save, and that goes with the backup as well. Managing systems is not a task to walk over lightly.
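
Such a routine can be very small. A minimal Python sketch, assuming it is run as root from cron or by hand: it runs a list of commands and writes their output to one timestamped text file that the normal backup then picks up. The function name snapshot_commands, the file naming and the output directory are all illustrative choices, not an existing tool.

```python
import datetime
import pathlib
import subprocess


def snapshot_commands(commands, out_dir):
    """Run each command, save all output in one timestamped file, return its path."""
    out_dir = pathlib.Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    target = out_dir / f"layout-{stamp}.txt"
    with target.open("w") as fh:
        for cmd in commands:
            result = subprocess.run(cmd, capture_output=True, text=True)
            # Record the exact command and its exit status above the output.
            fh.write(f"# {' '.join(cmd)} (exit {result.returncode})\n")
            fh.write(result.stdout)
            fh.write(result.stderr)
            fh.write("\n")
    return target


# Example call (needs root; the directory is a hypothetical location):
# snapshot_commands(
#     [["fdisk", "-l"], ["pvscan"], ["vgscan"], ["lvscan"]],
#     "/root/layout-snapshots",
# )
```

The point is that the resulting file must end up on a different disk from the one it describes, otherwise it disappears together with the partition table it documents.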

On 2012-08-23 12:36, hcvv wrote:
> I do not know the tool gpart. The output is more or less a dump of the
> partition table. The entries of the primary partitions 2, 3 and 4 are
> filled with 0s. But I can not see if that tool did anything more than
> checking the partition table, like searching the disk for relevant
> information. When it does not, it is of no more value than fdisk (in
> this case).

Normally gpart scans the sectors and tries to find where the partitions start and end, allowing
you to use fdisk directly to recreate the original partition table. The few times I’ve used it, it
succeeded - but I’ve never tried to use it on LVM layouts, which I suppose are more complex
because they can have stripes (or whatever they are called).

There was a post here by a poster who has since been banned (and the post deleted), with a link to
software that claimed to be able to partially repair LVM. Who knows, but I’m curious.


Cheers / Saludos,

Carlos E. R.
(from 12.1 x86_64 “Asparagus” at Telcontar)