Setup hosed after update, cannot boot into the server at all

Hi, here is my issue.

I have an openSUSE 13.1 LXDE setup as a home server,
running Xen 4.3 and Samba.
I am using a full BTRFS install;
that is, my system drive is 1 TB with “/” on BTRFS and a swap partition on it,
and a second 1 TB drive holds my “/home” partition.
I boot from the “/” partition directly; as far as I can see, I have no “/boot” partition anywhere in sight. This is how openSUSE set it up on my last install a couple of months ago. I was having issues with the drive back then, so I got a 1 TB Seagate or something and used that as the main drive.
I also have data drives: one 2 TB drive with BTRFS
and two 3 TB drives with a multi-device RAID 1 BTRFS partition.
All are mounted in fstab on boot as data shares 1 and 2.

Xen installed and running.
Samba configured to share the data drives to all for full access.

All was working fine until two days ago.
I have been moving my media collections and other software onto the server
and setting up VMs:

Ubuntu Server for subnzb+sickbeard and others,
and Sophos UTM.
All was working, no issues or anything.

Two days ago I ran zypper up. Again, I did not see any issues with the updates, no errors or anything strange.

But after the reboot the server never came back up.

It boots to POST,
gets to the black screen where the GRUB prompt is usually displayed (you know, “Welcome to GRUB2” etc. in the top left corner), but now I only see a blinking cursor there, which holds for a second or two, then skips two lines down and sits there blinking. No further action is visible. The system is not pingable or reachable. (I get there via IPMI, so I can see the boot process, but this luxury will be gone in the next week or two as I am replacing the MB with a newer one which does not have the IPMI option.)

Can I recover from this?

I booted the server from the Recovery CD into Xfce and I can see the drives in the YaST Partitioner. What can I do?

I need a noob kind of help, as I have never fixed anything like this before. I would simply reinstall the OS, but I have too much stuff loaded there to just lose it all.

Also, can someone help me with setting up a recovery procedure to prevent this from happening again, or at least ease the recovery?

thanks.

Boot from the recovery CD again, drop to a terminal, and give me the output of:

fdisk -l

Note that that is a lower case L, not a numeral 1.

You may already know this, but just in case:

To do that, while in the terminal, sweep and select the output with your mouse cursor, including the command you issued (i.e., all terminal output); then Ctrl-Shift-C will copy it to your clipboard.

Then post it in a message here. To do that, start your message, then at the upper right side of the forum message editor, right side of the middle row, click on the # sign. That will give you Code tags, with your cursor blinking in between them. Paste the contents of your clipboard there (Ctrl-V).

We will continue from there.

Will try to do it tonight, thanks.

Here is the output of fdisk -l.

As far as I can tell:

/dev/sda is the rescue CD
/dev/sdb and /dev/sdc are my 3 TB drives that should be the BTRFS RAID 1 volume

/dev/sdd is my 2 TB BTRFS disk

/dev/sde
and /dev/sdf are the system disks, one for /home and one for “/”

Looking in GParted, sde is my boot “/” disk
and sdf is my /home disk.
Strangely enough, both disks have a swap partition on them.
I think I defined one on both, can’t remember now,
but sde1 is shown as btrfs with boot/legacy boot flags on it.

linux@linux:~> su
linux:/home/linux # fdisk -l


Disk /dev/sda: 2102 MB, 2102394880 bytes, 4106240 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0xa19f8ba9


   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            4084       12275        4096   ef  EFI (FAT-12/16/32)
/dev/sda2   *       12276     1232895      610310   83  Linux
/dev/sda3         1232896     4104192     1435648+  83  Linux


Disk /dev/sdb: 3000.6 GB, 3000592982016 bytes, 5860533168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk label type: dos
Disk identifier: 0x00000000


   Device Boot      Start         End      Blocks   Id  System
/dev/sdb4               1           1           0+  ee  GPT
Partition 4 does not start on physical sector boundary.


Disk /dev/sdc: 3000.6 GB, 3000592982016 bytes, 5860533168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk label type: dos
Disk identifier: 0x00000000


   Device Boot      Start         End      Blocks   Id  System
/dev/sdc4               1           1           0+  ee  GPT
Partition 4 does not start on physical sector boundary.


Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes, 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x00000000


   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1            2048  3907028991  1953513472   83  Linux
/dev/sdd4               1           1           0+  ee  GPT


Partition table entries are not in disk order


Disk /dev/sde: 1000.2 GB, 1000204886016 bytes, 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk label type: dos
Disk identifier: 0x00000000


   Device Boot      Start         End      Blocks   Id  System
/dev/sde1   *        2048  1743808511   871903232   83  Linux
/dev/sde2      1743808512  1953523711   104857600   82  Linux swap / Solaris
/dev/sde4               1           1           0+  ee  GPT
Partition 4 does not start on physical sector boundary.


Partition table entries are not in disk order


Disk /dev/sdf: 1000.2 GB, 1000204886016 bytes, 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x00000000


   Device Boot      Start         End      Blocks   Id  System
/dev/sdf1            2048  1743808511   871903232   83  Linux
/dev/sdf2      1743808512  1953523711   104857600   83  Linux
/dev/sdf4               1           1           0+  ee  GPT


Partition table entries are not in disk order
linux:/home/linux # 
 

At first glance:

sdf1 appears to be a mirror of sde1

sdf2 appears to be a mirror of sde2

It is odd for a CD/DVD drive to be sd*; it is normally sr*.

I need to study this a bit more and think on it. But, in the meantime, maybe someone else might spot something significant.
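A quick way to confirm what is actually on those partitions without mounting anything (a sketch, assuming blkid and btrfs-progs are available on the rescue system):

blkid /dev/sde1 /dev/sde2 /dev/sdf1 /dev/sdf2    # print the filesystem type, label and UUID of each partition
btrfs filesystem show                            # list btrfs filesystems and which devices belong to each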

Recovery disk again.

cd /media
mkdir sde1
mkdir sdf1
mkdir sde2
mkdir sdf2

Then:


mount /dev/sde1 /media/sde1
mount /dev/sdf1 /media/sdf1
mount /dev/sde2 /media/sde2
mount /dev/sdf2 /media/sdf2

Then:


ls -l sde1
ls -l sdf1
ls -l sde2
ls -l sdf2

and post the output here.

Then,


umount sde1
umount sdf1
umount sde2
umount sdf2
shutdown now

Probably he booted the Live system from a USB stick or similar. :wink:
That would indeed show up as /dev/sd*.

To the original problem, it seems that the BIOS just cannot find/load grub, as not even the boot menu shows up IIUIC.

As a first try I would just re-install the boot loader (to the MBR) and see if it boots then:
Boot to the rescue system and do the following:

mount /dev/sde1 /mnt
mount --bind /dev /mnt/dev
chroot /mnt
mount -t proc proc /proc
mount -t sysfs sysfs /sys
grub2-mkconfig -o /boot/grub2/grub.cfg
grub2-install /dev/sdb

But check first (in the BIOS) which hard disk is actually the first one in your boot order, and use that instead of /dev/sdb in the last line.
Maybe just changing the boot order (to boot from sde I suppose) would help already, though?
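If the grub2-install route is taken and completes without errors, a rough sketch of backing out of that chroot cleanly before rebooting (assuming the mounts above all succeeded):

exit                        # leave the chroot
umount /mnt/proc /mnt/sys   # these were mounted from inside the chroot
umount /mnt/dev
umount /mnt
reboot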

I was thinking that, too. :slight_smile:


Thanks, Wolfi. You have confirmed what I was thinking, as well. I first want to make certain which is his / partition; then I was thinking of getting Grub re-installed, but also running mkinitrd just for certainty (sketched below).
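That could be run from inside the same chroot described above (a sketch, assuming that chroot is still set up):

mkinitrd                                  # openSUSE's mkinitrd rebuilds the initrds for all installed kernels
grub2-mkconfig -o /boot/grub2/grub.cfg    # regenerate the boot menu afterwards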

And, yes, I was going to suggest putting the disk with the / partition as the first boot disk in the BIOS boot order (which we think is sde, but not certain at this point).

Therefore, instead of installing Grub to sdb, he would install Grub to sde, correct?

But, before trying any of that, perhaps he should try first to just change the boot order and see if that is all there is to it, since at this time we are not certain which disk has Grub installed, nor whether it is in the MBR or a partition.

I appreciate your help, here.

Yes, if sde is first in the boot order, i.e. the boot code is loaded from it, he should install grub to sde.

But, before trying any of that, perhaps he should try first to just change the boot order and see if that is all there is to it, since at this time we are not certain which disk has Grub installed, nor whether it is in the MBR or a partition.

Yes, and we have no idea from which disk his BIOS loads the boot code. And that is what’s going wrong apparently.

The BIOS does not do much more than load the MBR from one disk and call the code there.

A generic boot loader would then look for the active partition and call the boot sector there. If it doesn’t find it (because of the particular disk layout, for example), there should be an error message, not just a blinking cursor.

And if grub couldn’t load its files you would get the GRUB> prompt.

So the most likely reason here that I can imagine is that the BIOS boots from an MBR that doesn’t actually contain any boot code.
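A rough way to check that from the rescue system (a sketch; GRUB's boot.img normally leaves the string "GRUB" in the first sector, so "no GRUB string" on every disk would point in that direction):

for d in /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf; do
    printf '%s: ' "$d"
    # read the first sector (MBR) of each disk and look for GRUB's text; run as root
    dd if="$d" bs=512 count=1 2>/dev/null | grep -qa GRUB && echo "GRUB boot code found" || echo "no GRUB string in MBR"
done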

I agree that is the most likely reason. I have the feeling the BIOS is trying to boot from one of the RAID volumes.

But, I have also seen exactly the same response he is reporting when Grub is installed to the root partition and the root partition is not marked with the Boot flag.

Well, the same happens if the wrong partition is marked active of course, i.e. if there’s no actual boot code in the active partition’s (the one marked with the Boot flag) boot sector.

In his case this looks good though, I would say. The only active partition I see is /dev/sde1, his root partition.

But installing grub to the MBR should “solve” it in this case as well.

Absolutely.:wink:

Thanks, guys, for all your help.

A few things, however, raised some questions for me.

Why would I lose a perfectly working setup all of a sudden?
It all was working nicely, and after the update, bam, nothing boots.

Also, I did not do anything special during the initial install, just chose to use BTRFS as the default and split “/” and “/home” onto 2 drives.

The drives are 1 TB each, not the same manufacturer though; one is a Seagate and the second one is a WD. Both are 1 TB, hence they look like mirrors.
That is my fault, as I put a swap partition on both of them during setup (dumb, I know).
The other drives are not an issue, as they are strictly data drives with BTRFS set up after install on raw devices.

I am changing the hardware for the server though, as I don’t think my MB and CPU work OK, so I am dumping them and going with an AMD 8-core FX8350 and an ASRock MB instead, and I guess I will need to reinstall the system on the new setup. Will let you know.

Also, I did not do anything special during the initial install, just chose to use BTRFS as the default and split “/” and “/home” onto 2 drives.

Well, that is kind of special. :open_mouth:

BTRFS is still not 100%, particularly with some of the fancy stuff like splitting partitions across drives and RAID emulation.

I suspect that the update triggered some process that does not fully understand a complex BTRFS setup, which caused the problem.

When riding the cutting edge, expect to bleed. >:)

On 2014-05-25 01:46, vl1969 wrote:
>
> Thanks, guys, for all your help.
>
> A few things, however, raised some questions for me.
>
> Why would I lose a perfectly working setup all of a sudden?
> It all was working nicely, and after the update, bam, nothing boots.
>
> Also, I did not do anything special during the initial install, just chose
> to use BTRFS as the default and split “/” and “/home” onto 2 drives.

If you were using btrfs, you could simply back out to the previous
snapshot. That’s one of its features… but do not ask me how to do it.



Cheers / Saludos,

Carlos E. R.
(from 13.1 x86_64 “Bottle” at Telcontar)

On 2014-05-25 01:46, vl1969 wrote:
> Why would I lose a perfectly working setup all of a sudden?
> It all was working nicely, and after the update, bam, nothing boots.

Forgot to ask:

What is your repo list? You can get it with “zypper lr --details”.


Cheers / Saludos,

Carlos E. R.
(from 13.1 x86_64 “Bottle” at Telcontar)

But this feature won’t help with a broken boot loader setup.

On 2014-05-25 23:56, wolfi323 wrote:
>
> robin_listas;2645420 Wrote:
>> If you were using btrfs, you could simply back out to the previous
>> snapshot. That’s one of its features… but do not ask me how to do it.
>>
> But this feature won’t help with a broken boot loader setup.

I don’t know. :-?

I’m lost in this thread, because I don’t see clearly what happened.

Is the bootloader also in btrfs?


Cheers / Saludos,

Carlos E. R.
(from 13.1 x86_64 “Bottle” at Telcontar)

Apparently his system doesn’t even show a boot menu anymore.
Just a black screen with a blinking cursor.

Is the bootloader also in btrfs?

No. The boot loader code is either in the MBR or in the partition’s boot sector as you should know. :wink:
Not in any filesystem.

grub2’s files would be in the file system of course (and could therefore be restored by the snapshots feature if /boot were on the BTRFS partition), but his system doesn’t boot that far, it seems.
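For reference, a rough sketch of what such a rollback looks like with snapper once the system is bootable again (assuming snapper was configured for “/”, which the installer normally does on a btrfs root); it only restores files, not boot code in the MBR:

snapper list                   # show snapshots; zypper creates pre/post pairs around updates
snapper status PRE..POST       # PRE and POST are placeholders for the pre/post snapshot numbers of that update
snapper undochange PRE..POST   # revert the changes that update made to the root filesystem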

On 2014-05-26 00:36, wolfi323 wrote:
>
> robin_listas;2645440 Wrote:
>> I’m lost in this thread, because I don’t see clearly what happened.
> Apparently his system doesn’t even show a boot menu anymore.
> Just a black screen with a blinking cursor.

Ah… I see.

>> Is the bootloader also in btrfs?
> No. The boot loader code is either in the MBR or in the partition’s boot
> sector as you should know. :wink:
> Not in any filesystem.

Ok, right. Stage 1 is in the boot sector, and stage 2 is just beyond it,
but before the filesystem starts, in no man’s land. Then it loads the
menu, graphic background, kernel, etc., from the filesystem.

I suppose if stage 2 is loaded, grub should be able to issue some error
messages.

> grub2’s files would be in the file system of course (and could therefore
> be restored by the snapshots feature if /boot were on the BTRFS
> partition), but his system doesn’t boot that far, it seems.

Right… exactly.

Ok, then I’ll go back to lurking here; I don’t have any ideas. :slight_smile:


Cheers / Saludos,

Carlos E. R.
(from 13.1 x86_64 “Bottle” at Telcontar)