Grub problem? OpenSuse 11.0 on S5000XVN dual Xeon Quadricore

Hi,

I have been trying to install OpenSuse 11.0 on an Intel S5000XVN motherboard with 6 SATA disks using AHCI mode (not the BIOS RAID mode).

Since the OpenSuse installation identified the 6 HDs, I am not using BIOS RAID, and there is no RAID-BIOS driver for OpenSuse 11, I proceeded without any third-party driver.

Besides having to use edd=off on the installation boot line, everything seems fine during the installation: it completes and the system is then started from the running kernel.

However, when the machine is actually rebooted, a “black screen” appears: no grub message, nothing… It seems that there is a problem with the grub installation. The option to repair the installation did not work, but I could boot in rescue mode. However, I could not do much as I am a newbie in OpenSuse.

I am still trying to figure out the problem and running some installation tests, but does anyone have a hint that could help me?

By the way, OpenSuse 10.3 or CentOS 5.2 installations could not start on this system (at least, without a third-party driver).

Thanks.

AHCI
I had to look it up
Advanced Host Controller Interface - Wikipedia, the free encyclopedia

You are new to linux, but this goes over my head.

Grub is generally installed to the MBR of one drive (or OTHER in your case). That drive needs to be the first boot device in BIOS. During install you should have set mount points for other drives/partitions/OS’s.
You didn’t say if anything else was installed here?

No, I am not new to Linux. I have been using it for almost 12 years, but I am a new user of the OpenSuse distribution. However, I have never had problems with grub before and I do not know this piece of software in detail.

It seems that there is a bug in the grub installation because, in my last attempt, I watched the output in terminal 1 during the installation and I could see an error message from perl complaining about a value in a parameter. However, I could not read it in detail because the system was rebooting.

I tried using an extra PATA HD with a standard partition layout (besides the 6 SATA drives) and it did not work either.

I managed to run the repair tool this time. One odd thing is that the system could not understand the software RAID5 for /home and RAID0 for /scratch that I had built during the installation. Another bug in the repair tool, or a problem with the AHCI driver?

Using the repair tool, I am going to try to replace grub with Lilo and see what happens.

Thanks for the reply. It seems that the old users did not bother to answer.

Wondering if you ever resolved your issue.

I have the exact same problem, however, if I go into the BIOS, and go to the boot manager and manually select booting from the harddrive, the GRUB appears, and openSUSE will boot.

Steve

Sounds to me like you are describing changing the HD boot order in BIOS?

Grub will only appear if you make the drive with Grub on it the first boot device.

I have found the origin of the problem but not a convenient solution.

The problem is that needing the parameter edd=off indicates that the Linux kernel could not get from the BIOS which HD is the boot one. Since there are 6 of them, GRUB does not know where to boot from.

This only happens in the BIOS AHCI mode. I changed it to the compatibility mode (PATA/SATA) and I could install with edd active and everything went smoothly. The only problem is that only 4 HDs are recognized by the BIOS and Linux. So I lost the usage of 2 HDs.

Therefore, it seems to be a BIOS bug, a Linux kernel bug, or a lack of hardware support. The BIOS was the newest available on the Intel page at the time of the first post.

As soon as I get some spare time, I will reinstall this machine trying one of these options:

  • load the RAID BIOS drivers and work with RAID by BIOS instead of software RAID by linux,
  • install the boot partition on a pair of pen drives in RAID1, as all USB mass storage can be booted first of everything by this BIOS.

Any suggestions?

plage:
I thought that would be my next try, but unfortunately, I actually have 5 drives total connected :(, and as soon as I disable ahci, my DVD is gone. I suppose linux is already installed, so I could see if it boots. As for Trying to use the onboard raid bios, well that is what I initially tried. In my experience, grub complains when attempting to install on the RAID, and hence I still can’t boot (the best :wink: thing I find about the opensuse install is that grub is installed at the END, and so you have to wait for the install to complete, before you can figure out if you can boot it). I also noticed that opensuse only saw (in my 4 drive configuration) 1/4 of the total drive space in Raid 10, instead of 1/2, but it did see all the drive space in Raid 0 mode.

I also noticed weird drive ordering (when installing without a RAID defined, or in AHCI mode with PATA active), where SATA0 seemed to correspond with /dev/sdd, and SATA1 with /dev/sda, etc.

I’ll play around some more, and let you know what I find I guess, although I’m getting ready to give up, and request a different motherboard.

grrr… (that is my frustrated growl, although it probably sounds more like a little kitten :)

caf4926:
I wasn’t changing the boot order in the BIOS; they just happen to have their boot manager there, where you can specify what device to boot from immediately. (On most machines/laptops, this is the same as pressing Esc or F12 (or whatever) to have the machine bypass trying the boot order sequentially and just ask which device you want to boot from.)

goonieg:

The BIOS RAID only works correctly with the software driver that comes with this motherboard. Presently, they only have support for Suse 10.2. I would have to downgrade the OS then :’(.

Since I am stuck with this motherboard, I will probably install using /boot on two fast pen drives in RAID1 and with the 6 HDs using AHCI. I hope this works.

caf4926:

The BIOS of this motherboard only specifies the boot device as “hard disk”. If it were only one, it seems that it would be possible to boot using this option.

Your quote

installation identified the 6 HDs, I am not using BIOS RAID

6 HD’s

so which HD did you install grub to?

what is on the HD’s

Later you said:

GRUB does not know where to boot from.

It can only do what you tell it and of course it can only do that if it has boot priority from the MBR of the first HD in the BIOS boot setup.

Can you do a
fdisk -l
maybe from a live cd if you don’t have a working linux OS

caf4926:

Under AHCI, the installation only starts with the option edd=off. Does it install GRUB? I set it to install in the MBR of the first HD, but I do not know if it has actually been installed, or on which HD.

I have found out that the option edd=off implies that Linux will not try to guess the HD order from the BIOS. So there is no point in searching for the BIOS HD order.

No other linux distribution seems to start installation (or boot live) with this BIOS using AHCI.

By the way, the system after installation is fully operational as long as you do not reboot. As I said, I will give this machine another try when I have time. If anyone has some hints on how to install GRUB by hand and/or on all disks at the same time, I would appreciate it.
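
On installing GRUB by hand on several disks: a minimal sketch, assuming GRUB legacy (the 0.9x series with its `grub` shell) and assuming that every listed disk holds a copy of /boot on its first partition (for example as a Linux software RAID1). The function name `grub_commands_for` is made up for illustration. It only prints the commands, so they can be checked before anything is written to a disk.

```shell
#!/bin/sh
# Print GRUB legacy shell commands that install the boot loader into the MBR
# of every disk given as an argument. Each disk is temporarily mapped to
# (hd0), so the MBR written there is meant to work when the BIOS boots
# that disk first.
# ASSUMPTION: every disk holds a copy of /boot on its first partition.
grub_commands_for() {
    for disk in "$@"; do
        printf 'device (hd0) %s\n' "$disk"
        printf 'root (hd0,0)\n'
        printf 'setup (hd0)\n'
    done
    printf 'quit\n'
}

# Review the output first, then (as root) feed it to the GRUB legacy shell:
#   grub_commands_for /dev/sda /dev/sdb | grub --batch
```

If /boot lives on one disk only, the `root` line would have to point at that disk instead, and the result then depends on the BIOS drive order, which is exactly the unknown here.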

Hi
I’m not sure if this may be relevant, but what I found with my SLED
system was the SATA controller screen printing was reversed to how
linux read the drives when multiple disks are used. So in my case,
sda=sata connector 4 sdb=sata connector 2. I wanted to use the separate
controller chips for redundancy, I have my home drives on the RAID
controller as JBOD with software RAID.

Can you physically check serial numbers versus controller positions
then use the one in sda position to boot from in the BIOS.
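
One way to do the serial number check from a running system, as a rough sketch: on udev-based systems the directory /dev/disk/by-id normally contains symlinks whose names include the drive model and serial number and which point at the kernel device (sda, sdb, ...). The function name `map_disks` and the optional directory argument are assumptions made for illustration.

```shell
#!/bin/sh
# List whole-disk entries from a by-id style directory as "<id> -> <device>".
# The directory argument exists only to make the function easy to try out;
# on udev-based systems the real location is normally /dev/disk/by-id.
map_disks() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        # Skip if nothing matched or the entry is not a symlink.
        [ -L "$link" ] || continue
        name=$(basename "$link")
        # Partition entries end in -partN; only whole disks are of interest.
        case "$name" in
            *-part[0-9]*) continue ;;
        esac
        target=$(readlink "$link")
        printf '%s -> %s\n' "$name" "$(basename "$target")"
    done
}

# Example: map_disks /dev/disk/by-id
```

The serial numbers in the output can then be compared with the labels on the drives and the SATA connector each drive is plugged into.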


Cheers Malcolm °¿° (Linux Counter #276890)
openSUSE 11.0 x86 Kernel 2.6.25.18-0.2-default
up 1:08, 1 user, load average: 1.59, 0.60, 0.25
GPU GeForce 6600 TE/6200 TE - Driver Version: 177.80

Common problems switching to AHCI under Linux

  • AHCI controller does not work on AMD/ATI RS400-200 and RS480 HBA when MSI is enabled due to a hardware error. In order for AHCI to work users must provide the “pci=nomsi” kernel boot parameter. With MSI disabled in this way, the PCIe bus can only act as a faster PCI bus with hotplug capabilities. This is also true of the Nvidia nForce 560 chipset.
  • AHCI controller on AMD/ATI SB600 HBA can’t do 64-bit DMA transfers. 64-bit addressing is optional in AHCI 1.1 and the chip claims it can do them, but in reality it can’t, so it is disabled. After that it will be forced to do 32-bit DMA transfers. Thus DMA transfers will occur in the lower 4 GiB region of the memory, and bounce buffers must be used sometimes if there is more than 4 GiB of RAM. (Advanced Host Controller Interface - Wikipedia)
  • The VIA VT8251 South bridge suffers the same fate but it can be circumvented with the “pci=nomsi” option to force detection of the chip. This has been tested to work on 2.6.26, 2.6.24 and 2.6.20 kernels.

The reason it does not work after you re-boot is: “you don’t have grub in the right place or you don’t have the right HD set as 1st boot in BIOS”

My laptop was set as AHCI by default. When I first got it, I wiped the HD and planned to install XP, but XP could not install under that BIOS setting, so I changed it to ‘compatibility’ - and it was fine. Incidentally, Suse did not mind either way what setting it was.

If you are sure you know which HD you set grub to - Great. Just make sure it is set first in BIOS. If what you said is true:

I have found out that the option edd=off implies that the linux will not try to guess the HD order from the BIOS. So there is no point in searching BIOS HD order.

it implies that what appears to be a kernel option is applied before grub. (I’m not sure how that can be)

Earlier you said

I changed it to the compatibility mode (PATA/SATA) and I could install with edd active and everything went smooth. The only problem is that only 4 HDs are recognized by the BIOS and linux. So I lost the usage of 2 HDs.

So did it boot from Grub?
Consider working on ‘compatibility’ if it did/does work.
As I have only ever had 3 HD’s at one time - I’m unsure of the implications of your 6

caf4926:

I have not tried “pci=nomsi” during installation, although the motherboard is different from those where the problem has appeared before.

The option “edd=off” is needed to enable installation from the OpenSuse 11.0 DVD when AHCI is enabled. If you do not put that, you get a black screen at the beginning of the installation. If you put this option, the installation starts and it seems to work fine, but if I try a reboot, the machine cannot boot. There are two possibilities:

  • GRUB was not installed. How do I verify its installation?
  • GRUB is installed, but not on the HD that the BIOS actually boots from, because Linux, using the “edd=off” option, has to guess which HD is the boot HD. How can I map the HDs under Linux, maybe by serial number, as malcolmlewis suggested? If the correct HD can be identified, it would be possible to re-install GRUB by hand on this HD. How do I do that?
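
For the first question, a rough check that may help: GRUB's first-stage boot code embeds the text "GRUB" in the boot sector, so the first 512 bytes of each disk can be searched for it. This is only a sketch; the function name `has_grub_mbr` is made up for illustration, and a match shows only that GRUB's first stage sits on that disk, not that the rest of the boot chain is reachable.

```shell
#!/bin/sh
# Report whether the first 512-byte sector of a disk (or disk image) contains
# the text "GRUB", which GRUB's first-stage boot code embeds in the MBR.
# Reading a real disk such as /dev/sda needs root privileges.
has_grub_mbr() {
    # dd copies only the first sector; grep -q just reports match or no match.
    if dd if="$1" bs=512 count=1 2>/dev/null | LC_ALL=C grep -q 'GRUB'; then
        echo "$1: GRUB signature found in MBR"
    else
        echo "$1: no GRUB signature in MBR"
        return 1
    fi
}

# Example (as root): for d in /dev/sd[a-f]; do has_grub_mbr "$d"; done
```

Running it over all six disks from the rescue system should show on which disk, if any, the installer actually wrote GRUB.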

Presently, the machine is switched to compatibility mode and OpenSuse 11 is installed on 4 HDs. Linux boots with GRUB.

Do I need the 6 HDs? Yes, I bought this machine with 6 HDs because I do need a lot of space and some redundancy. If I can use the 6 HDs, I will install the system on 4 HDs, with /home as RAID5 and / as RAID1 with a spare, and the other two will be used as RAID0 (/scratch) to speed up IO during computations with heavy IO overhead. Therefore, the solution with 4 HDs is running, but it is not enough for this machine. :(

Hey…

I had a similar problem, I tried to install Windows XP SP2 (i know i am on the wrong forum but this could help) on the same motherboard.

I have 5 SATA HDD and 1 SATA DVD-RW and 2 PATA DVD-RW.

I was googling on this problem and i saw your post.

If you haven’t installed it yet (as you wished, 6 separate HDD-s) this could help you.

You need to install the OS using the Intel Deployment Assistant (the CD you got with your MB, or you can download it from the Intel website; it’s a bootable CD with some tools).

The thing is, it helped me, i see all the drives. Hope it helps you.

Good Luck!