[PlanetCCRMA] PlanetCCRMA and RAID: which way to go?

Fernando Lopez-Lezcano nando@ccrma.Stanford.EDU
Wed Jun 22 17:18:02 2005


On Wed, 2005-06-22 at 16:41, Henk Jansen wrote:
> On Wed, 2005-06-22 at 11:12 -0700, Fernando Lopez-Lezcano wrote:
> > On Wed, 2005-06-22 at 04:32, Henk Jansen wrote:
> > > On Wed, 2005-06-22 at 11:45, Steve Harris wrote:
> > > > Digging around in the recesses of my memory, that might be because you
> > > > need an initrd (a thing that can load certain drivers that are needed
> > > > before kernel boot finishes). Look on the web for instructions about how
> > > > to build one; you will need the raid and md drivers in your initrd if
> > > > they're not already there.
> > > 
> > > The planetccrma distribution comes with precompiled initrd-*'s and
> > > vmlinuz-*. 
> > 
> > Hmmm, you should not need to "recompile" them. The initrd image used
> > during boot is built at kernel install time using the information about
> > peripherals that is part of /etc/modprobe.conf. 
> 
> Using the default Fedora Core 3 kernel (2.6.11-1.14_FC3), the raid0
> module is loaded and everything is working fine. Yet there's no
> mention of "raid" of any sort in the /etc/modprobe.conf file.
> 
> The planetccrma software was installed from cdrom using the "apt-get
> install ..." procedure --- I assume this is the correct procedure
> (including the build of a proper kernel)...

Yes, that is correct. Perhaps this is a driver mismatch between the
different kernels.

> > So, it could be that at
> > the time of the install the disk controller was not online and then its
> > driver did not get added to the initrd image. You could rebuild the
> > image by using mkinitrd ("man mkinitrd" for details). 
> 
> This may be it (...or not...): reading the man page for mkinitrd it says
> that
> <quote>mkinitrd automatically loads filesystem modules (such as ext3 and
> jbd), IDE  modules, all scsi_hostadapter entries in /etc/modprobe.conf,
> and raid modules if the system's root partition is on raid,
> which</quote>...

Raid itself should have been loaded because you have a configuration
file somewhere in /etc that describes it (I don't remember the name).
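
If the raid0 module does turn out to be missing from the initrd, the
rebuild with mkinitrd mentioned earlier might look roughly like the
sketch below. It only prints the command rather than running it. The
helper name mkinitrd_cmd and the /boot/initrd-<version>.img naming are
assumptions for illustration; the -f (overwrite) and --with=<module>
options are from the Red Hat mkinitrd man page, so check them against
the version installed on your system.

```shell
# Sketch: build (but do not run) a mkinitrd command line that forces the
# raid0 module into the initrd for a given kernel version.
# mkinitrd_cmd is a hypothetical helper, not part of any package.
mkinitrd_cmd() {
    kver="$1"   # kernel version string, e.g. a directory name under /lib/modules
    echo "mkinitrd -f --with=raid0 /boot/initrd-${kver}.img ${kver}"
}

# Usage (as root, after checking the printed command looks right):
#   mkinitrd_cmd <planetccrma-kernel-version>
#   eval "$(mkinitrd_cmd <planetccrma-kernel-version>)"
```

Keeping a copy of the old initrd image before overwriting it would be
prudent, so the previous boot entry keeps working.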

> My system's root partition is on a normal harddrive device *not* on a
> raid device (only my /home directory is on a raid device). Might the
> planetccrma installation procedure therefore have failed to include raid
> support? (The raid0 module does, however, live in /lib/modules for
> the various planetccrma kernels.)
> 
> > What is the raid device hanging from? 
> 
> I don't understand what you mean. Please, can you explain a bit more?

I meant, what is the controller to which the drives are attached? It
seems from the logs that you are using raid0 (striping, no redundancy).
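
One way to answer the controller question is to filter the lspci
listing for storage controllers. This is a minimal sketch: the helper
name storage_controllers is made up, and the patterns assume the usual
lspci class names ("IDE interface", "RAID bus controller", "Mass
storage controller"), which may be worded differently on your system.

```shell
# Sketch: keep only the lines of an lspci listing that look like disk
# controllers. Reads the listing on standard input.
# storage_controllers is a hypothetical helper name.
storage_controllers() {
    grep -i -E 'IDE interface|RAID|mass storage'
}

# Usage:
#   /sbin/lspci | storage_controllers
```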

> > Are the disks themselves and their
> > controller recognized during the boot process? I take it that raid is
> > recognized by the original Fedora kernel, right?
> 
> Indeed. These are the messages during a boot process with the original
> Fedora kernel:
> 
> Jun 22 00:51:35 jaki smartd[3936]: Device: /dev/hda, found in smartd
> database.
> Jun 22 00:51:35 jaki kernel: Freeing unused kernel memory: 180k freed
> Jun 22 00:51:36 jaki kernel: md: raid0 personality registered as nr 2
> Jun 22 00:51:36 jaki smartd[3936]: Device: /dev/hda, is SMART capable.
> Adding to "monitor" list.
> Jun 22 00:51:36 jaki kernel: md: Autodetecting RAID arrays.
> Jun 22 00:51:36 jaki smartd[3936]: Device: /dev/hde, opened
> Jun 22 00:51:36 jaki kernel: md: autorun ...
> Jun 22 00:51:36 jaki kernel: md: considering hdg1 ...
> Jun 22 00:51:36 jaki kernel: md:  adding hdg1 ...
> Jun 22 00:51:36 jaki smartd[3936]: Device: /dev/hde, found in smartd
> database.
> Jun 22 00:51:36 jaki kernel: md:  adding hde1 ...
> Jun 22 00:51:36 jaki kernel: md: created md0
> Jun 22 00:51:36 jaki kernel: md: bind<hde1>
> Jun 22 00:51:36 jaki kernel: md: bind<hdg1>
> Jun 22 00:51:36 jaki kernel: md: running: <hdg1><hde1>
> Jun 22 00:51:36 jaki kernel: md0: setting max_sectors to 128, segment
> boundary to 32767
> Jun 22 00:51:36 jaki kernel: raid0: looking at hdg1
> Jun 22 00:51:36 jaki kernel: raid0:   comparing hdg1(58615552) with hdg1
> (58615552)
> Jun 22 00:51:36 jaki kernel: raid0:   END
> Jun 22 00:51:36 jaki kernel: raid0:   ==> UNIQUE
> Jun 22 00:51:36 jaki kernel: raid0: 1 zones
> Jun 22 00:51:36 jaki kernel: raid0: looking at hde1
> Jun 22 00:51:36 jaki kernel: raid0:   comparing hde1(58615552) with hdg1
> (58615552)
> Jun 22 00:51:36 jaki kernel: raid0:   EQUAL
> Jun 22 00:51:36 jaki kernel: raid0: FINAL 1 zones
> Jun 22 00:51:36 jaki kernel: raid0: done.
> Jun 22 00:51:36 jaki kernel: raid0 : md_size is 117231104 blocks.
> Jun 22 00:51:36 jaki kernel: raid0 : conf->hash_spacing is 117231104
> blocks.
> Jun 22 00:51:36 jaki kernel: raid0 : nb_zone is 1.
> Jun 22 00:51:36 jaki kernel: raid0 : Allocating 4 bytes for hash.
> Jun 22 00:51:36 jaki kernel: md: ... autorun DONE.
> Jun 22 00:51:36 jaki kernel: kjournald starting.  Commit interval 5
> seconds
> 
> Then partitions hde1 and hdg1 are clearly recognized as raid devices.
> What I obtain when trying to boot with the planetccrma kernel is written
> below in my first posting.

Once you get into rescue mode you could dump the kernel messages to a
file (probably / is rw at this point) with a "dmesg>file_name". It would
be interesting to see if the devices (hard drives) are being found
during the boot process of the new kernel. 
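
As a rough sketch of that check, the saved messages can be filtered for
the lines that mention the IDE disks or the md/raid0 driver. The helper
name show_disk_lines and the file name in the usage comment are just
examples, and the pattern is deliberately loose, so it may also pick up
a few unrelated lines.

```shell
# Sketch: print the lines of a saved kernel log that mention IDE disks
# (hda, hde1, ...) or the md/raid0 driver.
# show_disk_lines is a hypothetical helper name.
show_disk_lines() {
    grep -E 'hd[a-z][0-9]*|md:|raid0' "$1"
}

# Usage in rescue mode (file name is only an example):
#   dmesg > /root/dmesg-ccrma.txt
#   show_disk_lines /root/dmesg-ccrma.txt
```

If hde and hdg never show up in the output, the controller driver is
the more likely suspect; if they do but no "md:" lines follow, the raid
side is.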

-- Fernando