Software RAID1 (starts degraded), Debian Sarge 3.1 (stable), kernel 2.4.27, LILO

Absolutely no warranty, use entirely at your own risk; the author accepts no responsibility. It is quite likely some statements in this document are inaccurate. This HOWTO will not work on a system that uses udev. Etch and Sid use udev. This document should work on kernels 2.6.8 and older (provided the system has not migrated to udev), but will probably not work on newer 2.6 kernels. Everything considered, you have about a 1 in 3 chance this will actually work. If you should ever want to install programs from 'testing' that depend on a recent version of libc6, installing libc6 from 'testing' may remove your kernel! Oh joy.

I have a need to create a RAID1 array on an SMTP/IMAP server, and because all of our company's email is stored on that server, I must not lose the data. I am going to use software RAID1 to mirror the hard drive and make a tape backup once each day. I experimented (on a test system) for days using bits and pieces of the many HOWTOs I found on the subject of RAID1 mirroring on Linux (and specifically Debian), and I found that a lot of it did not work for me or is ancient history. Currently there is another shift in the tools used that affects how RAID functions, so in a sense this document will also soon be history. I am not an expert on the subject by any means, but I have learned a few things:

If you use this document on anything but Debian Sarge 3.1, you might lose all your data.
If you don't understand the reason you are performing a step, you might lose all your data.
If you don't know exactly what will happen when you perform a step, you might lose all your data.
If you blindly copy and paste commands in this document to your shell, you might lose all your data.
If you are not paying attention and are not free from distraction, you might lose all your data.
If you have not first practiced on a test system, you might lose all your data.
If you make typo errors, you might lose all your data.
If you do not first back up your data, you might lose all your data.
If you don't perform the steps in the proper order, you might lose all your data.
If you become impatient, you might lose all your data.
If you don't document how to repair your system in the event of a hard drive failure, you might lose all your data.

Other than that, it's really pretty simple.

The easiest way to get RAID1 functionality is to configure it using the 'partman' partition manager and 'mdcfg' presented to you when you first install Debian. Of course this is only useful if you are building a new system. This document, however, is for a system that is currently up and running on a single drive, to which you wish to add a second drive that mirrors the first. If you are building a new system and wish to configure your drives as a RAID1 array, or you are using the GRUB boot loader, this document may not be for you. See http://verchick.com/mecham/public_html/raid/raid-index.html for other choices.

This document in itself is not designed to get RAID1 functional on your production computer. It is designed to get you comfortable with doing so in a test environment. The test environment you create should be as close as possible to the system you will eventually configure RAID on. A few of the steps I perform definitely should not be performed on a production system with data on it. We may do things to illustrate or prove a point; to educate us.
When we finish training ourselves on a test system we should be confident enough to continue on to our production box. My setup is on an i386 based machine. I am using the 2.4.27 kernel, the ext3 file system and the LILO boot loader.

There is a major problem with getting RAID1 to function. The software modules that are needed to read the data from the devices in the array need to be loaded at boot time, or the devices cannot be read. The problem is, these modules are not normally included in the boot ramdisk image (/boot/initrd.img-x.x.x). For our purposes we need two modules, 'md' (multi-disk) and 'raid1' (redundant array of inexpensive/independent disks, level 1). This is very similar to the problem Windows administrators face when dealing with device drivers for hard disk controllers that are not included with Windows. You have to install the device driver on the hard drive before you install the controller, or you cannot read your hard drive. While there is evidence it may be possible, I know of no straightforward way to include the needed modules in a series of boot floppies. The bottom line is, if you cannot get the modules loaded into the ramdisk, you may not be able to boot your machine from your hard drives.

Also, it is not enough to get the modules into the boot ramdisk image. Doing so will get one or two RAID devices running, but the remainder depend on additional RAID software that loads later on in the boot process. In addition, you must get your system configured in such a way that BOTH hard drives are bootable, so you can boot your system from either drive if the other drive has failed or is removed. I have to admit that I don't understand (whether Windows or Linux) how the disk is read if the software needed to read the disk is on the disk! It must be some special low-level boot magic. I suggest reading:

man initrd
I am going to talk about both SCSI and EIDE hard drives because I have tested this with both, but the examples will be SCSI. There are not a lot of differences. Simply substitute 'sda' with 'hda', 'sdb' with 'hdc' and such. Sorry, I have not tested with SATA drives.

You will need two identical hard drives to complete this project (in addition to the one currently installed in your system). They can actually be different (within reason), but if they differ in size, the smaller drive will dictate the size of the partitions, so the drives must be equal in size to, or larger than, your current hard disk. Identical is better. Why two additional drives, you say? I thought RAID1 only used two drives. True, but once you install one of the drives in your production system we want to have a spare drive available if one of the other two drives fails. That is the point, isn't it? Besides, we are going to use the two spare drives as test drives prior to installing one of them in the production machine.

Name your hard drives so it is easier for me to refer to them. Actually label these names on them. Name one of them apple, and one of them pie. If one of the drives is smaller than the other, label the smaller of the two apple.

For an EIDE system, one drive must be installed on the primary master drive connector, and the other on the secondary master drive connector. I am going to refer to the EIDE drive that is connected to the primary master connector as being in the primary position. Linux should recognize this drive as /dev/hda. I am going to refer to the EIDE drive that is connected to the secondary master connector as being in the secondary position. Linux should recognize this drive as /dev/hdc.

For a SCSI system with both drives on one SCSI adapter, one drive should be configured as SCSI device id 0 (on my system removing all ID jumpers on the drive sets it to 0), and the other drive is typically configured as SCSI device id 1 (I add a jumper to the drive to set this). I am going to refer to the SCSI drive that is configured as SCSI device id 0 as being in the primary position. Linux should recognize this drive as /dev/sda. I am going to refer to the SCSI drive that is configured as SCSI device id 1 as being in the secondary position. Linux should recognize this drive as /dev/sdb. These statements assume both drives are installed. If only one drive is installed, it will be in the primary position and be recognized as /dev/sda regardless of the jumper setting.

For a SCSI system with each drive on a separate SCSI adapter, both drives are typically configured as SCSI device id 0 (but I prefer to set the one on the second adapter as SCSI device id 1). You may need to determine which adapter is recognized first, and which is recognized second. If the adapters are the same model by the same manufacturer this is a more difficult task. You may have to temporarily remove one of the drives to see which adapter is recognized first. Then you may want to label them. I am going to refer to the SCSI drive that is on the adapter that is recognized first as being in the primary position. Linux usually recognizes this drive as /dev/sda. I am going to refer to the SCSI drive that is on the adapter that is recognized second as being in the secondary position. Linux should recognize this drive as /dev/sdb. These statements assume both drives are installed. If only one drive is installed, it will be in the primary position and be recognized as /dev/sda regardless of which adapter it is installed on.
All the data on the two drives we use for testing will be erased. Any drive that may be used in the future to replace a drive that has failed MUST be clean. Any drive that has been used in a RAID array at any time in the past must also be cleaned. Let me explain why. Let's pretend for a moment that we used apple in a RAID array, then unplugged it and replaced it, then put it away for later use as an emergency replacement. A year from now one of our drives goes bad, so we shut the machine down and place apple in its place. Then we boot up, and to our horror, the good drive has synced itself to the data stored on apple and not the other way around. To clean a drive, install it in a system by itself, and boot up using a DBAN disk. You can change the method to Quick erase. This will write zeros to each bit on the disk. You should also have a tomsrtbt disk (or other suitable rescue disk) available.

Install a cleaned apple in the primary position and leave pie out of the computer. A good place for the CDROM drive is the secondary slave EIDE interface. Boot up using the appropriate Debian installer media. I use the stable netinst CD. Install kernel 2.4 by using 'linux' at the initial installation screen. If your production system is using the 2.6 kernel then you should install that instead (linux26 instead of linux).

This illustration shows what the end product of my test machine will look like. Some people use a separate /boot partition and I have only tested this setup with one in place. When installing Debian on your system you should set the partitions up in the same manner as your production box so you can gain experience with something closer to your setup. I will not detail installing Debian, but I will say that since this is just a test system you will only need to install the absolute minimum number of software packages. When using the partition manager you do not want to configure RAID in any way. When you are prompted to install the GRUB boot loader, instead choose <Go Back> then choose "Install the LILO boot loader". If the installer says LILO installation has failed, choose [Yes] to continue anyway. Please take notes on how your disk is configured:

device      mount  md-device  temp-mount  boot  partition-type
/dev/sdb1   /boot  /dev/md0   /mnt/md0    *     primary
/dev/sdb5   swap   /dev/md1                     logical
/dev/sdb6   /      /dev/md2   /mnt/md2          logical

Continue on with the Debian installer until you get to the point you can log in as root. If you need a tomsrtbt disk, insert a blank floppy, then:

cd /usr/local/src
You may also be able to Google for an iso image that you can use to create a bootable tomsrtbt CD - http://www.doctort.org/adam/nerd-notes/tomsrtbt-on-a-cd.html. Now you can remove apple, install pie in its place and clean it per the instructions mentioned earlier. Then place apple back in the primary position and place pie in the secondary position. Back up a few files. Your initrd.img may be a different version. If so, I suggest saving this document to your computer and doing a search and replace of the kernel version number:
cp /etc/fstab /etc/fstab-backup
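A couple of other files get modified later on, so it is worth backing them up now as well (the kernel version in the image name is only an example; use whatever your system has):

cp /etc/lilo.conf /etc/lilo.conf-backup
cp /boot/initrd.img-2.4.27-2-386 /boot/initrd.img-2.4.27-2-386-backup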
I personally need my vim, and we need to install our main program, mdadm (multi-disk administrator). Obviously you may choose to use a different editor. When installing mdadm it is imperative you answer [YES] to both questions: Do you want to start the RAID devices automatically? and Do you want to start the RAID monitor daemon?
apt-get install vim initrd-tools mdadm
cat /etc/default/mdadm
Should show something like:
# This file is automatically generated.
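The rest of the file should reflect the two answers you gave during installation. On a Sarge-era mdadm package the relevant settings look roughly like this (treat the exact variable names as an assumption and check your own file):

AUTOSTART=true
START_DAEMON=true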
Now we will include the needed modules in the ramdisk image. We start by adding them to the list of modules that are loaded at boot:

vim /etc/modules
and insert at the end of the list of modules:
md
raid1
Save and exit the file. This part is important to get right or our system will not boot. You need to copy all the modules listed in /etc/modules that deal with our hard disk drives, motherboard chipset and RAID to /etc/mkinitrd/modules (in the same order they are listed in /etc/modules). You would not need to include drivers (modules) that obviously deal with things like the CDROM drive or mouse. If you are not certain, it is better to include a module than to leave it out. If our hard drives are not recognized before our md devices, our system will not boot. You might see modules like 'ide-detect', 'ide-disk', 'ide-scsi' and others. Essentially, copy all the modules to /etc/mkinitrd/modules, then remove any that do not pertain to our hard drives (psmouse, ide-cd):
grep -vE '^$|^#' /etc/modules >> /etc/mkinitrd/modules
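Then open the resulting file and prune anything that does not relate to the disks (which modules remain depends entirely on your hardware):

vim /etc/mkinitrd/modules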
Once you have added (and possibly removed) modules there, save and exit the file. Now we make the new initrd.img. We actually end up doing this three or four different times during this setup (because our system will be going through changes). The mkinitrd program may not want to directly replace the file we are using, so we compile to a temporary file, then copy it over the top of our current initrd.img. As noted before, edit this to reflect your system if needed.
mkinitrd -o /boot/initrd.img-2.4.27-2-386-temp /lib/modules/2.4.27-2-386/
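Then copy the temporary image over the live one (the file names assume the 2.4.27-2-386 kernel; adjust to match yours):

cp /boot/initrd.img-2.4.27-2-386-temp /boot/initrd.img-2.4.27-2-386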
LILO has a menu system that can be used to choose between different configurations but it is not active by default. We want to activate it. Edit lilo.conf:
vim /etc/lilo.conf
Scroll down to this section and uncomment and configure the 'prompt' and 'timeout' settings as I show here. The 'prompt' setting provides the menu mechanism, and 'timeout' is required so it will automatically make a choice for us after 5 seconds (50 tenths):

prompt
# delay=100
timeout=50

For reference, my lilo.conf at this point with comments removed:

boot=/dev/sda
root=/dev/sda6
map=/boot/map
delay=20
prompt
timeout=50
vga=normal
default=Linux

image=/vmlinuz
        label=Linux
        read-only
        initrd=/initrd.img

image=/vmlinuz.old
        label=LinuxOLD
        read-only
        optional
        initrd=/initrd.img.old

Save and exit the file, then create a new MBR:

lilo
If you fail to see the word "Added" at least once, you are in trouble. Warnings are not fatal. In order to load the new initrd.img file into memory, we must reboot:

reboot

When the system comes back up, run 'cat /proc/mdstat' to see if we now have a system capable of using a RAID array:

cat /proc/mdstat
The resulting output on my machine:
Personalities : [raid0] [raid1] [raid5]
If [raid1] is not shown here then you are not loading the needed modules. You cannot continue past this point until you are.

Now we will copy the partition structure from apple to pie. This is one of those things you must be careful doing, because it will destroy all data on the target disk. Since we have already cleaned the target disk we should not have to --force this to work. The target disk must be of equal or greater size than the source disk. Make sure the command reflects what you want to accomplish. First run 'df' to make sure we are currently using the disk you think we are:

df

Here we copy the structure from /dev/sda to /dev/sdb:

sfdisk -d /dev/sda | sfdisk /dev/sdb

Now we will use cfdisk to edit the partition table on pie (in the secondary position) and change the partition types to "Linux raid autodetect". This may also destroy all data on a disk, so be careful you are editing the correct disk. To change the partition type, first use the up and down arrows to select a partition, then the left and right arrows to select [Type] from the menu. Press [Enter] to change the type. The type you want is 'FD' (lower case is fine). Repeat for all partitions, then [Write] the changes, then [Quit]. Your original drive should have had a partition flagged as bootable that was copied to this target drive. Make sure you don't accidentally toggle this off.

cfdisk /dev/sdb

My finished product:

                            cfdisk 2.12p

                        Disk Drive: /dev/sdb
                  Size: 18200739840 bytes, 18.2 GB
        Heads: 255   Sectors per Track: 63   Cylinders: 2212

  Name   Flags   Part Type  FS Type                [Label]    Size (MB)
 -----------------------------------------------------------------------
  sdb1   Boot    Primary    Linux raid autodetect                296.12
  sdb5           Logical    Linux raid autodetect                501.75
  sdb6           Logical    Linux raid autodetect              17396.47

At this point we will reboot again so our system properly recognizes the changes made to this disk:

reboot
Now we can start the process of creating the degraded array. We start by doing some additional cleaning of our drive in the secondary position (pie). This is to insure there are no remnants from prior RAID installations. Zero the superblock for each of the partitions we configured as type "Linux raid autodetect":
mdadm --zero-superblock /dev/sdb1
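The other two partitions from the notes above get the same treatment (adjust the device names if your layout differs):

mdadm --zero-superblock /dev/sdb5
mdadm --zero-superblock /dev/sdb6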
Now we create md devices for each of our partitions, with one partition present on each md device, and one partition missing. The ones that are missing are on our primary drive (apple). We can't add these to our array at this time because those partitions are currently in use and they are not of the type of partition we want.

The general plan is to create the RAID structure on the first RAID disk (pie), copy all the data from the original disk (apple) to that RAID disk, reboot to that degraded RAID disk, then reformat the original disk and add it to our RAID array (at which time the two disks will begin to synchronize). There are obvious risks in doing this and the process is prone to error. One thing that can be difficult to keep track of: a number of files related to RAID must of course end up on the RAID drive. When we boot to the RAID drive, it must be configured as a RAID drive. Some people first copy all the data from the original drive to the RAID drive, then modify the RAID related files on the RAID drive prior to rebooting into it. Then, if they have problems and need to make changes to the system, they often make the mistake of trying to fix the RAID related files by editing the files on the original drive. I prefer to configure everything on the original drive and then copy the data over at the very last moment. If things get really ugly we can boot up with the tomsrtbt disk and make a few changes to the original disk to enable us to boot up into it (provided we have not reformatted it yet). Then we can make the necessary changes and copy the data over once again.

Anyway, let's create the needed md devices. Edit as required and then run these one at a time:
mdadm --create /dev/md0 --level=1 --raid-disks=2 missing /dev/sdb1
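Then, following the same pattern, the other two devices from the notes above (again, edit the partition names to match your layout):

mdadm --create /dev/md1 --level=1 --raid-disks=2 missing /dev/sdb5
mdadm --create /dev/md2 --level=1 --raid-disks=2 missing /dev/sdb6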
If you get "No such file or directory" errors, you are probably using a 2.6.x kernel which is using udev and initramfs-tools (or yaird) instead of devfs/hotplug and initrd-tools. This document does not support udev. Sorry, but I have yet to get a degraded RAID installation to work on such a system. I have been able to repair a system that was once fully functional under devfs and initrd-tools and then was inadvertently upgraded to a 2.6.15 kernel (I show how later) but I'm not sure if a system like that is fully functional and you may not have the same luck I did. The only reason it worked is because the system did not migrate to udev. Once again, run 'cat /proc/mdstat':
cat /proc/mdstat
You should get something similar to this, which shows that one out of two disk devices is up [_U] for each of our md devices (the other one is missing). This is called 'degraded':

Personalities : [raid0] [raid1] [raid5]
md2 : active raid1 sdb6[1]
      16988608 blocks [2/1] [_U]
md1 : active raid1 sdb5[1]
      489856 blocks [2/1] [_U]
md0 : active raid1 sdb1[1]
      289024 blocks [2/1] [_U]

You may see a different format:

Personalities : [raid0] [raid1] [raid5]
read_ahead 1024 sectors
md2 : active raid1 scsi/host0/bus0/target1/lun0/part6[1]
      16988608 blocks [2/1] [_U]
md1 : active raid1 scsi/host0/bus0/target1/lun0/part5[1]
      489856 blocks [2/1] [_U]
md0 : active raid1 scsi/host0/bus0/target1/lun0/part1[1]
      289024 blocks [2/1] [_U]

"target 1" here is our secondary drive (/dev/sdb), so "target 1, part6" would be the same as /dev/sdb6. If your system does not show something with a structure similar to one of the two above, then you must fix it before continuing.

Now we create file systems on our md devices that match the file systems currently in use on our original devices. This also erases data on the target devices. I use ext3, and of course a swap partition:
mkfs.ext3 /dev/md0
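With the example layout used throughout this document, md1 is the swap device and md2 will hold root, so the matching commands are (adjust for your own partitioning):

mkswap /dev/md1
mkfs.ext3 /dev/md2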
When the system boots up into our RAID system, it should automatically assemble at least one of the md devices we created (so we can start the boot process), but it may not assemble the rest. This could result in a failure to complete the boot process. The task of reassembling the remaining devices is handled by /etc/init.d/mdadm-raid. This init script uses the command 'mdadm -A -s -a', which means: "automatically assemble all of our md devices using the information stored in /etc/mdadm/mdadm.conf". So we must update the information in mdadm.conf so it correctly reflects our current state (as shown by /proc/mdstat). To do so:
echo "DEVICE partitions" > /etc/mdadm/mdadm.conf
Now let's display the file this created:
cat /etc/mdadm/mdadm.conf
It should show something like this. Newer versions of mdadm will not have the "devices" lines:
DEVICE partitions
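followed by one ARRAY line per md device appended by the scan. The UUIDs are system-specific, so the shape is roughly (illustrative only):

ARRAY /dev/md0 level=raid1 num-devices=2 UUID=...
   devices=/dev/sdb1
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=...
   devices=/dev/sdb5
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=...
   devices=/dev/sdb6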
Our RAID system has changed, so once again we are going to make a new initrd.img:
rm /boot/initrd.img-2.4.27-2-386-temp
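The rm only clears out the old temporary file; the rebuild and copy are the same two commands as before (kernel version being an example):

mkinitrd -o /boot/initrd.img-2.4.27-2-386-temp /lib/modules/2.4.27-2-386/
cp /boot/initrd.img-2.4.27-2-386-temp /boot/initrd.img-2.4.27-2-386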
Before we attempt booting up into our md devices we are first going to do a test to insure our md devices are assembled after a reboot and that they are mountable during the boot process. First we make a mount point for each of our devices (all except the swap partition):
mkdir /mnt/md0
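and one for the md device that will carry root (md1 is swap, so it needs no mount point):

mkdir /mnt/md2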
Run 'free' and make a note of how much total swap space you have:

free
Then edit /etc/fstab and make some changes. At the bottom of the file place directives to mount each of our md devices to the mount points we created (or swap). Here is a sample:
vim /etc/fstab
and insert (edit as needed to reflect your system):

/dev/md0   /mnt/md0   ext3   defaults   0   0
/dev/md1   none       swap   sw         0   0
/dev/md2   /mnt/md2   ext3   defaults   0   0

With these lines added my /etc/fstab now looks like this:

proc        /proc            proc      defaults                    0   0
/dev/sda6   /                ext3      defaults,errors=remount-ro  0   1
/dev/sda1   /boot            ext3      defaults                    0   2
/dev/sda5   none             swap      sw                          0   0
/dev/hdd    /media/cdrom0    iso9660   ro,user,noauto              0   0
/dev/fd0    /media/floppy0   auto      rw,user,noauto              0   0
/dev/md0    /mnt/md0         ext3      defaults                    0   0
/dev/md1    none             swap      sw                          0   0
/dev/md2    /mnt/md2         ext3      defaults                    0   0

Save and exit the file. Let's reboot and see if this works:
reboot
When the system comes up, run 'mount' to see if the devices were mounted. There is no point continuing past this point unless they were:
mount
If you run 'free' again, it should show the total swap space is twice the size it was before. Assuming you have a swap partition, you must get this working before you continue on:

free
Then run 'cat /proc/mdstat' again and verify all the md devices that used to be there are still there. There is no point continuing past this point unless they are:
cat /proc/mdstat
OK. If everything is working (it must be working before you continue), now comes the scary part. Don't reboot until I tell you to. We are going to continue to configure RAID related files on our original drive, then we are going to copy all our data from the original devices to the md devices, then create a boot record on the secondary drive, then boot up using the md devices on the secondary drive instead of our original devices on the primary drive.

There are a few things to think about as far as the copy process goes. The machine should not be in the middle of processing stuff, so you should drop into single user mode and possibly disconnect the ethernet cable. Because you will be in single user mode you will run the actual copy commands at the console (not remotely). You should not delay once the data is transferred and it comes time to reboot. If you successfully boot up into the md devices, be aware that the data on the original drive will soon become stale, so in the event you need to boot back into the original drive you may lose data. Hopefully you will have no need to do that and your new RAID devices will hold current data from now on.

If you are able to boot up using your md devices, the scary part is over (but opportunities to destroy your system still remain). We are going to configure /etc/fstab to boot up into the md devices, we are optionally going to create another initrd.img, and we are going to tell lilo to boot into our md devices. We will start by editing /etc/fstab again. We must remove (or comment out) the lines we added previously (they were just a test), then change the corresponding /dev/sda devices in /etc/fstab to /dev/md devices.

vim /etc/fstab
and modify it in a similar manner to this sample. Of course the mount points must correctly correspond to the appropriate md devices. Refer to the notes you should have made. My finished /etc/fstab file will look like this:

proc        /proc            proc      defaults                    0   0
/dev/md2    /                ext3      defaults,errors=remount-ro  0   1
/dev/md0    /boot            ext3      defaults                    0   2
/dev/md1    none             swap      sw                          0   0
/dev/hdd    /media/cdrom0    iso9660   ro,user,noauto              0   0
/dev/fd0    /media/floppy0   auto      rw,user,noauto              0   0

Modify another file that mkinitrd uses, mkinitrd.conf:
vim /etc/mkinitrd/mkinitrd.conf
and replace the ROOT=probe entry with an entry that reflects the md device on which root (/) will be mounted, and the file system in use there:
#ROOT=probe
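The replacement entry, going by the description above (device plus file system), should look something like the line below. Treat the exact syntax as an assumption and check 'man mkinitrd.conf' on your system:

ROOT="/dev/md2 ext3"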
Note that you may also want to change MODULES=most to MODULES=dep. Doing so will make your initrd.img about half its original size. I don't know what other implications there may be in doing this. I do not know if this will adversely affect your system, so use it at your own risk. Actually, you are doing all of this at your own risk. Once again we update our initrd.img:
rm /boot/initrd.img-2.4.27-2-386-temp
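As before, after removing the old temporary file, rebuild and copy (kernel version being an example):

mkinitrd -o /boot/initrd.img-2.4.27-2-386-temp /lib/modules/2.4.27-2-386/
cp /boot/initrd.img-2.4.27-2-386-temp /boot/initrd.img-2.4.27-2-386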
Now we make a change to lilo.conf:

vim /etc/lilo.conf
I am going to comment out the "root" parameter and change it to the md device on which we will mount root:

#root=/dev/sda6
root=/dev/md2
We want to create a new LILO menu choice that provides a little protection for us should we fail to boot using our new root md device. Scroll down and make a copy of the first menu stanza (as shown next), place it below the first one, and make two changes to it. One change is to rename the label, and the other is to place our old 'root' setting in it (it will override our new default of /dev/md2). Use a [Tab] to indent the setting we add. This is what the finished product looks like:

image=/vmlinuz
        label=Linux
        read-only
#       restricted
#       alias=1
        initrd=/initrd.img

image=/vmlinuz
        label=LinuxNORAID
        read-only
#       restricted
#       alias=1
        root=/dev/sda6
        initrd=/initrd.img

Save and exit the file, then run 'lilo' to write a new boot record. Remember, do not reboot yet:
lilo
This is similar to what should result from running lilo (warnings do not seem to be a problem):

Warning: '/proc/partitions' does not match '/dev' directory structure.
If 'lilo' fails, check for errors and try again. If you get into trouble, remember that we did make a backup of this file.

If you have been following this HOWTO correctly, our md devices will still be mounted to the mount points we had in /etc/fstab when we booted up. If they are not mounted for some reason (shame on you, I told you not to reboot), you will need to remount them. For example: "mount /dev/md2 /mnt/md2", "mount /dev/md0 /mnt/md0".

Now we are going to copy our data. In my case I want to copy all the data in the root partition to /mnt/md2, and all the data in the /boot partition to /mnt/md0. The copy from root to the md mount point is straightforward, but other mount points such as /boot are not as straightforward. For those I first change to that directory, then use the period (.) to signify "here". In other words, "copy from here to there" as opposed to "copy this to that". This prevents me from copying /boot to /mnt/md0 and ending up with a /mnt/md0/boot directory instead of a /mnt/md0 directory containing all the files in the /boot directory. At the console, get into single user mode (also consider stopping programs that may be writing to the disk):
init 1
then work on the copy process. All files on the disk need to get copied so use your head:
cp -dpRx / /mnt/md2
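The /boot copy uses the change-directory trick described above, so it would go something like this:

cd /boot
cp -dpRx . /mnt/md0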
Run some tests and make sure the source and destination match for each mount point.
Fix it if they don't:
ls -al /
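Comparing directory listings of each source with its md mount point is a reasonable sanity check, for example:

ls -al /mnt/md2
ls -al /boot
ls -al /mnt/md0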
OK. Now comes the butterflies in your stomach. Knock on wood, throw salt over your shoulder, rub your lucky rabbit's foot, cross your fingers.
reboot
If it crashes, don't freak out just yet. Read this. If it reboots, run 'df' and check that it is in fact our md devices we are using. Run 'cat /proc/mdstat' again and make sure all md devices are shown there. If all is well, we are no longer using the original drive. If all is not well, it must be fixed before we continue:
df
My df looked like this:

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/md2              16721716    384700  15487588   3% /
tmpfs                   258084         0    258084   0% /dev/shm
/dev/md0                279891     21434    244006   9% /boot

If you run 'free' again (assuming you have a swap partition), it should show the total swap space is back to the original size:

free
OK. Now we will reformat the original drive (apple, /dev/sda) and then add it to our array. I hope everything is working great so far and all our files were successfully copied, because we now must destroy all data on the original drive. Run cfdisk on the original drive and (just as we did for our secondary drive) change the type of each partition to type "FD". This is the part where (if you are working on a production box) you should have a good backup of the drive, because this will destroy all the data on the original disk:

cfdisk /dev/sda
Change all the partition types, then write and quit. Make sure you have not toggled off the boot flag. Now we can add the partitions on /dev/sda to our RAID array. Edit this to suit your system. Do this one at a time:
mdadm --add /dev/md2 /dev/sda6
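The other two partitions on the original drive get added the same way (again, one at a time):

mdadm --add /dev/md0 /dev/sda1
mdadm --add /dev/md1 /dev/sda5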
Now you will just have to WAIT until the disks synchronize. NEVER REBOOT while disks are synchronizing. You can monitor the progress with:
watch -n 6 cat /proc/mdstat
Mine looks like this after a while. Notice we are now using both drives and md2 has fully synced:

Personalities : [raid0] [raid1] [raid5]
md0 : active raid1 sda1[2] sdb1[1]
      289024 blocks [2/1] [_U]
      [=>...................]  recovery =  6.7% (19520/289024) finish=0.2min speed=19520K/sec
md1 : active raid1 sda5[2] sdb5[1]
      489856 blocks [2/1] [_U]
        resync=DELAYED
md2 : active raid1 sda6[0] sdb6[1]
      16988608 blocks [2/2] [UU]

unused devices: <none>

Of course, it's [Ctrl]+c to cancel 'watch'. Once the sync has completed (and not until then), we need to tell mdadm.conf about our new drives, and make another initrd.img (for the last time):
echo "DEVICE partitions" > /etc/mdadm/mdadm.conf
This should now show all our devices are present:
cat /etc/mdadm/mdadm.conf
For the last time:
rm /boot/initrd.img-2.4.27-2-386-temp
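followed by the same rebuild-and-copy pair used each time before (kernel version being an example):

mkinitrd -o /boot/initrd.img-2.4.27-2-386-temp /lib/modules/2.4.27-2-386/
cp /boot/initrd.img-2.4.27-2-386-temp /boot/initrd.img-2.4.27-2-386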
In order for our system to boot off the md device and also be able to boot off either drive we need to update lilo.conf one more time and run 'lilo' again:
vim /etc/lilo.conf
Comment out the "boot" setting and add a new one that points to our bootable md device (mine is /dev/md0), also add a raid-extra-boot setting that instructs lilo to install a boot block on additional devices (see man lilo.conf):
#boot=/dev/sda
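The replacement lines, going by the description above, should look something like the two below. The device list for raid-extra-boot is an assumption on my part; 'man lilo.conf' describes the other accepted values (such as 'mbr-only'):

boot=/dev/md0
raid-extra-boot="/dev/sda,/dev/sdb"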
Then run 'lilo' to write a boot record to all three devices:
lilo
This is what resulted on my system:

Warning: '/proc/partitions' does not match '/dev' directory structure.
Name change: '/dev/scsi/host0/bus0/target0/lun0/disc' -> '/dev/sda'
Added Linux *
Added LinuxNORAID
Skipping /vmlinuz.old
The boot record of /dev/md0 has been updated.
Warning: /dev/sda is not on the first disk
The Master boot record of /dev/sda has been updated.
The Master boot record of /dev/sdb has been updated.

Reboot one more time to make sure everything is cool:
reboot
When it comes back run 'cat /proc/mdstat' once again:
cat /proc/mdstat
If all is well then we need to edit lilo.conf and run 'lilo' one last time. You should never boot back into a non-raid partition (such as /dev/sda6). Doing so would corrupt the data in our raid array, so we are simply going to remove that choice from the menu. Do so by commenting out the LinuxNORAID section:

#image=/vmlinuz
#       label=LinuxNORAID
#       read-only
##      restricted
##      alias=1
#       root=/dev/sda6
#       initrd=/initrd.img

Note: do not mess with the LinuxOLD section. LILO is not using it now, but it will when a kernel upgrade is performed. Now run 'lilo':
lilo
Your system is complete. I would reboot one more time just to make sure it comes up.

OK, now I'm going to simulate a failed drive. I don't recommend you try this (your system may explode), but at least you can learn from my system. I am carefully going to remove the power cable from the primary drive, apple. Once I do this, it will be "dirty" and should not be used again in this system without first being cleaned. This is what mdstat shows as a result: sda1 and sda5 still show they are up because we have not had any read/write operations on them recently; sda6 shows it has failed (Faulty).

md0 : active raid1 sda1[0] sdb1[1]
      289024 blocks [2/2] [UU]
md1 : active raid1 sda5[0] sdb5[1]
      489856 blocks [2/2] [UU]
md2 : active raid1 sda6[2](F) sdb6[1]
      16988608 blocks [2/1] [_U]

If your hardware supports hot swappable drives, I think you should mark the remaining two devices faulty (since they actually are on a failed drive), then use mdadm to remove all three faulty devices from our array before inserting the new drive. You cannot use "mdadm --remove" on devices that are in use, so they need to be set as faulty first. You do not need to do this if you are going to power down the system and replace the drive with a clean drive. Make doubly sure you are failing the partitions on the drive that has failed!
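For the hot-swap case, a sketch of those commands with the example partitions used here (illustrative only; substitute the partitions of the drive that actually failed):

# sda6 was already marked faulty by the kernel; mark the other two by hand
mdadm --fail /dev/md0 /dev/sda1
mdadm --fail /dev/md1 /dev/sda5
# now all three can be removed from their arrays
mdadm --remove /dev/md0 /dev/sda1
mdadm --remove /dev/md1 /dev/sda5
mdadm --remove /dev/md2 /dev/sda6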
Shut it down:

shutdown -h now
For consistency (and to keep my sanity) I always move the good drive to the primary position (if it is not already there) and place the new clean drive in the secondary position. We have shut down, so disconnect the good drive, clean apple, move pie (the good drive) into the primary position, place the cleaned apple in the secondary position and bring the system back up. On my system all I have to do to swap the two SCSI drives is to move the jumper from one drive to the other.

OK, my system did boot up. First we see what's going on (cat /proc/mdstat). As you can see, sdb1, sdb5 and sdb6 are missing:

md0 : active raid1 sda1[1]
      289024 blocks [2/1] [_U]
md1 : active raid1 sda5[1]
      489856 blocks [2/1] [_U]
md2 : active raid1 sda6[1]
      16988608 blocks [2/1] [_U]

We start by copying the partition structure from /dev/sda to /dev/sdb. We do this for what should now be an obvious reason: the secondary drive is empty, but it needs to have the same structure as the primary drive. If the disk was first cleaned, and is large enough, you should have no errors:
sfdisk -d /dev/sda | sfdisk /dev/sdb
We make sure the superblocks are zeroed out on the new drive (as always, be careful you do this to the correct drive). Edit as needed:

mdadm --zero-superblock /dev/sdb1
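and the other two partitions on the new drive:

mdadm --zero-superblock /dev/sdb5
mdadm --zero-superblock /dev/sdb6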
Now we add our three sdb partitions to the corresponding md's. Understand what you are doing here before you do it, edit as needed:
mdadm --add /dev/md0 /dev/sdb1
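and the remaining two (edit as needed):

mdadm --add /dev/md1 /dev/sdb5
mdadm --add /dev/md2 /dev/sdb6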
Watch them sync:

watch -n 6 cat /proc/mdstat
Once the recovery is complete (and not until then), run 'lilo' again to create boot records on all three boot devices:
lilo
We are working again. You might want to reboot to insure there are no problems.

Here is a story from the trenches. Somehow I was in a situation where I had not written a boot block to /dev/sda and I had cleaned /dev/sdb, so now I could not boot off of either drive. I thought I had lost everything. I disconnected the primary drive and reinstalled Debian and LILO on the secondary drive (now temporarily in the primary position). Then I put the primary drive back into the computer and moved the drive with the fresh Debian install to the secondary position. To my surprise the computer booted up and used the file system on the primary drive (the remaining good raid drive). Now I could reformat the secondary drive and add it back into the array. Of course I then repaired lilo.conf and ran 'lilo' once the drives had synced.

You should never try this next step on a production system because it will trash your array. You need to prove to yourself that each drive will boot up when it is the only drive in the system, so you should boot up using each drive with the other one missing. As soon as a drive boots up, log in and run 'shutdown -h now' to shut it back down. Then try the other drive. Then, if you care in the least about the integrity of the data on the system, you should clean one of the drives and install it just as you would if you were replacing a failed drive. It's not a good idea to fire up the system using both drives if each drive has been started independently.

In a RAID system it is a good idea to avoid kernel version upgrades (security upgrades should be performed, of course). I also don't recommend upgrading from a 2.4 kernel to a 2.6 kernel, but we are just experimenting here. Installing a kernel newer than 2.6.8 may replace devfs with udev. If this happens you could be in big trouble. I have not been able to repair a system once it migrates from devfs to udev. Some of the newer 2.6 kernels (from 2.6.12) no longer use mkinitrd to create the initrd.img. The set of programs now used to create the ramdisk image (initramfs-tools - run 'man mkinitramfs') for some reason may not create an initrd.img that is able to boot into our md devices. As a result, after an upgrade to one of the newer kernels, your system may not boot to the new kernel. This bug may be resolved at some point in the future (and may be resolved when you read this), but to work around the problem I was able to use mkinitrd to create the ramdisk image. The use of mkinitrd is deprecated with kernels 2.6.12 or newer. I don't know why this worked for me. I would not be surprised if it does not work for you.

I am using Sarge, but as a test I will install a kernel from 'testing' (Etch) to illustrate. I added a 'testing' source to /etc/apt/sources.list and ran 'apt-get update'. We need a newer version of initrd-tools that can be used with the newer kernel:
apt-get -t testing install initrd-tools
Now I determine the correct kernel for my architecture:
apt-cache search linux-image
I am going to install:

apt-get install linux-image-686
Running this command installed version 2.6.15-1-686 on my system. At this point the kernel may not boot into our md devices. You can try if you like to see if the bug has been fixed. If not then you will have to boot up using the old kernel (LinuxOLD). Then we will again compile a new /boot/initrd.img using mkinitrd:
cp /boot/initrd.img-2.6.15-1-686 /boot/initrd.img-2.6.15-1-686-backup
mkinitrd -o /boot/initrd.img-2.6.15-1-686-temp /lib/modules/2.6.15-1-686/
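As with the 2.4 kernel, the temporary image then has to be copied over the image LILO actually boots (file names assume the 2.6.15-1-686 package):

cp /boot/initrd.img-2.6.15-1-686-temp /boot/initrd.img-2.6.15-1-686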
It is required to run 'lilo' again:
lilo
Reboot to make sure it works.

References (alphabetical order). Not all of these are good, but all were interesting to me in one way or another. Trust me, there are a lot more documents similar to these out there:

http://alioth.debian.org/project/showfiles.php?group_id=30283&release_id=288
http://deb.riseup.net/storage/software-raid/
http://forums.whirlpool.net.au/forum-replies-archive.cfm/471585.html
http://nepotismia.com/debian/raidinstall/
http://nst.sourceforge.net/nst/docs/user/ch14.html
http://piirakka.com/misc_help/Linux/raid_starts_degraded.txt
http://thegoldenear.org/toolbox/unices/server-setup-debian.html
http://togami.com/~warren/guides/remoteraidcrazies/
http://www.debian-administration.org/articles/238
http://www.debian-administration.org/users/philcore/weblog/4
http://www.doorbot.com/guides/linux/x86/grubraid/
http://www.epimetrics.com/topics/one-page?page_id=421&topic=Bit-head%20Stuff&page_topic_id=120
http://www.james.rcpt.to/programs/debian/raid1/
http://www.linuxjournal.com/article/5898
http://www.linuxsa.org.au/mailing-list/2003-07/1270.html
http://www.linux-sxs.org/storage/raid_setup.html
http://www.parisc-linux.org/faq/raidboot-howto.html
http://www.tldp.org/HOWTO/Software-RAID-HOWTO.html
http://trinityhome.org/misc/bootable-raid1.html
http://www.vermittlungsprovision.net/367.html
http://xtronics.com/reference/SATA-RAID-debian-for-2.6.html