Difference between revisions of "Mirroring Disks"

From Peter Pap's Technowiki
Latest revision as of 01:27, 18 June 2012

Solaris:

In Solaris 10, you can mirror the disks during the build process with Jumpstart, or even use ZFS mirroring. However, on older systems, you need to mirror the disks the old-fashioned way with DiskSuite 4.2.1. In this example, the device names for the two disks being mirrored are c1t0d0sX and c1t1d0sX. The disks are partitioned like this:

  /         c1t0d0s0
  swap      c1t0d0s1
  /var      c1t0d0s3
  /usr      c1t0d0s4
  /opt      c1t0d0s5
  /export   c1t0d0s6
  -         c1t0d0s7 

Slice 7 is a 50 MB partition used to store the metadatabases! IMO, it is best practice never to use slice 7 except for this purpose!

1. You need to make sure the following packages are installed first:

 SUNWmdr SUNWmdu SUNWmdx

This is the least you need to install! Oh, and they need bzip to install successfully! They can be found on "Solaris 8 Software CD 2" (not sure for Solaris 9, but they could be part of the standard distro), in the following location:

 cd $PATH_TO_CDROM/Solaris_8/EA/products/DiskSuite_4.2.1/sparc/Packages
 pkgadd -d . SUNWmdr SUNWmdu SUNWmdx

Once they are installed, reboot the box. It wouldn't hurt to do an 'init 0' and then a 'boot -r'!

2. Copy the primary disk's VTOC to the secondary disk:

  prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2

3. Create the metadatabases

  metadb -f -a -c3 /dev/dsk/c1t0d0s7
  metadb -f -a -c3 /dev/dsk/c1t1d0s7

4. Create and initialise the metadevices on the primary disk

  metainit -f d10 1 1 c1t0d0s0
  metainit -f d11 1 1 c1t0d0s1
  metainit -f d13 1 1 c1t0d0s3
  metainit -f d14 1 1 c1t0d0s4
  metainit -f d15 1 1 c1t0d0s5
  metainit -f d16 1 1 c1t0d0s6
  metainit d0 -m d10
  metainit d1 -m d11
  metainit d3 -m d13
  metainit d4 -m d14
  metainit d5 -m d15
  metainit d6 -m d16
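
The twelve commands above follow one pattern: submirror d1N on slice N, then a one-way mirror dN on top of it. As a rough sketch (the function name print_metainit is made up for illustration, and it only prints the commands rather than running them), the same list can be generated with a loop:

```shell
#!/bin/sh
# Print (do not run) the metainit commands for the primary disk.
# $1 is the disk name; the remaining arguments are the slice numbers.
print_metainit() {
    disk=$1; shift
    # First the forced submirrors on the mounted slices...
    for s in "$@"; do
        echo "metainit -f d1$s 1 1 ${disk}s$s"
    done
    # ...then the one-way mirrors that sit on top of them.
    for s in "$@"; do
        echo "metainit d$s -m d1$s"
    done
}

# Slice list taken from the example layout above (slice 7 holds the metadatabases).
print_metainit c1t0d0 0 1 3 4 5 6
```

Check the output against the list above before piping it to sh.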

5. Assign the root metadevice

  metaroot d0

6. Edit /etc/vfstab and make sure that all the devices are changed to metadevices. So each line that looked like this

  /dev/dsk/c1t0d0s3 /dev/rdsk/c1t0d0s3  /var  ufs   1 no    -

now looks like

  /dev/md/dsk/d3 /dev/md/rdsk/d3  /var  ufs   1 no    -
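
If you would rather not edit every line by hand, the substitution can be sketched with sed. This is only an illustration that assumes the primary disk is c1t0d0 and that slice N maps to metadevice dN, as in this example; vfstab_to_md is a hypothetical helper name. It reads vfstab text on stdin and writes to stdout, so nothing is changed in place:

```shell
#!/bin/sh
# Rewrite vfstab entries for disk c1t0d0 so that slice N points at metadevice dN.
vfstab_to_md() {
    sed -e 's|/dev/dsk/c1t0d0s\([0-9]\)|/dev/md/dsk/d\1|' \
        -e 's|/dev/rdsk/c1t0d0s\([0-9]\)|/dev/md/rdsk/d\1|'
}

# Demo on the /var line from the text above.
echo '/dev/dsk/c1t0d0s3 /dev/rdsk/c1t0d0s3  /var  ufs   1 no    -' | vfstab_to_md
```

Something like "vfstab_to_md < /etc/vfstab > /tmp/vfstab.new" followed by a diff lets you review the result before copying it over the real file.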

7. Run the lockfs command and reboot

  lockfs -fa
  reboot

8. Create the metadevices on the secondary disk

  metainit d20 1 1 c1t1d0s0
  metainit d21 1 1 c1t1d0s1
  metainit d23 1 1 c1t1d0s3
  metainit d24 1 1 c1t1d0s4
  metainit d25 1 1 c1t1d0s5
  metainit d26 1 1 c1t1d0s6 

9. Attach the metadevices from the secondary disk to the metadevices we booted from to start the mirroring process

  metattach d0 d20
  metattach d1 d21
  metattach d3 d23
  metattach d4 d24
  metattach d5 d25
  metattach d6 d26

You can monitor the progress of mirroring with this command:

  metastat | grep -i progress

When it returns nothing, it's all finished.
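
Rather than re-running that by hand, a small polling loop can wait for you. This is a sketch only: the helper names and the 60 second interval are my own choices, and it assumes metastat mentions "progress" for every submirror that is still resyncing, as described above.

```shell
#!/bin/sh
# Succeeds (exit 0) while the metastat text on stdin still mentions a resync.
# Output goes to /dev/null instead of using grep -q, which older greps lack.
resync_in_progress() {
    grep -i progress >/dev/null
}

# Poll until no submirror reports progress any more.
wait_for_mirrors() {
    while metastat | resync_in_progress; do
        sleep 60
    done
    echo "mirroring finished"
}
```

Call wait_for_mirrors after the metattach commands and it returns once the grep comes back empty.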

10. Install the boot block on the secondary hard disk so that you can boot off it

  installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c1t1d0s0

NOTE: If you get this error with the installboot command

 dd: unrecognized operand `oseek'=`1' Try `dd --help' for more information.

then it is using the wrong version of 'dd'. Put /usr/bin at the front of your PATH so that /usr/bin/dd is found first.

11. Disable the following in /etc/inetd.conf

  100229/1 (rpc.metad)
  100230/1 (rpc.metamhd)

12. Add this line to /etc/system so that the machine will boot if there are only 50% of the metadatabases available, i.e. a disk has failed

  set md:mirrored_root_flag=1


You're now finished :)

CentOS/RedHat

OK, so software RAID on CentOS/RedHat sucks! Unless you build the box from the outset with mirrored disks, you're going to need to do a lot of work to get to a point where you're going to be happy. Wherever possible, mirror the disks during build! However, if the box is already built and you need to mirror the boot disk after the fact, here's how you do it. I have tested this on CentOS 5.5.

This is a worked example that uses the following disk layout:

 df -h 
 
 Filesystem            Size  Used Avail Use% Mounted on
 /dev/sda6             3.9G  283M  3.4G   8% /
 /dev/sda7              40G  251M   37G   1% /var
 /dev/sda3             7.8G  764M  6.7G  11% /usr
 /dev/sda2             7.8G  147M  7.3G   2% /home
 /dev/sda1              99M   12M   82M  13% /boot
 tmpfs                1006M     0 1006M   0% /dev/shm
 
 fdisk -l
 
 Disk /dev/sda: 73.4 GB, 73407865856 bytes
 255 heads, 63 sectors/track, 8924 cylinders
 Units = cylinders of 16065 * 512 = 8225280 bytes
 
    Device Boot      Start         End      Blocks   Id  System
 /dev/sda1   *           1          13      104391   83  Linux
 /dev/sda2              14        1057     8385930   83  Linux
 /dev/sda3            1058        2101     8385930   83  Linux
 /dev/sda4            2102        8924    54805747+   5  Extended
 /dev/sda5            2102        3145     8385898+  82  Linux swap / Solaris
 /dev/sda6            3146        3667     4192933+  83  Linux
 /dev/sda7            3668        8924    42226821   83  Linux
 
 parted -l
 
 Model: FUJITSU MAV2073RCSUN72G (scsi)
 Disk /dev/sda: 73.4GB
 Sector size (logical/physical): 512B/512B
 Partition Table: msdos
 
 Number  Start   End     Size    Type      File system  Flags
  1      32.3kB  107MB   107MB   primary   ext3         boot 
  2      107MB   8694MB  8587MB  primary   ext3              
  3      8694MB  17.3GB  8587MB  primary   ext3              
  4      17.3GB  73.4GB  56.1GB  extended                    
  5      17.3GB  25.9GB  8587MB  logical   linux-swap        
  6      25.9GB  30.2GB  4294MB  logical   ext3              
  7      30.2GB  73.4GB  43.2GB  logical   ext3 

1. Make sure that 'mdadm' is installed:

 yum install mdadm

2. Load the RAID kernel modules

 modprobe linear
 modprobe multipath
 modprobe raid0
 modprobe raid1
 modprobe raid5
 modprobe raid6
 modprobe raid10

3. If the second disk has previously been used in a server with RAID, you will need to stop and remove any metadevices that it had

 mdadm --stop /dev/md0
 mdadm --stop /dev/md1
 mdadm --stop /dev/md2
 mdadm --stop /dev/md3
 mdadm --stop /dev/md4
 mdadm --stop /dev/md5
 mdadm --remove /dev/md0
 mdadm --remove /dev/md1
 mdadm --remove /dev/md2
 mdadm --remove /dev/md3
 mdadm --remove /dev/md4
 mdadm --remove /dev/md5

4. Copy the partition table from one disk to the other

 sfdisk -d /dev/sda | sfdisk /dev/sdb

5. Change the partition type of the second disk to 'raid':

 fdisk /dev/sdb
 
 The number of cylinders for this disk is set to 8924.
 There is nothing wrong with that, but this is larger than 1024,
 and could in certain setups cause problems with:
 1) software that runs at boot time (e.g., old versions of LILO)
 2) booting and partitioning software from other OSs
    (e.g., DOS FDISK, OS/2 FDISK)
 
 Command (m for help): t
 Partition number (1-7): 1
 Hex code (type L to list codes): fd
 
 Command (m for help): t
 Partition number (1-7): 2
 Hex code (type L to list codes): fd

Etc. Repeat for the remaining partitions (3, 5, 6 and 7; leave the extended partition 4 alone), then write the table and exit:

 Command (m for help): w

The output from the parted command after this should look like:

 parted -l
 
 Model: FUJITSU MAV2073RCSUN72G (scsi)
 Disk /dev/sdb: 73.4GB
 Sector size (logical/physical): 512B/512B
 Partition Table: msdos
 
 Number  Start   End     Size    Type      File system  Flags     
  1      32.3kB  107MB   107MB   primary   ext3         boot, raid
  2      107MB   8694MB  8587MB  primary   ext3         raid      
  3      8694MB  17.3GB  8587MB  primary   ext3         raid      
  4      17.3GB  73.4GB  56.1GB  extended                         
  5      17.3GB  25.9GB  8587MB  logical   linux-swap   raid      
  6      25.9GB  30.2GB  4294MB  logical   ext3         raid      
  7      30.2GB  73.4GB  43.2GB  logical   ext3         raid 
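
Steps 4 and 5 could probably be combined by editing the sfdisk dump on its way to the second disk instead of walking through fdisk. The sketch below is an assumption on my part, not something tested in this write-up: it relies on the dump using "Id=NN" fields (older sfdisk) or "type=NN" fields (newer util-linux), and to_raid_ids is a made-up helper name. The sample lines are worked out from the fdisk listing above.

```shell
#!/bin/sh
# Change partition type 83 (Linux) and 82 (swap) to fd (Linux raid autodetect)
# in an sfdisk dump read from stdin. The extended partition (type 5) is left alone.
to_raid_ids() {
    sed -e 's/Id=8[23]/Id=fd/' \
        -e 's/type=8[23]/type=fd/'
}

# Demo on three sample dump lines; nothing is written to any disk here.
to_raid_ids <<'EOF'
/dev/sda1 : start=       63, size=   208782, Id=83, bootable
/dev/sda4 : start= 33752565, size=109611495, Id= 5
/dev/sda5 : start= 33752628, size= 16771797, Id=82
EOF
```

The real pipeline would then look like "sfdisk -d /dev/sda | to_raid_ids | sfdisk /dev/sdb". Check the result with parted -l as shown above before going any further.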

6. To make sure that there are no remains from previous RAID installations on /dev/sdb, we run the following commands:

 mdadm --zero-superblock /dev/sdb1
 mdadm --zero-superblock /dev/sdb2
 mdadm --zero-superblock /dev/sdb3
 mdadm --zero-superblock /dev/sdb5
 mdadm --zero-superblock /dev/sdb6
 mdadm --zero-superblock /dev/sdb7

If there was nothing to zero, you'll get this response:

 mdadm: Unrecognised md component device - /dev/sdb1

All good!

7. Build the RAID devices using the 'mdadm' command:

 mdadm -C /dev/md0 -n 2 -l 1 missing /dev/sdb1
 mdadm -C /dev/md1 -n 2 -l 1 missing /dev/sdb2
 mdadm -C /dev/md2 -n 2 -l 1 missing /dev/sdb3
 mdadm -C /dev/md3 -n 2 -l 1 missing /dev/sdb5
 mdadm -C /dev/md4 -n 2 -l 1 missing /dev/sdb6
 mdadm -C /dev/md5 -n 2 -l 1 missing /dev/sdb7
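
Note that the md numbers do not line up with the partition numbers, because the extended partition 4 is skipped. A minimal sketch of a loop that prints the same six commands (print_md_create is a hypothetical name, and it only echoes, it does not create anything):

```shell
#!/bin/sh
# Print the mdadm create commands for a set of degraded RAID1 arrays.
# $1 is the disk; the remaining arguments are the partition numbers to use.
# md devices are numbered from 0 in the order the partitions are given.
print_md_create() {
    disk=$1; shift
    n=0
    for p in "$@"; do
        echo "mdadm -C /dev/md${n} -n 2 -l 1 missing ${disk}${p}"
        n=$((n + 1))
    done
}

# Partition 4 is the extended partition, so it is deliberately left out.
print_md_create /dev/sdb 1 2 3 5 6 7
```

The "missing" keyword is what leaves the first half of each mirror empty until the sda partitions are added later.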

Check your work:

 cat /proc/mdstat

which should give you something like

 cat /proc/mdstat
 Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
 md5 : active raid1 sdb7[1]
       42226752 blocks [2/1] [_U]
   
 md4 : active raid1 sdb6[1]
       4192832 blocks [2/1] [_U]
     
 md3 : active raid1 sdb5[1]
       8385792 blocks [2/1] [_U]
     
 md2 : active raid1 sdb3[1]
       8385856 blocks [2/1] [_U]
     
 md1 : active raid1 sdb2[1]
       8385856 blocks [2/1] [_U]
     
 md0 : active raid1 sdb1[1]
       104320 blocks [2/1] [_U]
     
 unused devices: <none>
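
The [2/1] [_U] on each array means one of two members is present, which is exactly what we want at this point. If you want to pick the degraded arrays out of that output automatically, something like the following awk sketch should do it (degraded_arrays is a made-up name; it assumes the two-line-per-array layout shown above):

```shell
#!/bin/sh
# List arrays whose status field contains "_" (a missing member), e.g. [_U].
# Reads /proc/mdstat style text on stdin.
degraded_arrays() {
    awk '/^md[0-9]+ :/ { name = $1 }
         /blocks/ && /\[[U_]*_[U_]*\]/ { print name }'
}

# Typical use on a live system:
#   degraded_arrays < /proc/mdstat
```

Right now it should list all six arrays. After the sda partitions have been added and synced it should print nothing.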


8. Create a file system on each metadevice (md3 becomes swap)

 mkfs.ext3 -j /dev/md0
 mkfs.ext3 -j /dev/md1
 mkfs.ext3 -j /dev/md2
 mkfs.ext3 -j /dev/md4
 mkfs.ext3 -j /dev/md5
 mkswap /dev/md3

9. Update /etc/mdadm.conf

 mdadm --detail --scan > /etc/mdadm.conf 

10. Make a mount point and mount all the partitions:

 mkdir /mnt/md
 mount /dev/md4 /mnt/md
 cd /mnt
 mkdir md/boot
 mkdir md/var
 mkdir md/usr
 mkdir md/home
 chmod -R 755 md/
 mount /dev/md0 /mnt/md/boot
 mount /dev/md1 /mnt/md/home
 mount /dev/md2 /mnt/md/usr
 mount /dev/md5 /mnt/md/var

11. Edit /etc/fstab to use the appropriate metadevices.

 LABEL=/                 /                       ext3    defaults        1 1
 LABEL=/var              /var                    ext3    defaults        1 2
 LABEL=/usr              /usr                    ext3    defaults        1 2
 LABEL=/home             /home                   ext3    defaults        1 2
 LABEL=/boot             /boot                   ext3    defaults        1 2
 tmpfs                   /dev/shm                tmpfs   defaults        0 0
 devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
 sysfs                   /sys                    sysfs   defaults        0 0
 proc                    /proc                   proc    defaults        0 0
 LABEL=SWAP-sda5         swap                    swap    defaults        0 0
 

becomes

 /dev/md4                /                       ext3    defaults        1 1
 /dev/md5                /var                    ext3    defaults        1 2
 /dev/md2                /usr                    ext3    defaults        1 2
 /dev/md1                /home                   ext3    defaults        1 2
 /dev/md0                /boot                   ext3    defaults        1 2
 tmpfs                   /dev/shm                tmpfs   defaults        0 0
 devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
 sysfs                   /sys                    sysfs   defaults        0 0
 proc                    /proc                   proc    defaults        0 0
 /dev/md3                swap                    swap    defaults        0 0
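
The same edit can be sketched with awk, keyed on the mount point rather than the label. The mapping below is the one from this worked example and fstab_to_md is a hypothetical helper, so adjust both to your own layout. It reads fstab text on stdin and prints the result without touching the file:

```shell
#!/bin/sh
# Replace the device field of selected fstab entries with the matching
# metadevice, chosen by mount point (second field). Comment lines are skipped.
fstab_to_md() {
    awk '
        BEGIN {
            md["/"]     = "/dev/md4"; md["/var"]  = "/dev/md5"
            md["/usr"]  = "/dev/md2"; md["/home"] = "/dev/md1"
            md["/boot"] = "/dev/md0"; md["swap"]  = "/dev/md3"
        }
        ($2 in md) && $1 !~ /^#/ { sub(/^[^ \t]+/, md[$2]) }
        { print }
    '
}

# Demo on one line; the real thing would be: fstab_to_md < /etc/fstab
echo 'LABEL=/var              /var                    ext3    defaults        1 2' | fstab_to_md
```

The column alignment shifts a little because the device names differ in length, which fstab does not care about.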

12. Edit the mtab file to reflect the changes we've made

  vim /etc/mtab

so it looks like

 /dev/md4 / ext3 rw 0 0
 proc /proc proc rw 0 0
 sysfs /sys sysfs rw 0 0
 devpts /dev/pts devpts rw,gid=5,mode=620 0 0
 /dev/md5 /var ext3 rw 0 0
 /dev/md2 /usr ext3 rw 0 0
 /dev/md1 /home ext3 rw 0 0
 /dev/md0 /boot ext3 rw 0 0
 tmpfs /dev/shm tmpfs rw 0 0
 none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
 sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0

13. Edit /boot/grub/menu.lst to allow fallback to the next kernel in the event of a boot failure:

  vim /boot/grub/menu.lst

and add

 default=0
 fallback=1

14. Add an entry in menu.lst to allow booting off the metadevice root

 title CentOS (2.6.18-194.26.1.el5 metadevices)
         root (hd1,0)
         kernel /vmlinuz-2.6.18-194.26.1.el5 ro root=/dev/md4
         initrd /initrd-2.6.18-194.26.1.el5.img
 title CentOS (2.6.18-194.26.1.el5)
         root (hd0,0)
         kernel /vmlinuz-2.6.18-194.26.1.el5 ro root=LABEL=/
         initrd /initrd-2.6.18-194.26.1.el5.img
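
Since the two stanzas differ only in the GRUB disk and the root= value, they can be generated. This is just a sketch: grub_entry is a name I made up, and the title text it prints is illustrative rather than the exact titles used above.

```shell
#!/bin/sh
# Print one GRUB (legacy) menu.lst stanza.
# $1 = kernel version, $2 = GRUB disk (hd0 or hd1), $3 = value for root=
grub_entry() {
    printf 'title CentOS (%s %s)\n' "$1" "$2"
    printf '        root (%s,0)\n' "$2"
    printf '        kernel /vmlinuz-%s ro root=%s\n' "$1" "$3"
    printf '        initrd /initrd-%s.img\n' "$1"
}

# The metadevice entry first, then the original entry as the fallback.
grub_entry 2.6.18-194.26.1.el5 hd1 /dev/md4
grub_entry 2.6.18-194.26.1.el5 hd0 LABEL=/
```

With default=0 and fallback=1 from the previous step, the first stanza is tried first and the second one is used if it fails.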

15. Rebuild ramdisk to reflect our new reality:

 mv /boot/initrd-`uname -r`.img /boot/initrd-`uname -r`.img_orig
 mkinitrd /boot/initrd-`uname -r`.img `uname -r`

16. Copy the files from disk 1 to the newly built raid

 rsync -aXv --exclude='/proc/*' --exclude='/sys/*' --exclude='/dev/*' --exclude=/mnt/md / /mnt/md

17. Set up GRUB on both disks

 grub
 Probing devices to guess BIOS drives. This may take a long time.
 
 
     GNU GRUB  version 0.97  (640K lower / 3072K upper memory)
 
  [ Minimal BASH-like line editing is supported.  For the first word, TAB
    lists possible command completions.  Anywhere else TAB lists the possible
    completions of a device/filename.]
 grub> root (hd0,0)
 root (hd0,0)
  Filesystem type is ext2fs, partition type 0x83
 grub> setup (hd0)
 setup (hd0)
  Checking if "/boot/grub/stage1" exists... no
  Checking if "/grub/stage1" exists... yes
  Checking if "/grub/stage2" exists... yes
  Checking if "/grub/e2fs_stage1_5" exists... yes
  Running "embed /grub/e2fs_stage1_5 (hd0)"...  15 sectors are embedded.
 succeeded
  Running "install /grub/stage1 (hd0) (hd0)1+15 p (hd0,0)/grub/stage2 /grub/grub.conf"... succeeded
 Done.
 grub> root (hd1,0)
 root (hd1,0)
  Filesystem type is ext2fs, partition type 0xfd
 grub> setup (hd1)
 setup (hd1)
  Checking if "/boot/grub/stage1" exists... no
  Checking if "/grub/stage1" exists... yes
  Checking if "/grub/stage2" exists... yes
  Checking if "/grub/e2fs_stage1_5" exists... yes
  Running "embed /grub/e2fs_stage1_5 (hd1)"...  15 sectors are embedded.
 succeeded
  Running "install /grub/stage1 (hd1) (hd1)1+15 p (hd1,0)/grub/stage2 /grub/grub.conf"... succeeded
 Done.
 grub> quit
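
The interactive session above can also be scripted. This is only a sketch, assuming GRUB legacy's batch mode, which reads commands from standard input. The helper name grub_setup_cmds is made up for this example. It just prints the commands so you can eyeball them before piping them into grub.

```shell
#!/bin/sh
# Hypothetical helper: print the GRUB legacy commands from step 17
# for each BIOS disk named on the command line (e.g. hd0 hd1).
grub_setup_cmds() {
    for disk in "$@"; do
        echo "root ($disk,0)"
        echo "setup ($disk)"
    done
    echo "quit"
}

# Review the output first, then feed it to grub, e.g.:
#   grub_setup_cmds hd0 hd1 | grub --batch
grub_setup_cmds hd0 hd1
```

Printing first and piping second keeps the destructive part (writing boot sectors) as a deliberate extra step.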

18. Reboot and hope it works!

19. After the reboot, check that the system has booted from the metadevices

 df -h

should show:

 Filesystem            Size  Used Avail Use% Mounted on
 /dev/md4              4.0G  286M  3.5G   8% /
 /dev/md5               40G  209M   38G   1% /var
 /dev/md2              7.9G  954M  6.6G  13% /usr
 /dev/md1              7.9G  147M  7.4G   2% /home
 /dev/md0               99M   15M   80M  16% /boot
 tmpfs                1006M     0 1006M   0% /dev/shm
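
If you'd rather not compare the df output by eye, the check can be sketched as a small filter. It assumes the mount points of this worked example (/, /var, /usr, /home, /boot) and the POSIX output format of df -P. The name non_md_mounts is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P` output on stdin and list any of the
# mount points from step 19 that are NOT backed by a /dev/md device.
non_md_mounts() {
    awk 'NR > 1 && $6 ~ /^\/(var|usr|home|boot)?$/ && $1 !~ /^\/dev\/md[0-9]+$/ { print $6 }'
}

# Usage sketch: prints nothing when every filesystem is on a metadevice.
#   df -P | non_md_mounts
```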

20. Run fdisk on the original disk to change all the partitions to RAID volumes

 fdisk /dev/sda
 The number of cylinders for this disk is set to 8924.
 There is nothing wrong with that, but this is larger than 1024,
 and could in certain setups cause problems with:
 1) software that runs at boot time (e.g., old versions of LILO)
 2) booting and partitioning software from other OSs
    (e.g., DOS FDISK, OS/2 FDISK)
 
 Command (m for help): t
 Partition number (1-7): 1
 Hex code (type L to list codes): fd
 Changed system type of partition 1 to fd (Linux raid autodetect)
 
 Command (m for help): t
 Partition number (1-7): 2
 Hex code (type L to list codes): fd
 Changed system type of partition 2 to fd (Linux raid autodetect)
 .
 .
 .
 Command (m for help): w
 The partition table has been altered!
 
 Calling ioctl() to re-read partition table.
 Syncing disks.

21. Add the disk 1 partitions to the meta devices

 mdadm --add /dev/md0 /dev/sda1
 mdadm --add /dev/md1 /dev/sda2
 mdadm --add /dev/md2 /dev/sda3
 mdadm --add /dev/md3 /dev/sda5
 mdadm --add /dev/md4 /dev/sda6
 mdadm --add /dev/md5 /dev/sda7
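
The six commands above follow one pattern, so they can be generated from a list of pairs. A minimal sketch, using the array-to-partition mapping of this worked example. PAIRS and print_add_cmds are names invented here, and the function only prints the commands rather than running them.

```shell
#!/bin/sh
# Array:partition pairs copied from step 21.
PAIRS="md0:sda1 md1:sda2 md2:sda3 md3:sda5 md4:sda6 md5:sda7"

# Hypothetical helper: print one mdadm command per pair without running it.
print_add_cmds() {
    for pair in $PAIRS; do
        md=${pair%%:*}      # text before the colon, e.g. md0
        part=${pair##*:}    # text after the colon, e.g. sda1
        echo "mdadm --add /dev/$md /dev/$part"
    done
}

print_add_cmds
```

Once the printed list matches your layout, pipe it to sh to run it for real.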

22. Monitor the re-sync with

 cat /proc/mdstat

until all meta devices have, for want of a better word, synced.
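
Rather than re-running cat by hand, the wait can be sketched as a loop. This assumes that /proc/mdstat shows the word "resync" or "recovery" for any array that is still rebuilding. The helper name md_busy is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) while the given mdstat file still
# reports a resync or recovery in progress. Defaults to /proc/mdstat.
md_busy() {
    grep -Eq 'resync|recovery' "${1:-/proc/mdstat}"
}

# Usage sketch: poll once a minute until every array has finished.
#   while md_busy; do sleep 60; done; cat /proc/mdstat
```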

23. Update /etc/mdadm.conf to reflect the new situation

 mdadm --examine --scan > /etc/mdadm.conf

which should result in mdadm.conf looking something like:

 ARRAY /dev/md0 level=raid1 num-devices=2 UUID=98b0f3a8:e65cf4ba:78362d52:4149fec6
 ARRAY /dev/md1 level=raid1 num-devices=2 UUID=ad326160:3c07ea40:2743f05c:01d0ba7d
 ARRAY /dev/md2 level=raid1 num-devices=2 UUID=90db8ba9:a813a3b1:5773fe4e:92575c88
 ARRAY /dev/md3 level=raid1 num-devices=2 UUID=1b3b7297:80b9ffc4:9bf9523a:b9249fe4
 ARRAY /dev/md4 level=raid1 num-devices=2 UUID=7a1f48f9:847a1414:07e67ace:0f90e882
 ARRAY /dev/md5 level=raid1 num-devices=2 UUID=76f18601:8d99c321:b2af7455:316847af
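
A quick sanity check on the generated file, sketched under the assumption that the ARRAY lines look like the ones above. The helper name count_mirrors is made up here. This example expects six two-device mirrors.

```shell
#!/bin/sh
# Hypothetical helper: count ARRAY lines that declare a two-device RAID1 array.
# $1 = config file to inspect (defaults to /etc/mdadm.conf)
count_mirrors() {
    grep -c '^ARRAY .*level=raid1 .*num-devices=2' "${1:-/etc/mdadm.conf}"
}

# Usage sketch for this worked example (six arrays, md0 to md5):
#   [ "$(count_mirrors)" -eq 6 ] || echo "mdadm.conf does not list all six mirrors"
```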

24. Update grub so that in the event that disk 2 should fail, it will boot off disk 1

 vim /boot/grub/menu.lst

so that it looks like this:

 title CentOS (2.6.18-194.26.1.el5 Disk 1)
         root (hd1,0)
         kernel /vmlinuz-2.6.18-194.26.1.el5 ro root=/dev/md4
         initrd /initrd-2.6.18-194.26.1.el5.img
 title CentOS (2.6.18-194.26.1.el5 Disk 0)
         root (hd0,0)
         kernel /vmlinuz-2.6.18-194.26.1.el5 ro root=/dev/md4
         initrd /initrd-2.6.18-194.26.1.el5.img

Every time your kernel is updated, you'll need to do this again!
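
Since the two stanzas differ only in the disk, they can be generated for whatever kernel is installed. A minimal sketch, assuming the layout of this worked example (/boot as the first partition on each disk, root on /dev/md4). The function name menu_stanzas is hypothetical, and it only prints text. Pasting it into menu.lst is still up to you.

```shell
#!/bin/sh
# Hypothetical helper: print the two stanzas from step 24 for a kernel version.
# $1 = kernel version, $2 = root device (defaults to /dev/md4 as in this example)
menu_stanzas() {
    kver=$1
    rootdev=${2:-/dev/md4}
    n=1
    for disk in hd1 hd0; do
        printf 'title CentOS (%s Disk %s)\n' "$kver" "$n"
        printf '        root (%s,0)\n' "$disk"
        printf '        kernel /vmlinuz-%s ro root=%s\n' "$kver" "$rootdev"
        printf '        initrd /initrd-%s.img\n' "$kver"
        n=0
    done
}

# Print stanzas for the running kernel.
menu_stanzas "$(uname -r)"
```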

25. Now finally update the ramdisk one more time

 mv /boot/initrd-`uname -r`.img /boot/initrd-`uname -r`.img_orig2
 mkinitrd /boot/initrd-`uname -r`.img `uname -r`

26. Reboot and you should be good to go!