Mirroring Disks

Solaris

In Solaris 10, you can mirror the disks during the build process with Jumpstart, or even use ZFS mirroring. However, on older systems, you need to mirror the disks the old-fashioned way with DiskSuite 4.2.1. In this example, the device names for the two disks being mirrored are c1t0d0sX and c1t1d0sX. The disks are partitioned like this:

  /         c1t0d0s0
  swap      c1t0d0s1
  /var      c1t0d0s3
  /usr      c1t0d0s4
  /opt      c1t0d0s5
  /export   c1t0d0s6
  -         c1t0d0s7 

Slice 7 is a 50 MB partition used to store the metadatabases. IMO, it is best practice to never use slice 7 for anything other than this purpose!

1. You need to make sure the following packages are installed first:

 SUNWmdr SUNWmdu SUNWmdx

This is the minimum you need to install! Oh, and they will need bzip2 to install successfully! They can be found on "Solaris 8 Software CD 2" (not sure for Solaris 9, but they could be part of the standard distro), in the following location:

 cd $PATH_TO_CDROM/Solaris_8/EA/products/DiskSuite_4.2.1/sparc/Packages
 pkgadd -d . SUNWmdr SUNWmdu SUNWmdx

Once they are installed, reboot the box. It wouldn't hurt to do an 'init 0' and then a 'boot -r'!
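For example, from the console:

  # halt to the OpenBoot 'ok' prompt, then do a reconfiguration boot
  init 0
  ok boot -r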

2. Copy the primary disk's VTOC to the secondary disk:

  prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2

3. Create the metadatabases

  metadb -f -a -c3 /dev/dsk/c1t0d0s7
  metadb -f -a -c3 /dev/dsk/c1t1d0s7
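
You can then list the replicas to confirm they were created:

  metadb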

4. Create and initialise the metadevices on the primary disk

  metainit -f d10 1 1 c1t0d0s0
  metainit -f d11 1 1 c1t0d0s1
  metainit -f d13 1 1 c1t0d0s3
  metainit -f d14 1 1 c1t0d0s4
  metainit -f d15 1 1 c1t0d0s5
  metainit -f d16 1 1 c1t0d0s6
  metainit d0 -m d10
  metainit d1 -m d11
  metainit d3 -m d13
  metainit d4 -m d14
  metainit d5 -m d15
  metainit d6 -m d16

5. Assign the root metadevice

  metaroot d0

6. Edit /etc/vfstab and change all the remaining devices to metadevices (metaroot has already taken care of the root entry). So each line that looked like this

  /dev/dsk/c1t0d0s3 /dev/rdsk/c1t0d0s3  /var  ufs   1 no    -

now looks like

  /dev/md/dsk/d3 /dev/md/rdsk/d3  /var  ufs   1 no    -

7. Run the lockfs command and reboot

  lockfs -fa
  reboot

8. Create the metadevices on the secondary disk

  metainit d20 1 1 c1t1d0s0
  metainit d21 1 1 c1t1d0s1
  metainit d23 1 1 c1t1d0s3
  metainit d24 1 1 c1t1d0s4
  metainit d25 1 1 c1t1d0s5
  metainit d26 1 1 c1t1d0s6 

9. Attach the metadevices from the secondary disk to the metadevices we booted from to start the mirroring process

  metattach d0 d20
  metattach d1 d21
  metattach d3 d23
  metattach d4 d24
  metattach d5 d25
  metattach d6 d26

You can monitor the progress of mirroring with this command:

  metastat | grep -i progress

When it returns nothing, it's all finished.
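
While the resync is still running, the mirror's entry in the full metastat output looks something like this (illustrative only; your submirror names, sizes and percentages will differ):

  metastat d0
  d0: Mirror
      Submirror 0: d10
        State: Okay
      Submirror 1: d20
        State: Resyncing
      Resync in progress: 15 % done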

10. Install the boot block on the secondary hard disk so that you can boot off it

  installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c1t1d0s0

NOTE: If you get this error with the installboot command

 dd: unrecognized operand `oseek=1'
 Try `dd --help' for more information.

then it is using the wrong version of 'dd'. Adjust your PATH so that /usr/bin/dd is found first.
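For example, in a Bourne or Korn shell:

 PATH=/usr/bin:$PATH
 export PATH
 which dd    # should now report /usr/bin/dd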

11. Disable the following RPC services in /etc/inetd.conf by commenting them out (as shown below)

  100229/1 (rpc.metad)
  100230/1 (rpc.metamhd)
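
One way to do it (a sketch; the exact field layout of these entries in your inetd.conf may differ):

  grep -n 'rpc.meta' /etc/inetd.conf    # locate the rpc.metad and rpc.metamhd lines
  vi /etc/inetd.conf                    # put a '#' at the start of each of those lines
  pkill -HUP inetd                      # make inetd re-read its configuration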

12. Add this line to /etc/system so that the machine will boot if only 50% of the metadatabases are available, i.e. a disk has failed

  set md:mirrored_root_flag=1


You're now finished :)

CentOS/RedHat

OK, so software RAID on CentOS/RedHat sucks! Unless you build the box from the outset with mirrored disks, you're going to need to do a lot of work to get to a point where you're happy. Wherever possible, mirror the disks during the build! However, if the box is already built and you need to mirror the boot disk after the fact, here's how you do it. I have tested this on CentOS 5.5.

This is a worked example that uses the following disk layout:

 df -h 
 
 Filesystem            Size  Used Avail Use% Mounted on
 /dev/sda6             3.9G  283M  3.4G   8% /
 /dev/sda7              40G  251M   37G   1% /var
 /dev/sda3             7.8G  764M  6.7G  11% /usr
 /dev/sda2             7.8G  147M  7.3G   2% /home
 /dev/sda1              99M   12M   82M  13% /boot
 tmpfs                1006M     0 1006M   0% /dev/shm
 fdisk -l
 
 Disk /dev/sda: 73.4 GB, 73407865856 bytes
 255 heads, 63 sectors/track, 8924 cylinders
 Units = cylinders of 16065 * 512 = 8225280 bytes
 
    Device Boot      Start         End      Blocks   Id  System
 /dev/sda1   *           1          13      104391   83  Linux
 /dev/sda2              14        1057     8385930   83  Linux
 /dev/sda3            1058        2101     8385930   83  Linux
 /dev/sda4            2102        8924    54805747+   5  Extended
 /dev/sda5            2102        3145     8385898+  82  Linux swap / Solaris
 /dev/sda6            3146        3667     4192933+  83  Linux
 /dev/sda7            3668        8924    42226821   83  Linux
 parted -l
 
 Model: FUJITSU MAV2073RCSUN72G (scsi)
 Disk /dev/sda: 73.4GB
 Sector size (logical/physical): 512B/512B
 Partition Table: msdos
 
 Number  Start   End     Size    Type      File system  Flags
  1      32.3kB  107MB   107MB   primary   ext3         boot 
  2      107MB   8694MB  8587MB  primary   ext3              
  3      8694MB  17.3GB  8587MB  primary   ext3              
  4      17.3GB  73.4GB  56.1GB  extended                    
  5      17.3GB  25.9GB  8587MB  logical   linux-swap        
  6      25.9GB  30.2GB  4294MB  logical   ext3              
  7      30.2GB  73.4GB  43.2GB  logical   ext3 

1. Make sure that 'mdadm' is installed:

 yum install mdadm

2. Load the RAID kernel modules

 modprobe linear
 modprobe multipath
 modprobe raid0
 modprobe raid1
 modprobe raid5
 modprobe raid6
 modprobe raid10

3. If the new disk (disk 2) has previously been used in a server with RAID, you will need to stop and remove any metadevices it was part of

 mdadm --stop /dev/md0
 mdadm --stop /dev/md1
 mdadm --stop /dev/md2
 mdadm --stop /dev/md3
 mdadm --stop /dev/md4
 mdadm --stop /dev/md5
 mdadm --remove /dev/md0
 mdadm --remove /dev/md1
 mdadm --remove /dev/md2
 mdadm --remove /dev/md3
 mdadm --remove /dev/md4
 mdadm --remove /dev/md5

4. Copy the partition table from one disk to the other

 sfdisk -d /dev/sda | sfdisk /dev/sdb

5. Change the type of the partitions on the second disk to 'fd' (Linux raid autodetect):

 fdisk /dev/sdb
 
 The number of cylinders for this disk is set to 8924.
 There is nothing wrong with that, but this is larger than 1024,
 and could in certain setups cause problems with:
 1) software that runs at boot time (e.g., old versions of LILO)
 2) booting and partitioning software from other OSs
    (e.g., DOS FDISK, OS/2 FDISK)
 
 Command (m for help): t
 Partition number (1-7): 1
 Hex code (type L to list codes): fd
 
 Command (m for help): t
 Partition number (1-7): 2
 Hex code (type L to list codes): fd

Repeat for the rest of the partitions (skip the extended partition, number 4), then write the changes and exit:

 Command (m for help): w
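
If you'd rather not step through fdisk interactively, the older sfdisk can change the type of each partition in one shot (a sketch; check 'man sfdisk' on your system before relying on it):

 for p in 1 2 3 5 6 7; do
     sfdisk --id /dev/sdb $p fd   # set type to 'fd' (Linux raid autodetect); 4 is the extended partition, skip it
 done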

The output from the parted command after this should look like:

 parted -l
 
 Model: FUJITSU MAV2073RCSUN72G (scsi)
 Disk /dev/sdb: 73.4GB
 Sector size (logical/physical): 512B/512B
 Partition Table: msdos
 
 Number  Start   End     Size    Type      File system  Flags     
  1      32.3kB  107MB   107MB   primary   ext3         boot, raid
  2      107MB   8694MB  8587MB  primary   ext3         raid      
  3      8694MB  17.3GB  8587MB  primary   ext3         raid      
  4      17.3GB  73.4GB  56.1GB  extended                         
  5      17.3GB  25.9GB  8587MB  logical   linux-swap   raid      
  6      25.9GB  30.2GB  4294MB  logical   ext3         raid      
  7      30.2GB  73.4GB  43.2GB  logical   ext3         raid 

6. To make sure that there are no remains from previous RAID installations on /dev/sdb, we run the following commands:

 mdadm --zero-superblock /dev/sdb1
 mdadm --zero-superblock /dev/sdb2
 mdadm --zero-superblock /dev/sdb3
 mdadm --zero-superblock /dev/sdb5
 mdadm --zero-superblock /dev/sdb6
 mdadm --zero-superblock /dev/sdb7

If there was nothing to zero, you'll get this response:

 mdadm: Unrecognised md component device - /dev/sdb1

All good!

7. Build the RAID devices using the 'mdadm' command:

 mdadm -C /dev/md0 -n 2 -l 1 missing /dev/sdb1
 mdadm -C /dev/md1 -n 2 -l 1 missing /dev/sdb2
 mdadm -C /dev/md2 -n 2 -l 1 missing /dev/sdb3
 mdadm -C /dev/md3 -n 2 -l 1 missing /dev/sdb5
 mdadm -C /dev/md4 -n 2 -l 1 missing /dev/sdb6
 mdadm -C /dev/md5 -n 2 -l 1 missing /dev/sdb7

Check your work:

 cat /proc/mdstat

which should give you something like

 cat /proc/mdstat
 Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
 md5 : active raid1 sdb7[1]
       42226752 blocks [2/1] [_U]
   
 md4 : active raid1 sdb6[1]
       4192832 blocks [2/1] [_U]
     
 md3 : active raid1 sdb5[1]
       8385792 blocks [2/1] [_U]
     
 md2 : active raid1 sdb3[1]
       8385856 blocks [2/1] [_U]

 md1 : active raid1 sdb2[1]
       8385856 blocks [2/1] [_U]
     
 md0 : active raid1 sdb1[1]
       104320 blocks [2/1] [_U]
     
 unused devices: <none>


8. Create a file system on each metadevice (and initialise swap on md3)

 mkfs.ext3 -j /dev/md0
 mkfs.ext3 -j /dev/md1
 mkfs.ext3 -j /dev/md2
 mkfs.ext3 -j /dev/md4
 mkfs.ext3 -j /dev/md5
 mkswap /dev/md3

9. Update /etc/mdadm.conf

 mdadm --detail --scan > /etc/mdadm.conf 

10. Make mount points and mount all of the new metadevices:

 mkdir /mnt/md
 mount /dev/md4 /mnt/md
 cd /mnt
 mkdir md/boot
 mkdir md/var
 mkdir md/usr
 mkdir md/home
 chmod -R 755 md/
 mount /dev/md0 /mnt/md/boot
 mount /dev/md1 /mnt/md/home
 mount /dev/md2 /mnt/md/usr
 mount /dev/md5 /mnt/md/var

11. Edit /etc/fstab to use the appropriate metadevices.

 LABEL=/                 /                       ext3    defaults        1 1
 LABEL=/var              /var                    ext3    defaults        1 2
 LABEL=/usr              /usr                    ext3    defaults        1 2
 LABEL=/home             /home                   ext3    defaults        1 2
 LABEL=/boot             /boot                   ext3    defaults        1 2
 tmpfs                   /dev/shm                tmpfs   defaults        0 0
 devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
 sysfs                   /sys                    sysfs   defaults        0 0
 proc                    /proc                   proc    defaults        0 0
 LABEL=SWAP-sda5         swap                    swap    defaults        0 0
 

becomes

 /dev/md4                /                       ext3    defaults        1 1
 /dev/md5                /var                    ext3    defaults        1 2
 /dev/md2                /usr                    ext3    defaults        1 2
 /dev/md1                /home                   ext3    defaults        1 2
 /dev/md0                /boot                   ext3    defaults        1 2
 tmpfs                   /dev/shm                tmpfs   defaults        0 0
 devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
 sysfs                   /sys                    sysfs   defaults        0 0
 proc                    /proc                   proc    defaults        0 0
 /dev/md3                swap                    swap    defaults        0 0

12. Edit the mtab file to reflect the changes we've made

  vim /etc/mtab

so it looks like

 /dev/md4 / ext3 rw 0 0
 proc /proc proc rw 0 0
 sysfs /sys sysfs rw 0 0
 devpts /dev/pts devpts rw,gid=5,mode=620 0 0
 /dev/md5 /var ext3 rw 0 0
 /dev/md2 /usr ext3 rw 0 0
 /dev/md1 /home ext3 rw 0 0
 /dev/md0 /boot ext3 rw 0 0
 tmpfs /dev/shm tmpfs rw 0 0
 none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
 sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0

13. Edit /boot/grub/menu.lst to allow fallback to the next menu entry in the event of a boot failure:

  vim /boot/grub/menu.lst

and add

 default=0
 fallback=1

14. Add an entry to menu.lst to allow booting off the metadevice root

 title CentOS (2.6.18-194.26.1.el5 metadevices)
         root (hd1,0)
         kernel /vmlinuz-2.6.18-194.26.1.el5 ro root=/dev/md4
         initrd /initrd-2.6.18-194.26.1.el5.img
 title CentOS (2.6.18-194.26.1.el5)
         root (hd0,0)
         kernel /vmlinuz-2.6.18-194.26.1.el5 ro root=LABEL=/
         initrd /initrd-2.6.18-194.26.1.el5.img

15. Rebuild the ramdisk to reflect our new reality:

 mv /boot/initrd-`uname -r`.img /boot/initrd-`uname -r`.img_orig
 mkinitrd /boot/initrd-`uname -r`.img `uname -r`
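
If you want to double-check that the raid1 module made it into the new initrd (on CentOS 5 the initrd is a gzipped cpio archive), you can list its contents:

 zcat /boot/initrd-`uname -r`.img | cpio -t | grep raid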

16. Copy the files from disk 1 to the newly built raid

 rsync -aXv --exclude=/proc/* --exclude=/sys/* --exclude=/dev/* --exclude=/mnt/md / /mnt/md

17. Setup GRUB on both disks

 grub
 Probing devices to guess BIOS drives. This may take a long time.
 
 
     GNU GRUB  version 0.97  (640K lower / 3072K upper memory)
 
  [ Minimal BASH-like line editing is supported.  For the first word, TAB
    lists possible command completions.  Anywhere else TAB lists the possible
    completions of a device/filename.]
 grub> root (hd0,0)
 root (hd0,0)
  Filesystem type is ext2fs, partition type 0x83
 grub> setup (hd0)
 setup (hd0)
  Checking if "/boot/grub/stage1" exists... no
  Checking if "/grub/stage1" exists... yes
  Checking if "/grub/stage2" exists... yes
  Checking if "/grub/e2fs_stage1_5" exists... yes
  Running "embed /grub/e2fs_stage1_5 (hd0)"...  15 sectors are embedded.
 succeeded
  Running "install /grub/stage1 (hd0) (hd0)1+15 p (hd0,0)/grub/stage2 /grub/grub.conf"... succeeded
 Done.
 grub> root (hd1,0)
 root (hd1,0)
  Filesystem type is ext2fs, partition type 0xfd
 grub> setup (hd1)
 setup (hd1)
  Checking if "/boot/grub/stage1" exists... no
  Checking if "/grub/stage1" exists... yes
  Checking if "/grub/stage2" exists... yes
  Checking if "/grub/e2fs_stage1_5" exists... yes
  Running "embed /grub/e2fs_stage1_5 (hd1)"...  15 sectors are embedded.
 succeeded
  Running "install /grub/stage1 (hd1) (hd1)1+15 p (hd1,0)/grub/stage2 /grub/grub.conf"... succeeded
 Done.
 grub> quit

18. Reboot and hope it works!

19. After the reboot, check that the system has booted from the metadevices

 df -h

should show:

 Filesystem            Size  Used Avail Use% Mounted on
 /dev/md4              4.0G  286M  3.5G   8% /
 /dev/md5               40G  209M   38G   1% /var
 /dev/md2              7.9G  954M  6.6G  13% /usr
 /dev/md1              7.9G  147M  7.4G   2% /home
 /dev/md0               99M   15M   80M  16% /boot
 tmpfs                1006M     0 1006M   0% /dev/shm

20. Run fdisk on the original disk and change its partition types to 'fd' (Linux raid autodetect), just as you did for the second disk

 fdisk /dev/sda
 The number of cylinders for this disk is set to 8924.
 There is nothing wrong with that, but this is larger than 1024,
 and could in certain setups cause problems with:
 1) software that runs at boot time (e.g., old versions of LILO)
 2) booting and partitioning software from other OSs
    (e.g., DOS FDISK, OS/2 FDISK)
 
 Command (m for help): t
 Partition number (1-7): 1
 Hex code (type L to list codes): fd
 Changed system type of partition 1 to fd (Linux raid autodetect)
 
 Command (m for help): t
 Partition number (1-7): 2
 Hex code (type L to list codes): fd
 Changed system type of partition 2 to fd (Linux raid autodetect)
 .
 .
 .
 Command (m for help): w
 The partition table has been altered!
 
 Calling ioctl() to re-read partition table.
 Syncing disks.

21. Add the disk 1 partitions to the meta devices

 mdadm --add /dev/md0 /dev/sda1
 mdadm --add /dev/md1 /dev/sda2
 mdadm --add /dev/md2 /dev/sda3
 mdadm --add /dev/md3 /dev/sda5
 mdadm --add /dev/md4 /dev/sda6
 mdadm --add /dev/md5 /dev/sda7

22. Monitor the re-sync with

 cat /proc/mdstat

until all meta devices have, for want of a better word, synced.
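While a rebuild is underway, each entry that is still syncing shows a progress bar, something like this (illustrative numbers only):

 md1 : active raid1 sda2[2] sdb2[1]
       8385856 blocks [2/1] [_U]
       [==>..................]  recovery = 12.4% (1040256/8385856) finish=3.1min speed=39240K/sec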

23. Update /etc/mdadm.conf to reflect the new situation

 mdadm --examine --scan > /etc/mdadm.conf

which should result in mdadm.conf looking something like:

 ARRAY /dev/md0 level=raid1 num-devices=2 UUID=98b0f3a8:e65cf4ba:78362d52:4149fec6
 ARRAY /dev/md1 level=raid1 num-devices=2 UUID=ad326160:3c07ea40:2743f05c:01d0ba7d
 ARRAY /dev/md2 level=raid1 num-devices=2 UUID=90db8ba9:a813a3b1:5773fe4e:92575c88
 ARRAY /dev/md3 level=raid1 num-devices=2 UUID=1b3b7297:80b9ffc4:9bf9523a:b9249fe4
 ARRAY /dev/md4 level=raid1 num-devices=2 UUID=7a1f48f9:847a1414:07e67ace:0f90e882
 ARRAY /dev/md5 level=raid1 num-devices=2 UUID=76f18601:8d99c321:b2af7455:316847af

24. Update grub so that in the event that disk 2 should fail, it will boot off disk 1

 vim /boot/grub/menu.lst

so that it looks like this:

 title CentOS (2.6.18-194.26.1.el5 Disk 1)
         root (hd1,0)
         kernel /vmlinuz-2.6.18-194.26.1.el5 ro root=/dev/md4
         initrd /initrd-2.6.18-194.26.1.el5.img
 title CentOS (2.6.18-194.26.1.el5 Disk 0)
         root (hd0,0)
         kernel /vmlinuz-2.6.18-194.26.1.el5 ro root=/dev/md4
         initrd /initrd-2.6.18-194.26.1.el5.img

Every time your kernel is updated, you'll need to do this again!

25. Now finally update the ramdisk one more time

 mv /boot/initrd-`uname -r`.img /boot/initrd-`uname -r`.img_orig2
 mkinitrd /boot/initrd-`uname -r`.img `uname -r`

26. Reboot and you should be good to go!