Replacing a failed disk in a software mirror

From Peter Pap's Technowiki
Revision as of 23:36, 14 November 2010 by Ppapa

So here's the scenario: you have a server with two disks, c0t0d0 and c0t1d0, mirrored with Solaris DiskSuite. The disk c0t1d0 has failed and you want to replace it without shutting down the box. We have the following metadevices and sub-mirrors (the second sub-mirror of each pair is the one on the failed disk):

 d0:  d1 and d2
 d10: d11 and d12
 d20: d21 and d22
 d30: d31 and d32
 d40: d41 and d42
 d50: d51 and d52

This procedure has been tested and works on a Sun Fire V120. You may have to alter it slightly depending on the hardware you are running.

1. Delete the metadevice state database replicas stored on the failed disk, in this case on slice 7

 metadb -d c0t1d0s7
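Before deleting replicas it can be worth confirming that copies still exist on the surviving disk. The following is a minimal sketch, not part of the original procedure: count_replicas is a hypothetical helper, and it assumes that each replica line printed by metadb contains the slice path (for example /dev/dsk/c0t0d0s7).

```shell
# Count state database replicas on one disk, reading `metadb` output on stdin.
# count_replicas is an illustrative name, not a DiskSuite command.
# Assumption: every replica line contains the slice path, e.g. /dev/dsk/c0t0d0s7.
count_replicas() {
    grep -c "/dev/dsk/${1}s[0-7]"
}

# Usage on the server (surviving disk is c0t0d0 in this scenario):
#   metadb | count_replicas c0t0d0
```

A result of zero for the surviving disk would be a reason to stop and investigate before running metadb -d.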

2. Detach the sub-mirrors on the failed disk from their metadevices

 metadetach -f d0 d2
 metadetach -f d10 d12
 metadetach -f d20 d22
 metadetach -f d30 d32
 metadetach -f d40 d42
 metadetach -f d50 d52

The -f option is needed to force the detach, because the disk has failed.
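With many mirrors, the detach commands can be generated in a loop. This is a sketch only, assuming a POSIX-style shell such as ksh or bash; PAIRS, DRY_RUN and detach_all are names made up for this example. It defaults to a dry run that only prints the commands, so they can be reviewed before anything is executed.

```shell
# Mirror:sub-mirror pairs from the scenario above (sub-mirror on the failed disk).
# PAIRS, DRY_RUN and detach_all are illustrative names, not part of DiskSuite.
PAIRS="d0:d2 d10:d12 d20:d22 d30:d32 d40:d42 d50:d52"
DRY_RUN=${DRY_RUN:-1}    # 1 = only print the commands, 0 = really run them

detach_all() {
    for pair in $PAIRS; do
        mirror=${pair%%:*}       # text before the colon
        submirror=${pair##*:}    # text after the colon
        if [ "$DRY_RUN" = "1" ]; then
            echo "metadetach -f $mirror $submirror"
        else
            # Stop at the first failure so problems are not hidden
            metadetach -f "$mirror" "$submirror" || return 1
        fi
    done
}

detach_all
```

Run as is, it prints the six metadetach lines from step 2. Setting DRY_RUN=0 would execute them instead.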

3. Clear the metadevices that were associated with the failed disk

 metaclear d2
 metaclear d12
 metaclear d22
 metaclear d32
 metaclear d42
 metaclear d52

4. Find the correct Ap_Id for the failed disk with the cfgadm command

 cfgadm -al
 Ap_Id                          Type         Receptacle   Occupant     Condition
 c0                             scsi-bus     connected    configured   unknown
 c0::dsk/c0t0d0                 disk         connected    configured   unknown
 c0::dsk/c0t1d0                 disk         connected    configured   unknown
 c1                             scsi-bus     connected    unconfigured unknown
 usb0/1                         unknown      empty        unconfigured ok
 usb0/2                         unknown      empty        unconfigured ok
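On a box with many devices, the Ap_Id can be picked out of the listing rather than read by eye. A minimal sketch follows, using the listing above as sample input; apid_for_disk is a hypothetical helper name. It assumes an awk that accepts -v; on older Solaris releases that may mean nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk.

```shell
# Print the Ap_Id of a disk, reading `cfgadm -al` output on stdin.
# apid_for_disk is an illustrative name, not a Solaris command.
apid_for_disk() {
    # Match rows of type "disk" whose Ap_Id ends in dsk/<disk>
    awk -v disk="$1" '$2 == "disk" && $1 ~ ("dsk/" disk "$") { print $1 }'
}

# On the server: cfgadm -al | apid_for_disk c0t1d0
# Here the listing from step 4 is fed in as sample input.
apid_for_disk c0t1d0 <<'EOF'
Ap_Id                          Type         Receptacle   Occupant     Condition
c0                             scsi-bus     connected    configured   unknown
c0::dsk/c0t0d0                 disk         connected    configured   unknown
c0::dsk/c0t1d0                 disk         connected    configured   unknown
c1                             scsi-bus     connected    unconfigured unknown
EOF
```

With the sample listing this prints c0::dsk/c0t1d0, the value used in the next step.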

5. Unconfigure the device so that you can remove it

 cfgadm -c unconfigure c0::dsk/c0t1d0
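Before physically pulling the disk, it is reasonable to check that the Occupant column for its Ap_Id now reads unconfigured. The sketch below is an addition to the original procedure: is_unconfigured is a hypothetical helper, and it assumes the Ap_Id still appears in cfgadm -al after the unconfigure, with Occupant as the fourth column as in the listing from step 4. As with any awk -v usage on older Solaris, nawk may be needed.

```shell
# Succeed only if the given Ap_Id shows Occupant = unconfigured.
# Reads `cfgadm -al` output on stdin; is_unconfigured is an illustrative name.
is_unconfigured() {
    state=$(awk -v id="$1" '$1 == id { print $4 }')
    # An Ap_Id that is missing from the listing gives an empty state and fails
    [ "$state" = "unconfigured" ]
}

# On the server: cfgadm -al | is_unconfigured c0::dsk/c0t1d0 && echo "safe to pull"
# Demo with the c0t1d0 row from step 4, taken before the unconfigure:
if printf '%s\n' "c0::dsk/c0t1d0 disk connected configured unknown" |
        is_unconfigured c0::dsk/c0t1d0; then
    echo "safe to pull"
else
    echo "still configured"
fi
```

The demo row comes from the listing taken before step 5, so it reports "still configured".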