To replace a failed SSD in a running cluster
- Use the flashgrid-cluster drives command to determine the following information about the failed SSD:
  - FlashGrid name of the SSD, e.g. rac2.failedserialnumber
  - ASM name of the SSD, e.g. RAC2$FAILEDSERIALNUMBER
  - slot number where the SSD is installed
  - whether the ASM disk is online, offline, or dropped (ASMStatus=N/A)
- Drop the failed SSD from the ASM disk group if it has not been dropped yet. Examples:
a. If the failing ASM disk is still online:
SQL> alter diskgroup MYDG drop disk RAC2$FAILEDSERIALNUMBER rebalance wait;
b. If the failed ASM disk is offline, but has not been dropped by ASM:
SQL> alter diskgroup MYDG drop disk RAC2$FAILEDSERIALNUMBER force;
- Physically remove the failed SSD.
- Plug a new SSD into the same drive slot.
- Use the flashgrid-node command to determine the FlashGrid name of the new SSD, e.g. rac2.newserialnumber
- Add the new SSD to the ASM disk group that the failed SSD was in. Example:
$ flashgrid-dg add-disks -G MYDG -d /dev/flashgrid/rac2.newserialnumber
If you have to re-add the same SSD that was used before, or add a different SSD that already has ASM metadata on it, add it using the force option -f. Example:
$ flashgrid-dg add-disks -G MYDG -d /dev/flashgrid/rac2.newserialnumber -f
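
The choice between the two drop statements and the optional -f flag can be captured in a small dry-run helper. The following is a minimal sketch, not a FlashGrid tool: the function names drop_stmt and add_cmd are hypothetical, the status words online, offline, and dropped are assumed to be supplied by the operator after reading the output of step 1, and the script only prints the statement or command for review rather than executing anything.

```shell
#!/bin/sh
# Dry-run sketch: prints what to run, executes nothing.
# drop_stmt and add_cmd are hypothetical helper names.

# Print the ASM statement for dropping a disk.
# $1 = disk group, $2 = ASM disk name, $3 = online | offline | dropped
drop_stmt() {
    dg=$1
    disk=$2
    status=$3
    case "$status" in
        online)  printf 'alter diskgroup %s drop disk %s rebalance wait;\n' "$dg" "$disk" ;;
        offline) printf 'alter diskgroup %s drop disk %s force;\n' "$dg" "$disk" ;;
        dropped) printf '%s\n' "-- $disk is already dropped; nothing to do" ;;
        *)       echo "unknown status: $status" >&2; return 1 ;;
    esac
}

# Print the flashgrid-dg command for adding the replacement SSD.
# $1 = disk group, $2 = FlashGrid name, $3 = yes to append -f (default: no)
add_cmd() {
    dg=$1
    fg_name=$2
    force=${3:-no}
    cmd="flashgrid-dg add-disks -G $dg -d /dev/flashgrid/$fg_name"
    if [ "$force" = yes ]; then
        cmd="$cmd -f"
    fi
    printf '%s\n' "$cmd"
}

# Example calls using the placeholder names from the steps above.
drop_stmt MYDG 'RAC2$FAILEDSERIALNUMBER' offline
add_cmd MYDG rac2.newserialnumber
add_cmd MYDG rac2.newserialnumber yes
```

Note the single quotes around the ASM disk name: the name contains a $ character, which an unquoted or double-quoted shell argument would treat as the start of a variable reference.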