All customers are advised to apply the fix.
Summary: flashgrid-node offline/reboot/stop/poweroff commands do not gracefully offline disks when re-sync is in progress on those disks.
Symptoms:
The flashgrid-node offline/reboot/stop/poweroff commands report OK status for offlining disks, but the disks are not actually offlined. A subsequent service shutdown or node reboot then results in ASM taking the disks offline non-gracefully.
The issue occurs only if the disks are in the re-sync state when the command is issued. For example, this can happen if a node is rebooted twice within a short period, before the re-sync operation that follows the first reboot has completed.
Affected products:
FlashGrid Storage Fabric
FlashGrid Cluster on Azure/AWS/GCP
Affected versions:
flashgrid-sf RPM versions below 20.8.165
Root Cause:
Only disks with MODE_STATUS = ONLINE were offlined, while disks with MODE_STATUS = SYNCING were skipped.
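The selection logic described above can be illustrated with a minimal sketch. This is not FlashGrid's actual code: the function names, the disk names, and the (name, MODE_STATUS) tuple layout are hypothetical, and the "fixed" variant is only an assumption about how the corrected filter behaves.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: (disk name, MODE_STATUS) pairs.
Disk = Tuple[str, str]


def select_disks_before_fix(disks: Iterable[Disk]) -> List[str]:
    """Root cause as described: only ONLINE disks are picked for offlining."""
    return [name for name, status in disks if status == "ONLINE"]


def select_disks_after_fix(disks: Iterable[Disk]) -> List[str]:
    """Assumed corrected filter: SYNCING disks are picked as well."""
    return [name for name, status in disks if status in ("ONLINE", "SYNCING")]


# Made-up example: one disk is still re-syncing after an earlier reboot.
disks = [("NODE1_DISK1", "ONLINE"), ("NODE1_DISK2", "SYNCING")]
print(select_disks_before_fix(disks))  # ['NODE1_DISK1']
print(select_disks_after_fix(disks))   # ['NODE1_DISK1', 'NODE1_DISK2']
```

In the first case NODE1_DISK2 is silently left out, which matches the symptom of an OK status being reported while a re-syncing disk stays online until ASM drops it.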
Workaround:
Avoid stopping the flashgrid service or rebooting/stopping a node while the flashgrid-cluster command shows that any of the disk groups has a Resync operation in progress.
Resolution:
Upgrade the flashgrid-sf RPM to version 20.8.165 or newer on all nodes.
References:
To determine the currently installed version of the flashgrid-sf RPM, run: rpm -qa | grep flashgrid-sf
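For checking many nodes, the comparison against 20.8.165 can be scripted. The following is a rough sketch rather than a supported tool: the helper names are hypothetical, and it assumes that the number shown in the package name is carried in the RPM's VERSION tag, which should be confirmed against the rpm -qa output on a real node.

```python
import re
import subprocess

# First version that contains the fix, per this article.
FIXED_VERSION = (20, 8, 165)


def parse_version(text):
    """Turn the leading dotted number of a string, e.g. '20.8.165', into a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_fixed(version_text):
    """True if the given version is 20.8.165 or newer."""
    return parse_version(version_text) >= FIXED_VERSION


def installed_version():
    """Ask rpm for the VERSION tag of flashgrid-sf (assumed to hold the number above)."""
    result = subprocess.run(
        ["rpm", "-q", "--queryformat", "%{VERSION}", "flashgrid-sf"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On a node, is_fixed(installed_version()) would be expected to return False before the upgrade and True afterwards; the numeric tuple comparison avoids the pitfall of comparing version strings alphabetically.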
The RPM update procedure must follow the recommended steps described in the knowledge base articles: