All customers are advised to apply the update.
A FlashGrid software update must be applied to prevent incorrect handling of certain failure scenarios. The update modifies the following parameters to improve failure handling:
- The nvme io_timeout parameter is changed from 30 seconds to about 4 billion (effectively unlimited) on RHEL/OL 7. This prevents system freezes during excessive (>30 sec) I/O delays on AWS EC2 Nitro-generation systems.
- The initiator noop timeout is changed from 35 to 18 seconds on Azure-based clusters. This improves failure handling by Oracle Clusterware under certain conditions.
- The initiator replacement timeout is changed from 6 to 3 seconds. This improves failure handling by Oracle Clusterware under certain conditions.
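To see which NVMe I/O timeout a node is currently running with, the value can usually be read from sysfs. The following is a minimal sketch, not part of the FlashGrid tooling: the helper name nvme_io_timeout is hypothetical, and it assumes the parameter is exposed under either the nvme_core or the nvme kernel module, which depends on the kernel version.

```shell
# Hypothetical helper: print the NVMe io_timeout value found under the given
# sysfs module root (default: /sys/module).
# Assumption: the parameter is exposed by either the nvme_core or the nvme
# module, so both locations are tried in that order.
nvme_io_timeout() {
    root=${1:-/sys/module}
    for mod in nvme_core nvme; do
        f="$root/$mod/parameters/io_timeout"
        if [ -r "$f" ]; then
            cat "$f"
            return 0
        fi
    done
    echo "io_timeout parameter not found" >&2
    return 1
}
```

Running nvme_io_timeout with no arguments on a node prints the value in effect. A result of 30 on an affected RHEL/OL 7 system on AWS suggests the update has not been applied yet.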
Certain failure scenarios may be handled incorrectly by the OS and clustering services, potentially resulting in database downtime.
FlashGrid Cluster on AWS
FlashGrid Cluster on Azure
FlashGrid Cluster on GCP
Affected versions and configurations:
FlashGrid Cluster software version 21.03 or earlier.
To determine the currently installed version of the software, run:
rpm -qa | grep flashgrid-skycluster
Note that versions will be listed as 21.3 instead of 21.03.
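Because 21.03 is listed as 21.3, comparing version strings by eye or as plain text is error-prone. The sketch below compares the major and minor components numerically. The function name is_affected is hypothetical, and it assumes the version string has the form MAJOR.MINOR with optional further dot-separated components.

```shell
# Hypothetical helper: succeed (exit status 0) if the given version is
# 21.03 (listed as 21.3) or earlier, i.e. affected by this advisory.
# Assumption: the version has the form MAJOR.MINOR[.PATCH...].
is_affected() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # second dot-separated component
    # Numeric comparison, so "3" and "03" are treated the same
    [ "$major" -lt 21 ] || { [ "$major" -eq 21 ] && [ "$minor" -le 3 ]; }
}
```

For example, `is_affected 21.3 && echo "update required"` prints the message, while `is_affected 21.6` returns a non-zero status. The version string can be taken from the rpm output above. If the package is named exactly flashgrid-skycluster (an assumption), `rpm -q --qf '%{VERSION}\n' flashgrid-skycluster` prints only the version.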
Storage timeout parameters did not account for certain failure scenarios.
Update the FlashGrid Cluster software using the flashgrid_node_update package, version 21.06 or newer. See update instructions here.