Several different types of problems may result in unexpected node reboots.
Possible cause | How to confirm |
---|---|
Physical host hardware failures | If on-premises, check the hardware System Event Log (SEL). |
VM or hypervisor failures | Check the hypervisor or the cloud VM logs. |
OS kernel crash | Check dmesg or /var/log/messages. In the cloud, also check the serial console log. |
Out of memory | Check /opt/flashgrid-diags/log/node_monitor-all.log on the host that was rebooted for Low Memory warnings logged just before the time of the reboot. |
In cluster deployments, the node is evicted and rebooted by Oracle Clusterware because of a network disruption on the node or between the nodes | Check the CRS alert log files for the reason for the eviction. Check /opt/flashgrid-diags/log/node_monitor-all.log on all nodes to confirm network ping loss. |
In cluster deployments, the node is evicted and rebooted by Oracle Clusterware because of excessive CPU load on the surviving node | Check /opt/flashgrid-diags/log/node_monitor-all.log on the surviving node for CPU load warnings. This type of eviction is more likely to happen if database nodes have very small VM sizes, such as only 2 physical cores. |
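
Several of the checks above come down to searching node_monitor-all.log for memory, CPU, or ping-loss warnings. The following is a minimal Python sketch of such a search, not a FlashGrid tool. The log path is taken from the table; the search patterns and the function names `find_warnings` and `scan_log` are hypothetical, because the exact wording of the log messages is not given here. Adjust the patterns to match what the log actually contains.

```python
import re
from pathlib import Path

# Log path as given in the table above.
NODE_MONITOR_LOG = Path("/opt/flashgrid-diags/log/node_monitor-all.log")

# ASSUMPTION: the exact message wording in the log is not documented here.
# These case-insensitive patterns are illustrative guesses only.
PATTERNS = {
    "memory": re.compile(r"low memory", re.IGNORECASE),
    "cpu": re.compile(r"cpu", re.IGNORECASE),
    "network": re.compile(r"ping", re.IGNORECASE),
}


def find_warnings(lines, patterns=PATTERNS):
    """Collect each line under every category whose pattern matches it."""
    hits = {name: [] for name in patterns}
    for line in lines:
        for name, pattern in patterns.items():
            if pattern.search(line):
                hits[name].append(line.rstrip("\n"))
    return hits


def scan_log(path=NODE_MONITOR_LOG, patterns=PATTERNS):
    """Scan a log file; return empty results if the file does not exist."""
    path = Path(path)
    if not path.is_file():
        return {name: [] for name in patterns}
    with open(path, errors="replace") as f:
        return find_warnings(f, patterns)


if __name__ == "__main__":
    for category, matched in scan_log().items():
        print(f"{category}: {len(matched)} matching line(s)")
        # Show only the most recent matches; compare them with the reboot time.
        for line in matched[-5:]:
            print("  " + line)
```

Run it on the rebooted node for memory warnings, on the surviving node for CPU load warnings, and on all nodes for ping loss, then compare the timestamps of the matching lines with the time of the reboot.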