FlashGrid recommends deploying new Oracle Linux 8 based clusters when migrating a FlashGrid Cluster from Oracle Linux 7 to Oracle Linux 8.
This article provides steps tested by FlashGrid for upgrading the OS in a FlashGrid Cluster from Oracle Linux 7 to Oracle Linux 8 on AWS, for customers who must use an in-place upgrade instead of deploying a new cluster.
WARNINGS:
- The steps listed in this article are provided for reference only. FlashGrid cannot guarantee that these steps will work on your systems.
- The upgrade steps were tested by FlashGrid for upgrading from OL version 7.9 to 8.8 starting with vanilla FlashGrid Cluster version 23.04 OS images for AWS.
- Third-party software installed on the systems may require additional steps.
- Customized OS settings may require additional steps.
- Customized security settings (e.g. security hardening applied) may require additional steps.
- FlashGrid strongly recommends testing the entire upgrade procedure on non-production systems that are configured identically to your production systems before upgrading the production systems.
- It is the customer's responsibility to have a recovery plan in case of unexpected issues during or after the upgrade.
- ACFS support depends on the Grid Infrastructure and kernel version, as documented in: ACFS and AFD Support On OS Platforms (Certification Matrix) (Doc ID 1369107.1). You may need to update GI, apply a patch, or switch to a different kernel to make ACFS work after the upgrade.
Preparing for the upgrade
Set system locale
-
Verify that the system locale is set to en_US.UTF-8.
cat /etc/locale.conf
If the locale is not set correctly, set it with the following command:
sudo localectl set-locale LANG=en_US.UTF-8
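The locale check above can be scripted when several nodes need to be verified. The following is a minimal sketch; `check_locale` is a hypothetical helper name, not part of FlashGrid or Leapp tooling, and it assumes the file uses the usual `LANG=value` format.

```shell
#!/usr/bin/env bash
# Sketch: report whether a locale.conf-style file sets LANG to en_US.UTF-8.
# check_locale is a hypothetical helper, not part of any FlashGrid or Leapp tool.
check_locale() {
    local conf_file="${1:-/etc/locale.conf}"
    local lang
    # Take the last LANG= assignment and strip optional surrounding quotes.
    lang=$(sed -n 's/^LANG=//p' "$conf_file" 2>/dev/null | tail -n 1 | tr -d "\"'")
    if [ "$lang" = "en_US.UTF-8" ]; then
        echo "OK: LANG=$lang"
        return 0
    fi
    echo "MISMATCH: LANG=${lang:-unset}; run: sudo localectl set-locale LANG=en_US.UTF-8"
    return 1
}
```

Calling `check_locale` with no argument reads /etc/locale.conf; a non-zero exit status indicates that the locale still needs to be set.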
Update to latest packages
- Update OS packages
- Update FlashGrid packages. Note: keep the node update script as it is needed to finalize the upgrade.
Install the Leapp Utility
Install the Leapp utility and its supporting repositories:
sudo yum install -y leapp --enablerepo=ol7_leapp,ol7_latest
Run the pre-upgrade check
-
Run the pre-upgrade check for Oracle Linux:
sudo leapp preupgrade --oraclelinux
-
Examine the leapp-report.txt file.
cat /var/log/leapp/leapp-report.txt
-
Examine the answerfile.
cat /var/log/leapp/answerfile
Use the sudo leapp answer command to provide the answer True to the [remove_pam_pkcs11_module_check] PAM module item:
sudo leapp answer --section remove_pam_pkcs11_module_check.confirm=True
-
Verify the answerfile has been modified.
sudo cat /var/log/leapp/answerfile
Example output:
[remove_pam_pkcs11_module_check]
confirm = True
Note: Make sure to resolve all items in the answerfile and all entries marked Risk Factor: high (inhibitor) in the leapp-report.txt file before proceeding with the upgrade.
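To avoid overlooking an inhibitor in a long report, the number of remaining entries can be counted with grep. This is a minimal sketch that assumes inhibitors appear in the report as the literal text `Risk Factor: high (inhibitor)`, as quoted in the note above; `count_inhibitors` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: count inhibitor entries remaining in a Leapp report.
# count_inhibitors is a hypothetical helper, not a Leapp command.
count_inhibitors() {
    local report="${1:-/var/log/leapp/leapp-report.txt}"
    if [ ! -r "$report" ]; then
        echo "cannot read $report" >&2
        return 2
    fi
    # -c prints the number of matching lines; -F treats the pattern as a literal string.
    # grep exits with status 1 when nothing matches, so "|| true" keeps a count of 0 from being an error.
    grep -cF 'Risk Factor: high (inhibitor)' "$report" || true
}
```

A result of 0 suggests no inhibitors remain in the report. The answerfile still needs to be reviewed separately.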
Upgrade the system
-
Create a backup snapshot of the OS disk:
a. Flush OS buffers:
sync
b. Create a snapshot of the OS disk using the cloud console or CLI.
-
Make sure there are no other nodes that are in offline or re-syncing state. All disk groups must have zero offline disks and Resync = No:
flashgrid-cluster
-
If the node is a database node,
a. Stop all local database instances running on the node.
b. Stop Oracle CRS on the node:
crsctl stop crs
-
Stop FlashGrid Storage Fabric services on the node:
flashgrid-node stop
-
Remove packages conflicting with the new version:
yum remove -y python36-PyYAML
rpm -e --nodeps openssl11-libs
-
Remove the immutable attribute on the GRUB configuration file that needs to be modified during the upgrade:
chattr -i /etc/default/grub
-
Run the upgrade process:
sudo leapp upgrade --oraclelinux
The upgrade process takes approximately 10 minutes and returns to the command prompt when finished.
-
When the upgrade completes, reboot the system.
sync; sync
reboot
The reboot will disconnect the SSH connection. During the boot process, the Leapp process automatically upgrades packages. The upgrade operation also includes multiple automatic reboots. You will not be able to reconnect the SSH session until all the reboots have completed. Wait approximately 15 minutes and then reconnect the SSH session to the system. If the connection fails, wait a few minutes and try again.
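The "wait, then retry" advice for reconnecting after the reboot can be automated with a small retry loop. The sketch below is generic: `retry_until` is a hypothetical helper, and the host name and user in the commented usage line are placeholders, not values from this article.

```shell
#!/usr/bin/env bash
# Sketch: re-run a command until it succeeds or the attempts run out.
# retry_until is a hypothetical helper: retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until() {
    local attempts="$1" delay="$2"
    shift 2
    local i
    for ((i = 1; i <= attempts; i++)); do
        if "$@"; then
            return 0
        fi
        # Sleep between attempts, but not after the final one.
        if ((i < attempts)); then
            sleep "$delay"
        fi
    done
    return 1
}

# Example usage (placeholder user and host): probe SSH every 2 minutes, up to 10 times.
# retry_until 10 120 ssh -o BatchMode=yes -o ConnectTimeout=10 user@node1.example.com true
```

A probe that runs `true` over SSH only confirms that the node accepts logins again. It does not confirm that the Leapp package upgrade finished cleanly.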
Post-upgrade steps
-
Set python3 as default:
alternatives --set python /usr/bin/python3
-
Restore the immutable attribute for the GRUB configuration file:
chattr +i /etc/default/grub
-
Remove FlashGrid EL7 packages:
yum remove -y earlyoom fgssh flashgrid-clan
-
Enable EPEL:
yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
-
Install additional packages:
yum -y install sdparm
-
Restore /etc/dracut.conf:
cp /etc/dracut.conf.rpmnew /etc/dracut.conf
The file content should be as follows:
# PUT YOUR CONFIG IN separate files
# in /etc/dracut.conf.d named "<name>.conf"
# SEE man dracut.conf(5) for options
- Rebuild initramfs:
dracut -f
-
Reinstall FlashGrid packages using the node update script (as root):
FG_COMMON_DEVICE_MODE=432 bash flashgrid_cluster_node_update-X.Y.Z.NNNNN.sh force-cluster
Note: Ignore the following error during the script execution:
./update.sh: line NNN: /opt/flashgrid/bin/flashgrid-cfg-parser: No such file or directory
-
Enable FlashGrid services:
systemctl enable flashgrid-clan
systemctl enable flashgrid-clan-wait
-
Reboot the node:
sync; sync
reboot
-
Confirm that all Clusterware/Database services are started successfully.
-
Reset RPM/config files/services:
flashgrid-health-check reset-cfg-list
flashgrid-health-check reset-rpm-list
flashgrid-health-check reset-services-list
-
Follow this article if you want to switch to a UEK kernel: How to switch from RH kernel to UEKR6 after deployment
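As a final sanity check after the post-upgrade steps above, it may be worth confirming that the node now reports Oracle Linux 8. The sketch below reads the standard `VERSION_ID` field from /etc/os-release; `os_major_version` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print the major OS version recorded in an os-release file.
# os_major_version is a hypothetical helper, not part of FlashGrid tooling.
os_major_version() {
    local release_file="${1:-/etc/os-release}"
    # Source the file in a subshell so its variables do not leak into the caller.
    (
        . "$release_file" || exit 1
        version="${VERSION_ID:-}"
        # Strip everything from the first dot onward, e.g. 8.8 becomes 8.
        echo "${version%%.*}"
    )
}
```

After a successful upgrade, `os_major_version` would be expected to print 8. A value of 7 suggests the Leapp upgrade did not complete.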