This article covers the backup procedures recommended for different types of data, including:
- OS and software on the cluster nodes
- Database files
- Files on ACFS, if used
Backing up OS and software on the cluster nodes
It is strongly recommended to back up the OS and software volumes of all cluster nodes after the initial cluster configuration and before and after applying any changes, such as patch installation or security settings changes.
The OS (root) volume of a FlashGrid instance is generally identified as /dev/sda1.
The software volume of a FlashGrid instance (the /u01 file system) is generally identified as /dev/xvdz.
OS and software on a cluster node can be backed up by one or more of the following methods:
- creating an AMI for the instance
- creating snapshots of individual EBS volumes
- using AWS Backup to run a one-off or scheduled backup of EBS volumes
AMI-based backup is recommended because it makes it easier to recover an instance that was terminated because of a failure or human error.
To create a backup AMI for a cluster node:
- Flush the system cache by running the following command on the system:
# sync
- Create an AMI for the instance without rebooting the instance.
IMPORTANT! Remove data volumes from the list of volumes that will be backed up to the AMI. Typically only the root volume (/dev/sda1) and the software volume (/dev/xvdz), if present, must be included in the backup AMI. Other volumes must be excluded. Failure to exclude data volumes from the AMI will create inconsistency in ASM disk groups if the data volumes are later restored from the AMI. Data volumes should never be backed up or restored at the volume level. Instead, database/ACFS file backup procedures must be used as described further in this article.
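The exclusion of data volumes can be sketched in Python. This is a minimal illustration, assuming boto3 is installed and AWS credentials are configured; the names `KEEP_DEVICES`, `ami_block_device_mappings`, and `create_backup_ami` are hypothetical and not part of FlashGrid tooling. Every attached device other than the root and software volumes is marked with `NoDevice` so that it is left out of the AMI. Run sync on the node before calling it.

```python
# Devices that belong in the backup AMI per this article (root and software volumes).
# Assumption: adjust this set if your instances use different device names.
KEEP_DEVICES = {"/dev/sda1", "/dev/xvdz"}


def ami_block_device_mappings(attached_devices, keep=KEEP_DEVICES):
    """Return create_image mappings that suppress every device not in `keep`."""
    return [
        {"DeviceName": name, "NoDevice": ""}
        for name in sorted(attached_devices)
        if name not in keep
    ]


def create_backup_ami(instance_id, ami_name):
    """Create an AMI without rebooting, excluding data volumes (illustrative)."""
    import boto3  # imported lazily so the helper above works without boto3

    ec2 = boto3.client("ec2")
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    instance = reservations[0]["Instances"][0]
    attached = [m["DeviceName"] for m in instance["BlockDeviceMappings"]]
    response = ec2.create_image(
        InstanceId=instance_id,
        Name=ami_name,
        NoReboot=True,  # matches "without rebooting the instance"
        BlockDeviceMappings=ami_block_device_mappings(attached),
    )
    return response["ImageId"]
```

It is worth reviewing the resulting AMI in the console to confirm that only the intended volumes were captured.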
To create a snapshot of individual EBS volumes:
- Flush the system cache by running the following command on the system:
# sync
- Locate the volume that requires a snapshot in the AWS EC2 console and select Actions -> Create Snapshot.
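For scripted snapshots, the console step could be replaced by a call to the EC2 API. The following is a rough sketch, assuming boto3 and configured credentials; `snapshot_requests`, `create_snapshots`, and the description format are hypothetical. It should be given only the root and software volumes, never data volumes, and sync should be run on the node first.

```python
def snapshot_requests(instance_name, volumes):
    """Build create_snapshot keyword arguments for each device -> volume ID pair."""
    return [
        {
            "VolumeId": volume_id,
            "Description": f"{instance_name} {device} backup",
        }
        for device, volume_id in volumes.items()
    ]


def create_snapshots(instance_name, volumes):
    """Create one snapshot per volume and return the snapshot IDs (illustrative)."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2")
    return [
        ec2.create_snapshot(**kwargs)["SnapshotId"]
        for kwargs in snapshot_requests(instance_name, volumes)
    ]
```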
AWS Backup for EBS volumes
A one-time (On-demand) or scheduled (Backup plan) backup can be created for the root (/dev/sda1) and software volume (usually /dev/xvdz) for each server in a cluster.
For an On-demand backup, only one EBS volume can be backed up at a time. Choose EBS as the Resource type, and manually select the volume by Volume ID.
For a scheduled backup, either
- add the root & software volumes, identified by Volume ID, in Resource assignments, or
- tag the EBS volumes for backup using unique tag/value combinations, and specify those tags in the Refine selection using tags section on the Assign resources page.
Note: We do not recommend using AWS Backup for EC2 because it provides no way to exclude data (ASM) disks. Data volumes should never be backed up or restored at the volume level. Instead, database/ACFS file backup procedures must be used as described further in this article.
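The tag-based resource assignment for a scheduled backup could also be created through the AWS Backup API. This is a sketch under assumptions: boto3 is available, a backup plan and an IAM role for AWS Backup already exist, and the tag key and value are placeholders of your choosing. The function names are hypothetical. Because the selection matches any resource carrying the tag, the tag should be applied only to the root and software volumes.

```python
def tag_backup_selection(name, iam_role_arn, tag_key, tag_value):
    """Build a BackupSelection that picks resources by one tag key/value pair."""
    return {
        "SelectionName": name,
        "IamRoleArn": iam_role_arn,
        "ListOfTags": [
            {
                "ConditionType": "STRINGEQUALS",
                "ConditionKey": tag_key,
                "ConditionValue": tag_value,
            }
        ],
    }


def assign_tagged_volumes(backup_plan_id, selection):
    """Attach the selection to an existing backup plan (illustrative)."""
    import boto3  # imported lazily so the builder above works without boto3

    backup = boto3.client("backup")
    response = backup.create_backup_selection(
        BackupPlanId=backup_plan_id, BackupSelection=selection
    )
    return response["SelectionId"]
```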
Restoring root or software volume of a cluster node
The root volume of an instance is the EBS volume that has the OS installed on it. The device name of the root volume is /dev/sda1.
The software volume of an instance is the EBS volume where the Oracle software binaries are installed (it contains the /u01 directory). Typically the software volume has the device name /dev/xvdz.
The root volume or the software volume may need to be restored in case the volume fails, has file system corruption, or has logical corruption.
Restoring from a backup AMI image
To restore the root volume (/dev/sda1) or the software volume (/dev/xvdz):
- In the backup AMI for the affected instance, identify the snapshot ID for the affected volume.
- Using the snapshot ID, create a new volume in the availability zone where the affected instance is located.
- Stop the affected instance:
  - If the OS is running, stop the instance gracefully using the flashgrid-node poweroff command.
  - If the OS is not running, stop the instance using the AWS console or CLI.
- Detach the affected volume from the instance.
- Attach the newly created volume to the instance using the same device name (/dev/sda1 for the root volume, /dev/xvdz for the software volume).
- Start the instance.
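The restore steps above can be sketched with boto3. This is an illustration only, with hypothetical function names, and it assumes the caller passes in an EC2 client created with `boto3.client("ec2")`. It also assumes the OS is already down; if the OS is running, use flashgrid-node poweroff on the node first, as described above.

```python
def snapshot_id_for_device(image, device_name):
    """Return the snapshot ID backing `device_name` in one describe_images entry."""
    for mapping in image.get("BlockDeviceMappings", []):
        if mapping.get("DeviceName") == device_name and "Ebs" in mapping:
            return mapping["Ebs"]["SnapshotId"]
    raise KeyError(f"{device_name} not found in AMI block device mappings")


def replace_volume(ec2, instance_id, old_volume_id, snapshot_id, zone, device_name):
    """Swap a failed volume for a new one created from `snapshot_id`."""
    new_volume_id = ec2.create_volume(
        SnapshotId=snapshot_id, AvailabilityZone=zone
    )["VolumeId"]
    ec2.get_waiter("volume_available").wait(VolumeIds=[new_volume_id])

    # Stop via the API; intended for the case where the OS is not running.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    ec2.detach_volume(VolumeId=old_volume_id, InstanceId=instance_id)
    ec2.get_waiter("volume_available").wait(VolumeIds=[old_volume_id])

    # Reuse the original device name (/dev/sda1 or /dev/xvdz).
    ec2.attach_volume(
        VolumeId=new_volume_id, InstanceId=instance_id, Device=device_name
    )
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[new_volume_id])

    ec2.start_instances(InstanceIds=[instance_id])
    return new_volume_id
```

The `image` argument is one element of the `Images` list returned by `describe_images` for the backup AMI.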
Restoring from an AWS Backup (EBS)
To restore the root volume (/dev/sda1) or the software volume (/dev/xvdz):
- Locate the affected volume in the AWS Backup console on the Protected Resources page.
- Click the Resource ID that matches the ID of the volume to be restored.
- Identify the Recovery point ID (i.e. the snapshot) by its creation time and make a note of the snapshot ID (e.g. snapshot/snap-0d548fc13ee54d853).
- Open the AWS EC2 console, go to the Elastic Block Store -> Snapshots page, and search for the snapshot identified in the previous step.
- Select that snapshot and choose Actions -> Create volume from Snapshot.
- Change Volume type to General Purpose SSD (gp3), and ensure that the Size and Availability Zone values are correct.
- Click Create volume and, when the operation completes, make a note of the new volume ID.
- Stop the affected instance:
  - If the OS is running, stop the instance gracefully using the flashgrid-node poweroff command.
  - If the OS is not running, stop the instance using the AWS console or CLI.
- Detach the affected volume from the instance.
- Attach the newly created volume to the instance using the same device name (/dev/sda1 for the root volume, /dev/xvdz for the software volume).
- Start the instance.
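Locating the snapshot behind an AWS Backup recovery point can also be scripted. The sketch below assumes boto3 and configured credentials; the function names are hypothetical, and choosing the newest recovery point is an assumption made for illustration (pick the one whose creation time suits your restore). The parser relies on EBS recovery point identifiers ending in snapshot/snap-..., as in the example above.

```python
def snapshot_id_from_recovery_point(recovery_point_arn):
    """Extract 'snap-...' from an identifier ending in 'snapshot/snap-...'."""
    _, sep, snapshot_id = recovery_point_arn.rpartition("snapshot/")
    if not sep or not snapshot_id.startswith("snap-"):
        raise ValueError(f"not an EBS snapshot recovery point: {recovery_point_arn}")
    return snapshot_id


def gp3_volume_request(snapshot_id, zone):
    """Build create_volume arguments for a gp3 volume restored from a snapshot."""
    return {"SnapshotId": snapshot_id, "AvailabilityZone": zone, "VolumeType": "gp3"}


def newest_snapshot_for_volume(volume_arn):
    """Return the snapshot ID of the newest recovery point for a volume ARN."""
    import boto3  # imported lazily so the helpers above work without boto3

    backup = boto3.client("backup")
    points = backup.list_recovery_points_by_resource(ResourceArn=volume_arn)[
        "RecoveryPoints"
    ]
    newest = max(points, key=lambda point: point["CreationDate"])
    return snapshot_id_from_recovery_point(newest["RecoveryPointArn"])
```

The dictionary from `gp3_volume_request` can be passed to the EC2 client as `ec2.create_volume(**request)`, after which the stop, detach, attach, and start steps are the same as for an AMI-based restore.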
Restoring an instance that was accidentally terminated
Setting instance termination protection is strongly recommended to prevent accidental termination of a cluster node instance.
To restore an instance that was terminated:
- Launch a new instance using the backup AMI for the instance that was terminated. If recovering from an AWS Backup (EBS) backup, first locate the root volume backup by snapshot ID and create a new AMI from that snapshot.
- Make sure that the correct instance type, VPC, subnet, and security group are selected.
- Configure the same primary IP address that was used on the terminated instance.
- Specify the placement group corresponding to the cluster.
- Make sure that only the /dev/sda1 and /dev/xvdz (database nodes only) volumes are configured. Remove any other volumes if they are present in the AMI.
- Attach the data volumes to the new instance using the same device names (such as xvdba) that were previously used.
- Log in to the instance and bring the data disks online:
$ flashgrid-node online
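The launch settings listed above map onto parameters of the EC2 RunInstances call. The following sketch, assuming boto3 and with hypothetical function and argument names, builds such a request. It also enables termination protection at launch, in line with the recommendation at the start of this section. Attaching the data volumes and running flashgrid-node online remain separate steps after the instance is running.

```python
def relaunch_request(ami_id, instance_type, subnet_id, security_group_ids,
                     private_ip, placement_group, extra_ami_devices=()):
    """Build run_instances arguments for replacing a terminated cluster node."""
    request = {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "SubnetId": subnet_id,
        "SecurityGroupIds": list(security_group_ids),
        "PrivateIpAddress": private_ip,  # same primary IP as the old instance
        "Placement": {"GroupName": placement_group},
        "DisableApiTermination": True,  # termination protection
        "MinCount": 1,
        "MaxCount": 1,
    }
    # Suppress any volumes in the AMI other than /dev/sda1 and /dev/xvdz.
    if extra_ami_devices:
        request["BlockDeviceMappings"] = [
            {"DeviceName": name, "NoDevice": ""} for name in extra_ami_devices
        ]
    return request


def relaunch(request):
    """Launch the replacement instance and return its ID (illustrative)."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2")
    return ec2.run_instances(**request)["Instances"][0]["InstanceId"]
```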
Backing up and restoring database files
Use standard RMAN procedures for backing up and restoring database files. The two recommended options for backup storage destination are:
- Amazon S3. Provides maximum flexibility with easy shared access to the backup files.
- An EBS volume with a local file system. Provides maximum performance, with up to 500 MB/s of read/write bandwidth on an st1 volume.
For information about backing up to S3 see the following documentation from Oracle and AWS:
- White paper: Oracle Database Backup To Cloud: Amazon Simple Storage Service (S3)
- Oracle Secure Backup (OSB) Cloud Module
To configure an EBS volume as a backup storage destination:
- Create an EBS volume in the availability zone where the instance running RMAN is located. The st1 volume type is recommended.
- Attach the volume to the instance running RMAN. Select a device name in the xvdc to xvdg range. Disks in this name range are treated as local and are not shared by FlashGrid Storage Fabric.
- Format the volume with a local file system (XFS recommended) and create a mount point for it.
- Use standard RMAN procedures to configure backup to the local file system.
Note that an EBS volume can be moved only between instances in the same availability zone. However, a snapshot of the volume can be used to clone the volume to a different availability zone.
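When the backup volume is created and attached by a script, a small guard can catch a device name outside the local range before the volume is attached. This is a sketch with hypothetical helper names; the xvdc to xvdg range and the st1 type come from the steps above, and the resulting dictionary is meant for the boto3 EC2 client's `create_volume` call.

```python
# Device names that the steps above describe as local (not shared): xvdc..xvdg.
LOCAL_DEVICE_NAMES = tuple(f"xvd{letter}" for letter in "cdefg")


def is_local_device_name(device_name):
    """Return True if the name, with or without a /dev/ prefix, is in xvdc..xvdg."""
    prefix = "/dev/"
    name = device_name[len(prefix):] if device_name.startswith(prefix) else device_name
    return name in LOCAL_DEVICE_NAMES


def backup_volume_request(zone, size_gib):
    """Build create_volume arguments for an st1 backup volume in `zone`."""
    return {"AvailabilityZone": zone, "Size": size_gib, "VolumeType": "st1"}
```

For example, `is_local_device_name("xvdba")` is false, since that name falls in the range used for data volumes in this article.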
Backing up Grid Infrastructure configuration files
Please follow Backup Best Practices: Grid Infrastructure Configuration
Backing up and restoring files on ACFS
To back up and restore files on ACFS, use the same tools and procedures that you would normally use for file-level backup and restore.