This article describes the steps required to use Real Application Testing (RAT) to assess the overall impact of system changes, such as Oracle version upgrades or on-premises to cloud migrations.
It is possible to use Oracle Enterprise Manager (OEM) for Workload Capture and Replay, but that method is not discussed here. Instead, this article describes the PL/SQL API, because it can be easily scripted and automated.
Note: RAT requires a license from Oracle.
Capturing a Database Workload
There are two options for capturing the workload: in the Root container or in a pluggable database. It is recommended to run the workload capture in the Root container. If you capture the workload in a pluggable database, be aware of "ORA-20027 Workload Repository RAC Compare Period Report is not supported in this container" on 19.7 when running the AWR Global Diff Report (dbms_workload_repository.awr_global_diff_report_html) (Doc ID 2730820.1).
Before Capturing a Database Workload
It is best to replay the captured workload on an exact copy of the capture database. We recommend making the copy using the RMAN DUPLICATE command.
Creating a Capture Directory
You need a shared location that is accessible from all RAC nodes; an ACFS filesystem is one possible option. Create a database directory pointing to that location; it will be used to store the capture files. For example:
create or replace directory wrc_dir as '/acfsmounts/acfs1/wrc';
To estimate the amount of space needed for the workload capture, we recommend the following approach:
- Capture the workload for some short time period that is a good representation of the overall workload;
- Approximate the amount of space needed based on what is required for the sample period;
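The size of the sample capture can then be read from DBA_WORKLOAD_CAPTURES and extrapolated to the full capture window. A minimal sketch (CAPTURE_SIZE is reported in bytes):

```sql
select id,
       name,
       capture_size / 1024 / 1024 as capture_mb
  from dba_workload_captures
 order by id;
```

For example, if a representative 15-minute sample produced 2 GB of capture files, an 8-hour capture of a similar workload would need roughly 64 GB.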
Starting a Workload Capture
We advise capturing the workload for a short period, 10-15 minutes. The captured workload is often replayed multiple times, and it is much quicker to replay a 10-15 minute period than several hours of your production workload. To start a workload capture, run the following code:
begin
  dbms_workload_capture.start_capture(
    name       => 'capture_test',
    dir        => 'WRC_DIR',
    duration   => 300,
    plsql_mode => 'extended');
end;
/
- It is better to start the capture when there are no, or few, in-flight transactions. Such transactions can otherwise increase data divergence during the subsequent replays.
- Either set a fixed duration, such as 300 seconds in the example above, or call DBMS_WORKLOAD_CAPTURE.FINISH_CAPTURE later on.
- PLSQL_MODE: it is set to extended due to the following reasons:
- it often leads to more successful replays (less data divergence);
- it is still possible to replay such workload with PLSQL_MODE=top_level if needed.
- Some activities are not captured. Please refer to the Oracle documentation for the full list.
- It is not recommended to add any filters at the capture stage (e.g. via the ADD_FILTER procedures). It is easy to miss something, and filters can be applied at the replay stage instead.
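If the capture was started without a fixed duration, stop it explicitly once the sample period is over. A minimal sketch:

```sql
begin
  dbms_workload_capture.finish_capture;
end;
/
```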
Refer to the Oracle documentation of the START_CAPTURE procedure for additional information.
Gathering Data about a Workload Capture
Once the workload has been captured, we recommend saving the contents of DBA_WORKLOAD_CAPTURES:
select * from dba_workload_captures;
- START_SCN: it will be used to instantiate the replay database at this change number.
- It is also worth checking other columns from this view, such as TRANSACTIONS, %CALLS, and ERRORS; they can give an idea of the workload.
Then export the AWR data using the appropriate CAPTURE_ID from DBA_WORKLOAD_CAPTURES:
exec dbms_workload_capture.export_awr (capture_id => 1)
Replaying a Database Workload
Creating the Replay System
The replay database should be instantiated at the same SCN as DBA_WORKLOAD_CAPTURES.START_SCN from the previous section. The recommended way to do this is the RMAN DUPLICATE command. Before proceeding with the further steps, make sure a restore point has been created; it can be used later if you need to replay the same workload more than once.
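The steps above can be sketched as follows. The database names, SCN value, and connection strings are assumptions for illustration; note that UNTIL SCN requires backup-based duplication:

```shell
# RMAN, from a host that can reach both databases:
# instantiate the replay database at the capture START_SCN
rman target sys@prod auxiliary sys@replay
duplicate target database to replay until scn 123456789;

# SQL*Plus on the replay database:
# a guaranteed restore point allows flashing back between replays
create restore point before_replay guarantee flashback database;
```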
Initializing the Replay Session
begin
  dbms_workload_replay.process_capture(
    capture_dir => 'WRC_DIR',
    plsql_mode  => 'extended');
end;
/

begin
  dbms_workload_replay.initialize_replay(
    replay_name => 'replay_test1',
    replay_dir  => 'WRC_DIR',
    plsql_mode  => 'extended');
end;
/
The PLSQL_MODE can be adjusted here as needed; find the value that works best for your environment. top_level is a safe default. PLSQL_MODE=extended was introduced later, and we have found that it sometimes gives worse results than top_level.
Remapping Database Connections
RAC is quite sensitive to inter-instance communication. Therefore, it is recommended to replay the workload in the same manner it was running on the capture system: the sessions should be mapped to the same instances they used in the original environment. Although it is possible to map all sessions to the SCAN address, that results in an even load distribution, which might differ from the capture system. Use DBA_WORKLOAD_CONNECTION_MAP to see the current mappings:
select * from dba_workload_connection_map;
Use DBMS_WORKLOAD_REPLAY.REMAP_CONNECTION to adjust them:
begin
  dbms_workload_replay.remap_connection(connection_id => 1, replay_connection => 'replay-rac2:1522/pdb');
  dbms_workload_replay.remap_connection(connection_id => 2, replay_connection => 'replay-scan/pdb');
  dbms_workload_replay.remap_connection(connection_id => 3, replay_connection => 'replay-rac1:1522/pdb');
end;
/
Preparing the Replay Files
This is another step where settings can be adjusted if replay performance turns out to be insufficient:
begin
  dbms_workload_replay.prepare_replay(
    synchronization  => 'SCN',
    --think_time_scale => 50,
    rac_mode         => dbms_workload_replay.global_sync
    --rac_mode         => dbms_workload_replay.per_instance_client
  );
end;
/
- synchronization: start with SCN. If it does not work well and results in a slow replay, switch it to TIME.
- rac_mode: DBMS_WORKLOAD_REPLAY.GLOBAL_SYNC is the default option. If the synchronization causes high overhead, switch it to DBMS_WORKLOAD_REPLAY.PER_INSTANCE_CLIENT
- think_time_scale: it can be used to speed up the replay when needed.
Warming up the Replay system
When the replay instances have just started, their buffer caches are empty. If we run the replay at that time, Oracle will aggressively prefetch data. This can distort the replay results and will likely put the replay system at an unfair disadvantage.
Therefore, it is recommended to warm up the replay system first. If any large cached tables are heavily accessed in the capture system, make sure to cache them in the replay system too before running the replay.
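One simple way to pre-warm the buffer cache is to read the hot tables once, for example with a full scan. The schema and table names below are assumptions for illustration:

```sql
-- Mark the table for caching and read it once to load it into the buffer cache
alter table app_owner.big_hot_table cache;
select /*+ full(t) */ count(*) from app_owner.big_hot_table t;
```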
Also, the capture database may already have object affinity in place, as well as read-mostly objects. Neither of these may yet be present in the replay database, which could incur additional global cache overhead.
At a minimum, set the read-mostly flag for any read-mostly objects you have in your capture database. Find them using the query below:
select obj.object_name,
       obj.owner,
       obj.object_type,
       gcs.inst_id,
       gcs.data_object_id,
       gcs.gc_mastering_policy,
       gcs.current_master,
       gcs.previous_master,
       gcs.remaster_cnt,
       gcs.total_access,
       gcs.remote_access
  from gv$gcspfmaster_info gcs,
       cdb_objects obj
 where obj.con_id = gcs.con_id
   and obj.data_object_id = gcs.data_object_id
 order by gcs.data_object_id, gcs.inst_id;
And set the read-mostly property for all read-mostly objects:
exec dbms_cacheutil.grab_readmostly('OWNER', 'OBJECT_NAME', null, false)
Then run the query against GV$GCSPFMASTER_INFO again to make sure the read-mostly property has been set. If it has not, query the objects, which can be tables or indexes, and check again.
As an alternative to this method, you can create a capture subset using the GENERATE_CAPTURE_SUBSET procedure and run it before the remaining workload. The sole purpose of this subset is to prepare the database for the main workload that will be assessed: loading the data into the buffer cache and setting the object affinity. Please note: RAC global cache decisions are made every 20 minutes by default (the _gc_policy_time parameter), and no more than 100 objects are processed each time (the _lm_drm_max_requests parameter). Select the duration of the subset workload accordingly.
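A sketch of extracting the first five minutes of the capture into a warm-up subset. The output directory object, subset name, and time offsets (seconds from the start of the capture) are assumptions; check the DBMS_WORKLOAD_REPLAY documentation for the exact parameters in your version:

```sql
begin
  dbms_workload_replay.generate_capture_subset(
    input_capture_dir        => 'WRC_DIR',
    output_capture_dir       => 'WRC_SUBSET_DIR',
    new_capture_name         => 'capture_warmup',
    begin_time               => 0,
    begin_include_incomplete => true,
    end_time                 => 300,
    end_include_incomplete   => false);
end;
/
```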
Estimating the Number of Replay Clients
The rule of thumb is to use as many wrc clients as the calibrate mode suggests:
wrc mode=calibrate replaydir=/acfsmounts/acfs1/wrc
We have seen two types of issues caused by an insufficient number of replay clients:
- High memory consumption by the replay clients
- Slow database replays
Starting Workload Replay Clients
Start as many replay clients as was suggested by the calibration procedure above:
# If rac_mode is dbms_workload_replay.global_sync
wrc system/Oracle123@//replay-scan/orcl mode=replay replaydir=/acfsmounts/acfs1/wrc

# If the rac_mode you have chosen is dbms_workload_replay.per_instance_client
wrc system/Oracle123@//replay-rac1:1522/orcl mode=replay replaydir=/acfsmounts/acfs1/wrc
Then start the replay:
begin
  dbms_workload_replay.start_replay();
end;
/
Monitoring Replay Sessions
Replay sessions are regular database sessions and can be monitored using the same tools. There are also a couple of replay-specific views, namely DBA_WORKLOAD_REPLAYS and V$WORKLOAD_REPLAY_THREAD.
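For example, the overall replay progress can be checked with a query like the following (a sketch; available columns may vary by version):

```sql
select id, name, status
  from dba_workload_replays
 order by id;
```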
Analyzing the Replay Results
Generating the Replay Report
After the replay has finished, generate the replay report using the DBMS_WORKLOAD_REPLAY.REPORT function:
select dbms_workload_replay.report(1, 'HTML') from dual;
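The report is returned as a CLOB; one way to save it to a file is via SQL*Plus spooling (the file name is an assumption):

```sql
set long 100000000 longchunksize 100000 linesize 32767 pagesize 0 trimspool on
spool replay_report.html
select dbms_workload_replay.report(1, 'HTML') from dual;
spool off
```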
Generating the AWR Reports
Load the capture AWR data into the database:
create user c##capture_awr identified by capture_awr;
grant unlimited tablespace to c##capture_awr;
select dbms_workload_capture.import_awr(capture_id => 1, staging_schema => 'C##CAPTURE_AWR') from dual;
Generate an AWR Global Compare Period report, either using a PL/SQL function:
select *
  from table(dbms_workload_repository.awr_global_diff_report_text(
         dbid1     => 591014135,
         inst_num1 => cast(null as varchar2(1)),
         bid1      => 12,
         eid1      => 13,
         dbid2     => 992088716,
         inst_num2 => null,
         bid2      => 144,
         eid2      => 145));
Or using an SQL*Plus script:
define num_days = 3;
define dbid = 4;
define instance_numbers_or_ALL = '1,2,3';
define begin_snap = 10;
define end_snap = 11;
define dbid2 = 1345;
define instance_numbers_or_ALL2 = '9,12,15';
define begin_snap2 = 120;
define end_snap2 = 121;
define report_type = 'text';
define report_name = /tmp/compreport1.txt
@?/rdbms/admin/awrgdrpi