AWS DMS CDC and AWS DMS Limitations for Oracle Sources

In this blog we will examine AWS DMS replication for data from Oracle sources to AWS and AWS DMS limitations for Oracle replication. If you have Oracle databases to migrate to AWS and need to do it without downtime, the AWS Database Migration Service could be the migration solution you are searching for. BryteFlow’s Oracle Replication

Quick Links

AWS Database Migration Service (AWS DMS) Overview

AWS Database Migration Service (AWS DMS) is AWS’s database migration tool. It is a cloud service that enables data migration from relational databases, data warehouses, NoSQL databases, and other data repositories to AWS Cloud destinations securely and fast. AWS DMS supports homogenous and heterogenous migrations and has very little impact on the operation of the source database during the migration, which means there is minimal downtime for applications that rely on the database.

AWS Database Migration Service or AWS DMS migrates your data to and from popular databases – both open source and commercial. AWS DMS continuously replicates data with low latency from source to target. Examples of use cases include replicating data from multiple sources to Amazon S3 to create a scalable data lake or replicating petabytes of data from various databases to Amazon Redshift to build an enterprise data warehouse. It can also migrate databases to Amazon Aurora, Amazon DynamoDB and Amazon DocumentDB, Snowflake on AWS etc. Compare AWS DMS with BryteFlow

AWS DMS can only be used where at least one database resides in AWS – either source or target. With AWS DMS it is possible to do one-time migrations or capture changes continually to sync source and target databases. When solely AWS DMS is used to replicate changes, the time or system change number (SCN) must be specified so AWS DMS can read changes from the database logs. You need to keep the logs accessible on the server for a duration to ensure AWS DMS can access the changes. About AWS ETL

AWS DMS Oracle Support

AWS DMS supports these Oracle database editions: Oracle Enterprise Edition, Oracle Standard Edition, Oracle Express Edition, and Oracle Personal Edition. If you have a self-managed database, AWS DMS supports versions 10.2 onwards (for versions 10.x), 11g and upto 12.2, 18c, and 19c. For Oracle databases on Amazon RDS there is AWS DMS support for versions 11g (versions 11.2.0.4 and later) and upto 12.2, 18c, and 19c. Source

SSL Encryption for Connection from Oracle to AWS

Oracle TDE (Transparent Data Encryption) is supported by AWS DMS to encrypt connections between the Oracle endpoint and replication instance and for encryption of data at rest in source. Reasons, Challenges and Tools in Oracle to SQL Server migration

Steps for setting up an Oracle database as an AWS DMS source

  • Create an Oracle user with suitable permissions so AWS DMS can access the Oracle source.
  • The Oracle source endpoint should conform with the selected Oracle database configuration. No other configuration is required for a ‘full load only’ task.
  • For an AWS DMS replication task that manages change data capture, either ‘CDC only’, or ‘full load plus CDC’, select Oracle LogMiner (default) or AWS DMS Binary Reader to capture incremental changes. Subsequent permissions and configuration options will be based on the choice you make. More on Oracle CDC

Oracle LogMiner Advantages over Binary Reader for AWS DMS CDC

  • Oracle LogMiner supports options for Oracle such as encryption and compression -not all of which are supported by Binary Reader.
  • LogMiner provides simpler configuration as compared to Binary Reader’s direct access setup, and when redo logs are managed with Oracle Automatic Storage Management (ASM).
  • Oracle LogMiner supports table clusters that can be used by AWS DMS unlike Binary Reader.

Workflows for setting up a self-managed Oracle source for AWS DMS CDC using Oracle LogMiner

Here are the workflows you need to follow when you set up Oracle AWS DMS CDC replication to AWS. Source

Grant Oracle account privileges for Self-managed database

For an Oracle database to be used as a source in ASWS DMS, some account privileges need to be provided to the user specified in the Oracle endpoint connection settings. In this case we need to grant Oracle account privileges for a self-managed database. Note: A self-managed database is a local on-premise database instance or an Amazon EC2 database that you configure and manage as compared to an AWS-managed database that resides in an AWS service such as Amazon RDS, Amazon Aurora, or Amazon S3. About Oracle Real-time Replication

When granting privileges, use the exact object names, not something that is similar to the object name. For example, when using V_$OBJECT include the underscore, don’t use V$OBJECT without the underscore – it must be the exact name.

GRANT CREATE SESSION TO db_user;
GRANT SELECT ANY TRANSACTION TO db_user;
GRANT SELECT ON V_$ARCHIVED_LOG TO db_user;
GRANT SELECT ON V_$LOG TO db_user;
GRANT SELECT ON V_$LOGFILE TO db_user;
GRANT SELECT ON V_$LOGMNR_LOGS TO db_user;
GRANT SELECT ON V_$LOGMNR_CONTENTS TO db_user;
GRANT SELECT ON V_$DATABASE TO db_user;
GRANT SELECT ON V_$THREAD TO db_user;
GRANT SELECT ON V_$PARAMETER TO db_user;
GRANT SELECT ON V_$NLS_PARAMETERS TO db_user;
GRANT SELECT ON V_$TIMEZONE_NAMES TO db_user;
GRANT SELECT ON V_$TRANSACTION TO db_user;
GRANT SELECT ON V_$CONTAINERS TO db_user;                  
GRANT SELECT ON ALL_INDEXES TO db_user;
GRANT SELECT ON ALL_OBJECTS TO db_user;
GRANT SELECT ON ALL_TABLES TO db_user;
GRANT SELECT ON ALL_USERS TO db_user;
GRANT SELECT ON ALL_CATALOG TO db_user;
GRANT SELECT ON ALL_CONSTRAINTS TO db_user;
GRANT SELECT ON ALL_CONS_COLUMNS TO db_user;
GRANT SELECT ON ALL_TAB_COLS TO db_user;
GRANT SELECT ON ALL_IND_COLUMNS TO db_user;
GRANT SELECT ON ALL_ENCRYPTED_COLUMNS TO db_user;
GRANT SELECT ON ALL_LOG_GROUPS TO db_user;
GRANT SELECT ON ALL_TAB_PARTITIONS TO db_user;
GRANT SELECT ON SYS.DBA_REGISTRY TO db_user;
GRANT SELECT ON SYS.OBJ$ TO db_user;
GRANT SELECT ON DBA_TABLESPACES TO db_user;
GRANT SELECT ON DBA_OBJECTS TO db_user; -– Required if the Oracle version is earlier than 11.2.0.3.
GRANT SELECT ON SYS.ENC$ TO db_user; -– Required if transparent data encryption (TDE) is enabled.

Grant the additional following privilege for each replicated table when using a specific table list.
GRANT SELECT on any-replicated-table to db_user;

Grant the additional following privilege to validate LOB columns with the validation feature.
GRANT EXECUTE ON SYS.DBMS_CRYPTO TO db_user;

Grant the additional following privilege to expose views.
GRANT SELECT on ALL_VIEWS to dms_user;

To expose views, you must also add the exposeViews=true extra connection attribute to your source endpoint.

The source database needs to be prepared for AWS DMS CDC replication

The first step is to verify that AWS DMS supports the source database version with this script.

SELECT name, value, description FROM v$parameter WHERE name = ‘compatible’;

The name, value, and description are columns that are part of the database and are being queried based on the value of name. If the query runs without error, it means the current version of the database is supported by AWS DMS CDC and the migration can be continued. If the query returns an error, AWS DMS doesn’t support the database version, you will need to first convert the Oracle database to a version supported by AWS DMS. More on Oracle CDC

Ensure that the ARCHIVELOG mode is on

You can run Oracle in two different modes: the ARCHIVELOG mode and the NOARCHIVELOG mode. For running an AWS DMS CDC task, run the database in ARCHIVELOG mode. To ascertain if the database is in ARCHIVELOG mode, execute the following query:

SQL> SELECT log_mode FROM v$database;

If query returns NOARCHIVELOG mode, set the database to ARCHIVELOG according to Oracle instructions.

Set up supplemental logging

Provide additional Oracle user privileges required for AWS DMS CDC

When using Oracle LogMiner to access redo logs, you need to grant further account privileges to the user specified in Oracle endpoint connection settings.

Account privileges required when using Oracle LogMiner to access the redo logs

GRANT EXECUTE on DBMS_LOGMNR to db_user
GRANT SELECT on V_$LOGMNR_LOGS to db_user
GRANT SELECT on V_$LOGMNR_CONTENTS to db_user
GRANT LOGMINING to db_user; -– Required only if the Oracle version is 12c or later Source

AWS DMS Limitations for Oracle Replication

When replicating Oracle database to AWS using AWS DMS, the following AWS DMS limitations must be kept in mind:

  • Overall, the AWS Database Migration Service lacks the power to process complex stored procedures. Another limitation is configuration restrictions. AWS DMS cannot handle some configurations like HTTP/ HTTPS and SOCKS proxy. These configurations are an AWS DMS limitation and are not supported by AWS DMS.
  • Oracle Extended Data Types are not supported by AWS DMS currently.
  • Long object names (more than 30 bytes) are not supported by AWS DMS
  • AWS DMS does not support indexes that are function-based
  • AWS DMS does not have support for the multi-tenant container root database (CDB$ROOT). However, a PDB can be supported using Binary Reader
  • Deferred constraints are not supported by AWS DMS
  • Oracle SecureFile LOBs (Large Objects) are not supported by AWS DMS. The SecureFiles functionality enables storage allowing for encryption for security purposes and saves space by using compression and deduplication.  
  • Rename table table-name is supported to new-table-name syntax for all supported Oracle versions 11 and onward. Oracle version 10 source databases however do not support the syntax.
  • Data changes that result from partition or sub-partition operations (ADD, DROP, EXCHANGE, and TRUNCATE) cannot be replicated by AWS DMS. These updates may cause errors during the replication process.
  • For replication of changes ensuing from partition or sub-partition operations, reload the tables involved, after adding a new empty partition, operations on it can be replicated to the destination as usual.
  • Data changes that originate from running CREATE TABLE AS statement on the source are not supported by AWS DMS. A new table can be created on target, however.
  • Changes made by the Oracle DBMS_REDEFINITION package are not captured by AWS DMS, for e.g. the table metadata and the the OBJECT_ID field.
  • In the Oracle database empty BLOB and CLOB columns are mapped to NULL on target by AWS DMS. BLOB (Binary Large Object) datatype is used for storing unstructured binary large objects e.g. multimedia files – images, audio, and video. The CLOB (Character Large Object) datatype stores textual data in the database character set. It supports both – fixed-width and variable-width character sets. This data type is used for storing documents, very large strings, or XML documents without schema.
  • If Oracle 11 LogMiner is used to capture changes, update on a CLOB column with a string length more than 1982 will be lost and the target will not be updated.
  • AWS DMS does not support batch updating of numeric columns defined as a primary key.
  • AWS DMS doesn’t support certain UPDATE commands.
  • AWS DMS truncates data longer than 64KB in LONG or LONG RAW columns to 64 KB – this is an issue since LONG and LONG RAW columns of an Oracle database can contain up to 2 GB data.
  • Tables with table names that have apostrophes cannot be replicated by AWS DMS.
  • CDC from Dynamic Views is not supported by AWS DMS.
  • Index-organized tables with an overflow segment are not supported for AWS DMS CDC.

Compare AWS DMS Replication with BryteFlow

AWS DMS limitations while using Oracle LogMiner to access the redo logs:

  • AWS DMS cannot replicate changes to LOB columns when Oracle 12 is the source database.
  • UPDATE operations on XMLTYPE and LOB columns cannot be replicated by AWS DMS for any Oracle version
  • Results of the DDL statement ALTER TABLE ADD column data_type DEFAULT default_value cannot be replicated by AWS DMS. The new column is set to NULL rather than replicating the default_value.
  • For a new column that is nullable, Oracle updates the table rows before DDL is logged thus the changes are captured by AWS DMS but not updated in the target.
  • XA transactions in replication are not supported by AWS DMS when Oracle LogMiner is used.
  • Connections to an Amazon RDS Oracle source are not supported by AWS DMS when using an Oracle Automatic Storage Management (ASM) proxy.
  • Virtual columns are not supported by AWS DMS.
  • ROWID data type or materialized views based on a ROWID column are not supported by AWS DMS.
  • Global temporary tables cannot be captured or loaded by AWS DMS.
  • When a row with a composite unique key that contains null has an update, it cannot be replicated at target.
  • Multiple Oracle TDE encryption keys on the same source endpoint are not supported by AWS DMS. Every endpoint can have one attribute only for TDE encryption Key Name "securityDbEncryptionName", and one TDE password for the key.

Source

BryteFlow’s Oracle Replication as compared to AWS DMS Replication

  • Using AWS DMS to replicate Oracle data to AWS involves a sizeable amount of coding. BryteFlow replicates Oracle data with zero coding for any process including SCD Type2, masking and tokenization. Compare AWS DMS with BryteFlow
  • Unlike AWS DMS, BryteFlow provides preserves the history of every transaction and provides time-stamped data.
  • Both AWS DMS and BryteFlow provide Oracle replication with CDC (Change Data Capture). However, the CDC process is completely automated with BryteFlow while AWS DMS CDC requires coding. More on Oracle CDC
  • BryteFlow provides automated partitioning on the Amazon S3 data lake by configuring columns automatically. File types and compression supported include – ORC (snappy), ORC(zlib), Parquet(snappy), gzip, bzip2. AWS DMS does not provide automated partitioning and compression. How to choose between ORC, AVRO and Parquet
  • AWS DMS delivers deltas to S3 while BryteFlow delivers delta files and fully merged data. BryteFlow orchestrates EMR to merge incremental data automatically with data on S3 and provides automated upserts without using Apache Hudi or other third-party tools. AWS DMS needs Apache Hudi-EMR integration to apply and merge the changes (upsert) with the target tables, which requires a lot of coding. Create a Data Lakehouse on Amazon S3
  • AWS DMS has limited support for extraction of large volumes of data whereas BryteFlow Ingest replicates petabytes of data with high throughput in real-time to AWS. BryteFlow XL Ingest offers smart partitioning and multi-threaded parallel loading for fast initial full bulk loads.
  • AWS DMS does not handle schema and code conversion well. When migrating data as part of a homogenous migration, it tries to create a target schema at destination. This isn’t always possible as in the case of Oracle databases. You may need third party tools to create schema. AWS DMS cannot do schema conversions in the case of heterogenous migrations and additional effort will be involved using the AWS Schema Conversion Tool (SCT). BryteFlow provides easy file configuration, compression and ready-to-use data on target. See how BryteFlow Works
  • Maintaining AWS DMS on an on-going basis can be challenging at times with daily intervention required in some environments, especially when records go missing in the replication. With BryteFlow you can automatically reconcile data with the Oracle source to validate completeness. It provides automated DDL and creates tables automatically on the target in AWS.

Get a Free Trial of BryteFlow