Limitations and considerations
The following limitations apply:
- UPDATE/DELETE DMLs are not supported during change processing. If an UPDATE/DELETE DML was captured on the source, it will be ignored on the target and a warning will be written to the log. If the Store Changes option is enabled in the task settings, these records will be written to the Change Table.
- Limited LOB support only.
- Unsupported DDL operations:
  - Changes to a column's data type or data type length will not be captured and will result in the table being suspended.
  - A DROP COLUMN operation on the source will set NULL values in the corresponding target column.
  - A RENAME COLUMN operation on the source will set NULL values in the corresponding target column (both behaviors are illustrated below).
  - The Rename Table DDL is not supported when using AWS Glue Catalog as the Hive metastore. Tables updated with Rename Table DDL operations will be suspended.

  Information note: When Replicate is set to ignore DDL changes, ADD COLUMN, RENAME TABLE, DROP TABLE, and TRUNCATE TABLE DDL operations will be ignored. Unsupported DDL operations will not be ignored, but they will also not be applied to the target; instead, they will behave as described above.
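  To make the DROP COLUMN and RENAME COLUMN behaviors concrete, consider the following hypothetical source-side DDLs (table and column names are illustrative):

  ```sql
  -- Captured on the source but not applied to the target:
  ALTER TABLE employees DROP COLUMN phone;                 -- target column "phone" is set to NULL
  ALTER TABLE employees RENAME COLUMN email TO email_new;  -- target column "email" is set to NULL
  ```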
- The following Control Tables are not supported, as they require unsupported UPDATE/DELETE operations:
  - Replication Status (requires UPDATE). Name on target: attrep_status
  - Suspended Tables (requires DELETE). Name on target: attrep_suspended_tables
- Table and column names can only contain ASCII characters. Column names cannot contain the '?' symbol. If needed, the '?' symbol can be replaced with a valid ASCII character using a transformation.
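  For example, such a replacement can be defined as a global transformation rule that renames columns. The expression below is a minimal sketch, assuming the SQLite-style replace() function and the $AR_M_SOURCE_COLUMN_NAME variable are available in your version's expression builder; verify both before relying on them:

  ```sql
  -- Hypothetical rename expression: replace every '?' in the source column name with an underscore.
  replace($AR_M_SOURCE_COLUMN_NAME, '?', '_')
  ```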
- Connecting via an HTTP proxy server with authentication is not supported.
- Replication to Delta tables is not supported.
- The Batch optimized apply Change Processing mode is not supported.
- The Commit rate during full load option is not relevant.
- The Create primary key or unique index after full load completes option is not relevant.
- If you change the Databricks storage access method endpoint setting while a task is running, or if you create a new task with a Databricks target whose storage access method differs from that of existing tasks, you must also perform the procedures described in Setting general connection properties.
- As support for the VARCHAR data type was introduced in Databricks 8, creating a task that replicates to Databricks 8.x and then downgrading the cluster to an earlier Databricks version (in the same task) requires the task to be reloaded.
- Limitations when using the Parquet file format:
  - When the DDL Handling Policy for source tables is set to Ignore ALTER, the RENAME COLUMN DDL (on a Primary Key column) is not supported.
  - LOB columns larger than 1 MB are not supported.
- Unity Catalog limitations and considerations:
  - When Change Data Partitioning is turned on in the Store Changes Settings tab, Replicate will not create actual partitions in Databricks. Instead, it will simulate partitions by copying the Change Table data files to subfolders.
  - When both Use Unity Catalog and All-purpose clusters are selected (in the endpoint settings' General tab), the Sequence storage format is not available for selection. Choose Text or Parquet instead (in the Advanced tab).
    Information note: If you want to enable the Use Unity Catalog and All-purpose clusters options but already have existing target tables in Sequence format, you need to do the following:
    1. Stop the task (only relevant for running tasks).
    2. Select the Target storage format (Text or Parquet) in the endpoint settings.
    3. Manually migrate the target tables to the selected storage format (a sketch follows this note).
    4. Resume the task (only relevant for running tasks).
    If you fail to do this, Replicate will suspend any tables in Sequence format, which will require you to reload the target.
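    The manual migration in step 3 can be performed in a Databricks notebook or SQL editor. The statements below are a minimal sketch, assuming Parquet was selected as the new storage format; the table name is illustrative, so verify the approach against your own schema and Databricks version before using it:

    ```sql
    -- Hypothetical migration of one target table from Sequence to Parquet.
    -- Repeat for each target table; mydb.orders is an illustrative name.
    CREATE TABLE mydb.orders_parquet USING PARQUET AS SELECT * FROM mydb.orders;
    DROP TABLE mydb.orders;
    ALTER TABLE mydb.orders_parquet RENAME TO mydb.orders;
    ```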