Analytics repository schema
In addition to the data provided by the Enterprise Manager Analytics dashboards, organizations can also use their own BI tools to generate reports based on the available metrics. To assist with this, the following topic provides a description of the Analytics repository schema together with a couple of sample queries.
The Analytics repository contains the following tables:
- aem_endpoint_type
- aem_meta_source_database
- aem_meta_target_database
- aem_server
- aem_source_database
- aem_target_database
- aem_target_processes
- aem_task
- aem_task_name
- aem_task_previous_metrics
- aem_task_profile
- aem_task_state
- aem_task_stop_reason
aem_endpoint_type
The aem_endpoint_type table provides lookup information about the Replicate endpoints.
Column | Data Type | Description |
---|---|---|
endpoint_type_id |
Integer |
The endpoint type ID. |
endpoint_type_name |
Character varying(1024) |
The endpoint type. |
aem_meta_source_database
The aem_meta_source_database table provides lookup information about the source endpoint.
Column | Data Type | Description |
---|---|---|
meta_source_database_id |
Integer |
The source endpoint ID. |
meta_source_database_name |
Character varying(1024) |
The source endpoint name. |
endpoint_type_id |
Integer |
The endpoint type ID. |
aem_meta_target_database
The aem_meta_target_database table provides lookup information about the target endpoint.
Column | Data Type | Description |
---|---|---|
meta_target_database_id |
Integer |
Target Endpoint ID |
meta_target_database_name |
Character varying(1024) |
Target Endpoint Name |
endpoint_type_id |
Integer |
Endpoint Type ID |
aem_server
The aem_server table provides lookup information about the Replicate server.
Column | Data Type | Description |
---|---|---|
server_id |
Integer |
The Replicate server ID. |
server_name |
Character varying(1024) |
The Replicate server name. |
Host |
Character varying(1024) |
The host name or IP address. |
port |
Integer |
The Replicate port. |
platform |
Character varying(1024) |
The Replicate server platform. |
version |
Character varying(1024) |
The Replicate version. |
aem_source_database
The aem_source_database table provides information about the source endpoint used in a specific task.
Column | Data Type | Description |
---|---|---|
task_id |
Integer |
The task run ID. |
source_database_id |
Integer |
The source endpoint ID. |
source_database_name |
Character varying(1024) |
The source endpoint name. |
endpoint_type_id |
Integer |
The endpoint Type ID. |
aem_target_database
The aem_target_database table provides information about the target endpoint used in a specific task.
Column | Data Type | Description |
---|---|---|
task_id |
Integer |
The task run ID. |
target_database_id |
Integer |
The target endpoint ID. |
target_database_name |
Character varying(1024) |
The target endpoint name. |
endpoint_type_id |
Integer |
The target endpoint type ID. |
aem_target_processes
The aem_target_processes table is used internally to monitor the Enterprise Manager Analytics processes such as the Collector and the Purger.
aem_task
This is a fact table for each run or instance of a Replicate task.
Column | Data Type | Description |
---|---|---|
task_id |
Integer |
The task run ID. |
target_database_id |
Integer |
The target endpoint ID. |
target_database_name |
Character varying(1024) |
Target endpoint name. |
endpoint_type_id |
Integer |
The endpoint type ID. |
retrieval_time |
Timestamp without time zone |
The time that the information about the task was retrieved. |
server_id |
Integer |
The Replicate server ID. |
task_name_id |
Integer |
The task name ID. |
task_state_id |
Integer |
The task state ID. |
task_stop_reason_id |
Integer |
The task stop Reason ID. |
task_profile_id |
Integer |
The task profile ID. |
cdc_evt_applied_insert_count |
Integer |
The number of INSERTs since the last retrieval time or since the task was started. |
cdc_evt_applied_update_count |
Integer |
The number of UPDATEs since the last retrieval time or since the task was started. |
cdc_evt_applied_delete_count |
Integer |
The number of DELETEs since the last retrieval time or since the task was started. |
cdc_evt_applied_ddl_count |
Integer |
The number of DDLs since the last retrieval time or since the task was started. |
full_load_tables_completed_count |
Integer |
The number of completed tables loaded to the Target at retrieval time. |
full_load_tables_loading_count |
Integer |
The number of tables being loaded to the target at retrieval time. |
full_load_tables_queued_count |
Integer |
The number of tables waiting to be loaded to the target at retrieval time. |
full_load_tables_with_error_count |
Integer |
The number of tables that could not be loaded to the target at retrieval time due to an error. |
full_load_total_records_transferred |
Integer |
The total number of records that have completed loading to the target at retrieval time. |
full_load_est_records_count_for_all_tables |
Integer |
The estimated number of records to be loaded to the target. |
full_load_completed |
Integer |
Indicates if the Full Load has completed. |
full_load_start |
Timestamp without time zone |
The start time of the Full Load. |
full_load_finish |
Timestamp without time zone |
The finish time of the Full Load. |
full_load_thrput_src_thrput_records_count |
Integer |
The Full Load source throughput (in rec/sec) at retrieval time. |
full_load_thrput_src_thrput_volume |
Integer |
The Full Load source throughput (in kb/sec) at retrieval time. |
full_load_thrput_trg_thrput_records_count |
Integer |
The Full Load target throughput (in rec/sec) at retrieval time. |
full_load_thrput_trg_thrput_volume |
Integer |
The Full Load target throughput (in kb/sec) at retrieval time. |
cdc_thrput_src_thrput_records_count |
Integer |
The Change Processing source throughput (in rec/sec) at retrieval time. |
cdc_thrput_src_thrput_volume |
Integer |
The Change Processing source throughput (in kb/sec) at retrieval time. |
cdc_thrput_trg_thrput_records_count |
Integer |
The Change Processing target throughput (in rec/sec) at retrieval time. |
cdc_thrput_trg_thrput_volume |
Integer |
The Change Processing target throughput (in kb/sec) at retrieval time |
cdc_trans_read_rollback_count |
Integer |
The number of ROLLBACK transactions since the last retrieval time or since the task was started. |
cdc_trans_read_records_rollback_count |
Integer |
The number of ROLLBACK change records since the last retrieval time or since the task was started. |
cdc_trans_rollback_change_volume |
Integer |
The volume of ROLLBACK changes (in bytes). |
cdc_trans_applied_transactions_in_progress_count |
Integer |
The number of transactions in progress at retrieval time. |
cdc_trans_applied_records_in_progress_count |
Integer |
The number of records for all transactions in progress at retrieval time. |
cdc_trans_applied_comitted_transaction_count |
Integer |
The number of transactions committed since the last retrieval time or since the task was started. |
cdc_trans_applied_records_comitted_count |
Integer |
The number of records for all committed transactions since the last retrieval time or since the task was started. |
cdc_trans_applied_volume_comitted |
Integer |
The volume of change for all committed transactions (in bytes) since the last retrieval time or since the task was started. |
cdc_trans_read_memory_events_count |
Integer |
The number of changes accumulated in memory until source commit at retrieval time. |
cdc_trans_read_swapped_events_count |
Integer |
The number of changes accumulated on disk until source commit at retrieval time. |
cdc_trans_applied_memory_events_count |
Integer |
The number of changes in memory during apply and until target commit at retrieval time |
cdc_trans_applied_swap_events_count |
Integer |
The number of changes on disk during apply and until target commit at retrieval time. |
cdc_source_latency |
Integer |
The average time gap between the original change in the source endpoint and capturing it (in seconds) at retrieval time. |
cdc_apply_latency |
Integer |
The overall latency (in seconds) at retrieval time. |
memory_usage_kb |
Integer |
The memory usage for the task (in kilobytes) at retrieval time. |
disk_usage_kb |
Integer |
The utilization of disk space for the task (in kilobytes) at retrieval time. |
cpu_percentage |
Integer |
The CPU consumption of the task (as a percentage of server CPU with a value of 0-100) at retrieval time. |
data_error_count |
Integer |
The total number of data errors at retrieval time for all tables involved in a task. |
task_option_full_load_enabled |
Integer |
Indicates if Full Load is enabled. |
task_option_apply_changes_enabled |
Integer |
Indicates if Apply Changes is enabled. |
task_option_store_changes_enabled |
Integer |
Indicates if Store Changes is enabled. |
task_option_audit_changes_enabled |
Integer |
Indicates if Audit Changes is enabled. |
task_option_recovery_enabled |
Integer |
Indicates if Recovery is enabled. |
server_cpu_percentage |
Integer |
The CPU percentage being utilized by the Replicate server process running on the server machine at retrieval time. |
machine_cpu_percentage |
Integer |
The total CPU percentage being utilized by all processes (i.e. not just Qlik processes) running on the server machine at retrieval time. |
tasks_cpu_percentage |
Integer |
The total CPU percentage being utilized by all Replicate task processes running on the server machine at retrieval time. |
aem_task_name
The aem_task_name table provides lookup information about the task name.
Column | Data Type | Description |
---|---|---|
server_id |
Integer |
The Replicate server ID. |
task_name_id |
Integer |
The task name ID. |
task_name |
Character varying(1024) |
The task name. |
aem_task_previous_metrics
Used internally to calculate differences between the current run of a task and the previous run.
aem_task_profile
The aem_task_profile table provides lookup information about the task profile.
Column | Data Type | Description |
---|---|---|
task_profile_id |
Integer |
The task profile ID. |
name |
Character varying(1024) |
The task profile name. |
description |
Character varying(1024) |
The task profile description. |
aem_task_state
The aem_task_state table provides lookup information about the task state.
Column | Data Type | Description |
---|---|---|
task_state_id |
Integer |
The task state ID. |
name |
Character varying(1024) |
The task state dame. |
description |
Character varying(1024) |
The task state description. |
aem_task_stop_reason
The aem_task_stop_reason table provides lookup information about the reason that a task stopped.
Column | Data Type | Description |
---|---|---|
task_stop_reason_id |
Integer |
The stop reason ID. |
name |
Character varying(1024) |
The stop reason name. |
description |
Character varying(1024) |
The stop reason description. |
Sample Queries
The following sample queries demonstrate how useful information can be extracted from the schema.
Query 1:
Provides a sampling of Full Load and Change Processing metrics for all tasks with a source endpoint named "Teradata DB" for the month of August 2017.
SELECT AEM_SERVER.SERVER_NAME AS "Server" ,AEM_TASK_NAME.TASK_NAME AS "Task" ,AVG(AEM_TASK.MEMORY_USAGE_KB / 1024::float) AS "Avg Memory" ,AVG(AEM_TASK.DISK_USAGE_KB) / 1024::float AS "Avg IO" ,MAX((CASE WHEN AEM_TASK.FULL_LOAD_TABLES_COMPLETED_COUNT IS NULL THEN 0 ELSE AEM_TASK.FULL_LOAD_TABLES_COMPLETED_COUNT END)+(CASE WHEN AEM_TASK.FULL_LOAD_TABLES_LOADING_COUNT IS NULL THEN 0 ELSE AEM_TASK.FULL_LOAD_TABLES_LOADING_COUNT END)+(CASE WHEN AEM_TASK.FULL_LOAD_TABLES_QUEUED_COUNT IS NULL THEN 0 ELSE AEM_TASK.FULL_LOAD_TABLES_QUEUED_COUNT END)+(CASE WHEN AEM_TASK.FULL_LOAD_TABLES_WITH_ERROR_COUNT IS NULL THEN 0 ELSE AEM_TASK.FULL_LOAD_TABLES_WITH_ERROR_COUNT END)) AS "Full Load Total Tables" ,MAX(FULL_LOAD_TOTAL_RECORDS_TRANSFERRED) AS "Full Load Total Records" ,MAX(AEM_TASK.FULL_LOAD_FINISH-AEM_TASK.FULL_LOAD_START) AS "Full Load Max Load Duration" ,AVG(AEM_TASK.FULL_LOAD_FINISH-AEM_TASK.FULL_LOAD_START) AS "Full Load Avg Load Duration" ,AVG((CASE WHEN AEM_TASK.FULL_LOAD_THRPUT_TRG_THRPUT_RECORDS_COUNT IS NULL THEN 0 ELSE AEM_TASK.FULL_LOAD_THRPUT_TRG_THRPUT_RECORDS_COUNT END)) AS "Full Load Avg Target Throughput Records per Second" ,SUM((CASE WHEN AEM_TASK.CDC_EVT_APPLIED_INSERT_COUNT IS NULL THEN 0 ELSE AEM_TASK.CDC_EVT_APPLIED_INSERT_COUNT END)+(CASE WHEN AEM_TASK.CDC_EVT_APPLIED_UPDATE_COUNT IS NULL THEN 0 ELSE AEM_TASK.CDC_EVT_APPLIED_UPDATE_COUNT END)+(CASE WHEN AEM_TASK.CDC_EVT_APPLIED_DELETE_COUNT IS NULL THEN 0 ELSE AEM_TASK.CDC_EVT_APPLIED_DELETE_COUNT END)+(CASE WHEN AEM_TASK.CDC_EVT_APPLIED_DDL_COUNT IS NULL THEN 0 ELSE AEM_TASK.CDC_EVT_APPLIED_DDL_COUNT END)) AS "CDC Total Applied Changes" ,SUM((CASE WHEN AEM_TASK.CDC_TRANS_APPLIED_COMITTED_TRANSACTION_COUNT IS NULL THEN 0 ELSE AEM_TASK.CDC_TRANS_APPLIED_COMITTED_TRANSACTION_COUNT END)) AS "CDC Total Applied Transactions" ,AVG((CASE WHEN AEM_TASK.CDC_THRPUT_TRG_THRPUT_RECORDS_COUNT IS NULL THEN 0 ELSE AEM_TASK.CDC_THRPUT_TRG_THRPUT_RECORDS_COUNT END)) AS "CDC Avg Target Throughput Records per Second" ,AVG((CASE WHEN AEM_TASK.CDC_APPLY_LATENCY IS NULL THEN 0 ELSE AEM_TASK.CDC_APPLY_LATENCY END)) AS "CDC Avg Apply Latency" FROM AEM_TASK INNER JOIN AEM_SERVER ON (AEM_TASK.SERVER_ID = AEM_SERVER.SERVER_ID) INNER JOIN AEM_TASK_NAME ON (AEM_TASK.TASK_NAME_ID = AEM_TASK_NAME.TASK_NAME_ID) INNER JOIN AEM_SOURCE_DATABASE ON (AEM_TASK.ID = AEM_SOURCE_DATABASE.TASK_ID) WHERE (AEM_TASK.RETRIEVAL_TIME >= TIMESTAMP WITH TIME ZONE '2017-08-01 00:00:00.00000+00') AND (AEM_TASK.RETRIEVAL_TIME < TIMESTAMP WITH TIME ZONE '2017-09-01 00:00:00.00000+00') AND (AEM_SOURCE_DATABASE.SOURCE_DATABASE_NAME = 'Teradata DB') GROUP BY AEM_SERVER.SERVER_NAME ,AEM_TASK_NAME.TASK_NAME
Query 2:
Provides a trend throughout the day of the number of Completed, Queued, Loading and Error tables related to Full Load for the month of August 2017 and for a Replicate server named "rep-server1-prod".
SELECT SUM((CASE WHEN AEM_TASK.FULL_LOAD_TABLES_COMPLETED_COUNT IS NULL THEN 0 ELSE AEM_TASK.FULL_LOAD_TABLES_COMPLETED_COUNT END)) AS "Full Load Tables Completed" ,SUM((CASE WHEN AEM_TASK.FULL_LOAD_TABLES_LOADING_COUNT IS NULL THEN 0 ELSE AEM_TASK.FULL_LOAD_TABLES_LOADING_COUNT END)) AS "Full Load Tables Loading" ,SUM((CASE WHEN AEM_TASK.FULL_LOAD_TABLES_QUEUED_COUNT IS NULL THEN 0 ELSE AEM_TASK.FULL_LOAD_TABLES_QUEUED_COUNT END)) AS "Full Load Tables Queued" ,SUM((CASE WHEN AEM_TASK.FULL_LOAD_TABLES_WITH_ERROR_COUNT IS NULL THEN 0 ELSE AEM_TASK.FULL_LOAD_TABLES_WITH_ERROR_COUNT END)) AS "Full Load Tables Error" ,AEM_TASK.RETRIEVAL_TIME AS "Date and Time" FROM AEM_TASK INNER JOIN AEM_SERVER ON (AEM_TASK.SERVER_ID = AEM_SERVER.SERVER_ID) WHERE (AEM_TASK.RETRIEVAL_TIME >= TIMESTAMP WITH TIME ZONE '2017-08-01 00:00:00.00000+00') AND (AEM_TASK.RETRIEVAL_TIME < TIMESTAMP WITH TIME ZONE '2017-09-01 00:00:00.00000+00') AND (AEM_SERVER.SERVER_NAME IN ('rep-server1-prod')) GROUP BY AEM_TASK.RETRIEVAL_TIME ORDER BY "Date and Time" ASC
Query 3:
Provides an daily trend of Average Throughput and Latency metrics related to Change Processing for the month of August 2017 and for a Replicate server named "rep-server1-prod".
SELECT AVG((CASE WHEN AEM_TASK.CDC_THRPUT_SRC_THRPUT_RECORDS_COUNT IS NULL THEN 0 ELSE AEM_TASK.CDC_THRPUT_SRC_THRPUT_RECORDS_COUNT END)) AS "CDC Avg Source Throughput Records" ,AVG((CASE WHEN AEM_TASK.CDC_THRPUT_TRG_THRPUT_RECORDS_COUNT IS NULL THEN 0 ELSE AEM_TASK.CDC_THRPUT_TRG_THRPUT_RECORDS_COUNT END)) AS "CDC Avg Target Throughput Records" ,AVG((CASE WHEN AEM_TASK.CDC_SOURCE_LATENCY IS NULL THEN 0 ELSE AEM_TASK.CDC_SOURCE_LATENCY END)) AS "CDC Avg Source Latency" ,AVG((CASE WHEN AEM_TASK.CDC_APPLY_LATENCY IS NULL THEN 0 ELSE AEM_TASK.CDC_APPLY_LATENCY END)) AS "CDC Avg Apply Latency" ,EXTRACT(YEAR from AEM_TASK.RETRIEVAL_TIME) AS "Year" ,EXTRACT(MONTH from AEM_TASK.RETRIEVAL_TIME) AS "Month" ,EXTRACT(DAY from AEM_TASK.RETRIEVAL_TIME) AS "Day" FROM AEM_TASK INNER JOIN AEM_SERVER ON (AEM_TASK.SERVER_ID = AEM_SERVER.SERVER_ID) WHERE (AEM_TASK.RETRIEVAL_TIME >= TIMESTAMP WITH TIME ZONE '2017-08-01 00:00:00.00000+00') AND (AEM_TASK.RETRIEVAL_TIME < TIMESTAMP WITH TIME ZONE '2017-09-01 00:00:00.00000+00') AND (AEM_SERVER.SERVER_NAME IN ('rep-server1-prod')) GROUP BY EXTRACT(YEAR from AEM_TASK.RETRIEVAL_TIME) ,EXTRACT(MONTH from AEM_TASK.RETRIEVAL_TIME) ,EXTRACT(DAY from AEM_TASK.RETRIEVAL_TIME) ORDER BY "Year", "Month", "Day"