Appending data to the index cluster

You can append data from the data sources to the Qlik Associative Big Data Index cluster using the Data Append REST API. The deployment contains shell scripts that simplify startup of the required services.

When you append data in Qlik Associative Big Data Index, it is automatically detected by the Qlik Sense Engine Service. This happens within five minutes after the new data is available to the QSL Manager. The new data is applied the next time you make a selection in Qlik Sense.

Setting QSL Manager data update frequency

If you want the QSL Manager to check for new data more frequently, you can edit the idxltUpdateCheckFreqMS setting in qsl_registry_config.json. The default value is 300000 milliseconds (five minutes). Do the following:
  1. If QSL services are running, stop QSL services.
  2. Delete qsl_registry_config.json from /output/config/qsl_processor.
  3. Edit /dist/runtime/config/template/qsl_registry_config.json and change the idxltUpdateCheckFreqMS setting.
  4. Restart QSL services.

When QSL services are restarted, an updated copy of qsl_registry_config.json is generated in /output/config/qsl_processor.
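
The edit in step 3 can be scripted. The following is a minimal sketch, not a supported tool. It assumes that the setting appears in the template on a single line as a plain "idxltUpdateCheckFreqMS": <number> pair (the actual layout of the file is not shown in this topic) and that a sed with the -E option is available. The helper name set_update_check_freq is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the numeric value of idxltUpdateCheckFreqMS
# in the given JSON file. Assumes the key and its value sit on one line, as in
#   "idxltUpdateCheckFreqMS": <number>
set_update_check_freq() {
    file=$1
    ms=$2
    # Reject anything that is not a non-negative integer.
    case $ms in
        ''|*[!0-9]*) echo "interval must be an integer (ms)" >&2; return 1 ;;
    esac
    tmp="${file}.tmp.$$"
    # Write to a temporary file first so the original is replaced only
    # if sed succeeds.
    sed -E "s/(\"idxltUpdateCheckFreqMS\"[[:space:]]*:[[:space:]]*)[0-9]+/\1${ms}/" \
        "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example: check every 30 seconds. Stop QSL services and delete the
# generated copy (steps 1 and 2) before running this, then restart.
# set_update_check_freq /dist/runtime/config/template/qsl_registry_config.json 30000
```

Only the number after the key is changed; every other line in the file is left as it is.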

Preparing to append data

Before you can perform incremental updates, the following services need to be configured and running:

  • Index Maintenance Service

    The Index Maintenance Service is started along with other indexing services when start_indexing_env.sh is executed, and configured when you register the schema. You can check that it is running with the task_manager.sh script.

  • Data append REST gateway

    You also need to start the data append REST gateway before you can append data.

Starting the data append REST gateway

The Qlik Associative Big Data Index REST API provides a gateway to access the public RPCs of the Index Maintenance Service. Before you can notify the indexing cluster that appended data exists, you need to start the gateway with the start_rest_service.sh script located in the /home/ubuntu/dist/runtime/scripts/rest_service folder.

The start_rest_service.sh script does not have any mandatory options. If you execute it without options, it uses the settings from the indexing configuration files. Any options that you specify override the corresponding configuration file settings.

Example: Basic call

./start_rest_service.sh

Short option  Long option         Description
-h            --help              Print help for the script.
-b            --clusterconfig
-i            --indexingsettings  Specify the file path of the indexing settings file. If not specified, the default value is runtime/config/indexing_setting.json.
-u            --useip
-k            --killall           Kill any existing gateway process before starting the service.
-t            --tls               Use a secure (TLS) connection.
-a            --acceptLicense     Accept the Qlik User License Agreement (QULA). The default value is read from the environment variable ACCEPT_QULA.
-s            --keyfile           Key file path that is used when TLS is enabled. If not specified, the file {output_root_folder}/config/keys/server.key is created and used.
-f            --certfile          Certificate file path that is used when TLS is enabled. If not specified, the file {output_root_folder}/config/keys/server.crt is created and used.
              --local             Start all services locally.
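
Example: Combining options

As an illustration, the sketch below restarts the gateway over TLS with an explicit key and certificate. The key and certificate paths are placeholders, and the sketch assumes that --keyfile and --certfile take their value as the following argument, which this topic does not state. The helper name start_gateway_tls and the DRY_RUN variable are hypothetical. By default the helper only prints the command so that you can review it.

```shell
#!/bin/sh
# Placeholder paths: substitute your own key and certificate files.
KEY_FILE=/path/to/server.key
CERT_FILE=/path/to/server.crt

# Hypothetical helper: build the gateway command line.
# With DRY_RUN=1 (the default) the command is printed, not executed.
# Set DRY_RUN=0 and run from the rest_service folder to start the gateway.
start_gateway_tls() {
    # Assumption: long options take their value as the next argument.
    set -- ./start_rest_service.sh --killall --tls \
        --keyfile "$KEY_FILE" --certfile "$CERT_FILE"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_gateway_tls
```

Including --killall stops any gateway process that is already running, so the new TLS settings take effect on restart.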

Updating the index with appended data

If you want to notify the indexing cluster that there are new parquet files available in the data source, you need to execute two Data Append REST API calls. In the examples here we use curl as a client.

In this example, we add a parquet file stored as /mnt/efs/<user>/data/factor_1/customer.table/customer.parquet/part-00001-af10af29-14b6-425b-8f0f-9ff241684652-c000.snappy.parquet, using the schema tpch1.

  1. Add the parquet file.

    curl -k -X POST -H "Content-Type: application/json" -d "{\"file_path\":\"/mnt/efs/<user>/data/factor_1/customer.table/customer.parquet/part-00001-af10af29-14b6-425b-8f0f-9ff241684652-c000.snappy.parquet\"}" "https://localhost:8080/v1/bdi/idxmaint/files/add"

    Repeat this call if you have more files available.

  2. Trigger an index update.

    curl -k -X POST -H "Content-Type: application/json" -d "{\"schema_name\":\"tpch1\"}" "https://localhost:8080/v1/bdi/idxmaint/files/trigger"

You can add Data Append REST API endpoint calls like these in your data pipeline flow, or execute them manually.
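
For a pipeline step that appends several files at once, the two calls above can be wrapped in a small script. The following is a sketch under stated assumptions: the helper names post_json and append_files and the BASE_URL variable are hypothetical, the gateway is assumed to listen on https://localhost:8080 as in the examples above, and file paths are assumed to contain no double quotes or backslashes because no JSON escaping is done.

```shell
#!/bin/sh
# Base URL of the data append REST gateway, taken from the examples above.
BASE_URL="${BASE_URL:-https://localhost:8080/v1/bdi/idxmaint/files}"

# POST a JSON body ($2) to an endpoint ($1) under BASE_URL.
# -k skips certificate verification, as in the examples above;
# -f makes curl return a non-zero status on HTTP errors.
post_json() {
    curl -k -f -s -S -X POST -H "Content-Type: application/json" \
        -d "$2" "$BASE_URL/$1"
}

# Hypothetical helper: add each file, then trigger one index update.
# Usage: append_files SCHEMA FILE...
append_files() {
    schema=$1
    shift
    for file in "$@"; do
        # Stop at the first failed call so the update is not triggered
        # for a partially registered batch.
        post_json add "$(printf '{"file_path":"%s"}' "$file")" || return 1
    done
    post_json trigger "$(printf '{"schema_name":"%s"}' "$schema")"
}

# Example:
# append_files tpch1 /path/to/part-00001.snappy.parquet /path/to/part-00002.snappy.parquet
```

The update is triggered once, after all files have been added, which matches the order of the two manual calls.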

For more information about the Data Append REST API, see Qlik Associative Big Data Index REST API for index maintenance.
