Automating a preparation run via API

In addition to the Run feature of the Talend Cloud Data Preparation application, it is possible to run preparations using API calls with little to no configuration.

This means that you can also use Qlik Application Automation or any other third-party tool to schedule and automate your preparation runs.

This example uses an existing preparation called customers_preparation, based on a dataset that contains customer data stored in a database. The preparation applies some formatting operations to the data, and has already been run with a new dataset as its destination. You will now use API calls to run this preparation again and regularly clean incoming data. The following sections describe the four main steps: listing the preparations, launching a run, monitoring the run, and retrieving the run history.

Tip: The run API automatically selects the run configuration of the latest successful run that a user triggered manually from the Talend Cloud Data Preparation interface. A later manual run can therefore change the configuration used by automated API runs. When your preparation is ready to be run via API, it is recommended to duplicate it and add it to a dedicated folder, so that manual runs do not interfere with this preparation.

If you want to look at the documentation of the API endpoints used in this scenario, open the Swagger documentation page that corresponds to your environment and select Talend Data Preparation - Run API. For more information, see Accessing the Talend Data Preparation REST API documentation.

Before you begin

To run a preparation via API, the following conditions must be met:
  • You have access to the preparation as owner or via sharing.
  • You have manually launched the preparation in the Talend Cloud Data Preparation interface at least once.
  • The manual preparation run was successful.
  • The preparation destination is not Direct download.
  • You have access to the destination dataset as owner or via sharing.

Retrieve the preparation ID

The first step is to use an endpoint to list the compatible preparations, and retrieve the id of the preparation that you want to run. The name of the endpoint used for this step is List preparations.

Procedure

  1. Using the GET method, enter the following endpoint:
    https://<tdp_environment>/transform/preparations/automation/preparations

    In this example and the next ones, <tdp_environment> corresponds to the URL of your Talend Cloud Data Preparation instance. For more information on which URL to use depending on your data center, see URLs to Talend Cloud applications.

  2. Send the request.

    The response body looks like this:

    [
      {
        "id": "74604d94-c013-4a58-b3c6-00b0075a35f4",
        "name": "customers_preparation",
        "folder": "preparations"
      }
    ]

    The folder field indicates where the preparation is stored. A value of / means that the preparation is located in the root folder of Talend Cloud Data Preparation, while a preparation stored in a different folder has a path such as /<folder_1>/<subfolder_1>.

Results

Copy the preparation id returned in the response, 74604d94-c013-4a58-b3c6-00b0075a35f4 in this example. You will need it to launch the preparation in the next step. Alternatively, you can retrieve the preparation id from the URL of the preparation when it is open in Talend Cloud Data Preparation.
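As an illustration, the List preparations call and the id lookup could be scripted as follows. This is a minimal sketch using only the Python standard library. The function names and the bearer-token Authorization header are assumptions made for the example: this page does not describe authentication, so check the API documentation for your environment.

```python
import json
import urllib.request


def list_preparations(tdp_environment, token):
    """Call the List preparations endpoint and return the parsed JSON array.

    Assumption: the API accepts a bearer token in the Authorization header.
    """
    url = f"https://{tdp_environment}/transform/preparations/automation/preparations"
    request = urllib.request.Request(
        url, method="GET", headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def find_preparation_id(preparations, name):
    """Return the id of the first preparation with the given name, or None."""
    for preparation in preparations:
        if preparation.get("name") == name:
            return preparation["id"]
    return None
```

For example, find_preparation_id(list_preparations("<tdp_environment>", "<token>"), "customers_preparation") would return the id to use in the next step, or None if no preparation with that name is listed.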

Run the preparation

Using the preparation id previously retrieved, you will call the endpoint used to actually run the preparation. The name of the endpoint used for this step is Run preparation.

Procedure

  1. Using the POST method, enter the following endpoint:
    https://<tdp_environment>/transform/preparations/automation/preparations/<preparation_id>/runs
  2. Send the request.

    The response body looks like this:

    {
      "id": "848df626-1389-40b9-a7ba-5719faf12e86"
    }

Results

The preparation run has been launched. Note that the id returned here is a run id, not a preparation id. Copy the id value, 848df626-1389-40b9-a7ba-5719faf12e86 in this example. You will need it for the monitoring endpoint.
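The Run preparation call could be sketched in Python as shown here. The helper names are hypothetical, the bearer-token header is an assumption not covered on this page, and no request body is sent because the procedure above does not mention one.

```python
import json
import urllib.request


def build_run_request(tdp_environment, preparation_id, token):
    """Build the POST request for the Run preparation endpoint."""
    url = (
        f"https://{tdp_environment}/transform/preparations/automation"
        f"/preparations/{preparation_id}/runs"
    )
    # Assumption: bearer-token authentication; no request body is sent.
    return urllib.request.Request(
        url, method="POST", headers={"Authorization": f"Bearer {token}"}
    )


def launch_run(tdp_environment, preparation_id, token):
    """Launch the preparation and return the run id from the response."""
    request = build_run_request(tdp_environment, preparation_id, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["id"]
```

Separating the request construction from the network call makes it possible to check the URL and method without contacting the server.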

Monitor the preparation run

Now that the run has started, you can use a different endpoint to monitor its status using the run id retrieved during the previous step. The name of the endpoint used for this step is Get run.

Procedure

  1. Using the GET method, enter the following endpoint:
    https://<tdp_environment>/transform/preparations/automation/runs/<run_id>
  2. Send the request.

    The response body looks like this:

    {
      "id": "848df626-1389-40b9-a7ba-5719faf12e86",
      "preparationId": "74604d94-c013-4a58-b3c6-00b0075a35f4",
      "status": "FINISHED",
      "start": "2024-07-25T21:03:49.919Z",
      "duration": "PT41.278S"
    }

    The possible statuses for your run are the following:

    • QUEUEING
    • RUNNING
    • FINISHED
    • ERROR
    • NO_MORE_AVAILABLE_EXECUTOR
    • SEMANTIC_TYPES_UNAVAILABLE

Results

The status field of the response shows that the preparation has finished running. If an error occurs during the run, the response also includes a full log to help you identify the cause of the error.
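Because a run takes some time to complete, an automation script typically polls the Get run endpoint until the status changes. The sketch below shows one way to do this. Treating QUEUEING and RUNNING as the only in-progress statuses, the polling interval, the function names, and the bearer-token header are all assumptions of this example.

```python
import json
import time
import urllib.request

# Assumption: any status other than these two means the run has stopped.
IN_PROGRESS = {"QUEUEING", "RUNNING"}


def get_run(tdp_environment, run_id, token):
    """Call the Get run endpoint and return the parsed JSON object."""
    url = f"https://{tdp_environment}/transform/preparations/automation/runs/{run_id}"
    request = urllib.request.Request(
        url, method="GET", headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_run(fetch_run, poll_seconds=10, max_polls=60, sleep=time.sleep):
    """Call fetch_run() until the run is no longer queued or running.

    Returns the last response, so the caller can inspect the status and
    any error details it contains. Raises TimeoutError after max_polls checks.
    """
    for _ in range(max_polls):
        run = fetch_run()
        if run["status"] not in IN_PROGRESS:
            return run
        sleep(poll_seconds)
    raise TimeoutError("run still in progress after max_polls checks")
```

A call such as wait_for_run(lambda: get_run("<tdp_environment>", "<run_id>", "<token>")) returns the final response. The caller should still check that its status is FINISHED, since ERROR, NO_MORE_AVAILABLE_EXECUTOR, and SEMANTIC_TYPES_UNAVAILABLE also end the loop.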

Get the run history

The preparation has been launched at least once using the API, which means that you can now consult the run history for this specific preparation using the preparation id. The name of the endpoint used for this step is Get run history.

Procedure

  1. Using the GET method, enter the following endpoint:
    https://<tdp_environment>/transform/preparations/automation/preparations/<preparation_id>/runs
  2. Send the request.

    The response body looks like this:

    [
      {
        "id": "848df626-1389-40b9-a7ba-5719faf12e86",
        "preparationId": "74604d94-c013-4a58-b3c6-00b0075a35f4",
        "status": "FINISHED",
        "start": "2024-07-25T21:03:49.919Z",
        "duration": "PT41.278S"
      }
    ]

Results

You can see a summary of the last runs and their statuses. The response only shows one run in this example, but will include more when you have run a preparation several times.

The history retrieved in the response only shows runs launched using the API. Conversely, the Run history in the Talend Cloud Data Preparation interface only shows runs launched manually in the application.
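The history response can be post-processed, for example to report how long each run took. The sketch below assumes that the duration field always uses the ISO 8601 time-only form seen in the example (such as PT41.278S). The function names are hypothetical, and the input is the JSON array already parsed into Python objects.

```python
import re
from datetime import timedelta

# Matches ISO 8601 durations limited to hours, minutes, and seconds,
# such as PT41.278S or PT1M5S. Longer units (days and above) are not handled.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$")


def parse_duration(value):
    """Convert a duration string such as 'PT41.278S' to a timedelta."""
    match = _DURATION.match(value)
    if match is None:
        raise ValueError(f"unsupported duration format: {value!r}")
    hours, minutes, seconds = match.groups()
    return timedelta(
        hours=int(hours or 0),
        minutes=int(minutes or 0),
        seconds=float(seconds or 0),
    )


def summarize_history(runs):
    """Return one (run id, status, duration in seconds) tuple per run."""
    return [
        (run["id"], run["status"], parse_duration(run["duration"]).total_seconds())
        for run in runs
    ]
```

Applied to the response above, summarize_history produces a single tuple containing the run id, the FINISHED status, and the duration in seconds.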
