Processing an Azure Synapse table and loading it into Azure Blob Storage

This scenario is intended to help you set up and use connectors in a pipeline. Adapt it to your own environment and use case.

Example of a pipeline created from the instructions below.

Procedure

  1. Click Connections > Add connection.
  2. In the panel that opens, select the type of connection you want to create.

    Example

    Synapse
  3. Select your engine in the Engine list.
    Note:
    • For advanced data processing, use the Remote Engine Gen2 rather than the Cloud Engine for Design.
    • If no Remote Engine Gen2 has been created from Talend Management Console, or if one exists but appears as unavailable (meaning it is not up and running), you cannot select a connection type in the list or save the new connection.
    • The list of available connection types depends on the engine you have selected.
  4. Select the type of connection you want to create.
    Here, select Database.
  5. Fill in the connection properties to access your Azure Synapse database as described in Azure Synapse properties, check the connection, then click Add dataset.
  6. In the Add a new dataset panel, name your dataset. In this example, the table contains taxi location data.

    Example

    Azure Synapse geography table
  7. Fill in the required properties to access the table located in your database and click View sample to see a preview of your dataset sample.
    Configuration of a new Azure Synapse dataset.
  8. Click Validate to save your dataset.
  9. Do the same to add the Azure Blob container that will be used as a destination in your pipeline. Fill in the connection properties as described in Azure Blob Storage properties.
    Configuration of a new Azure Blob dataset.
    In this example, the pipeline destination is a CSV file containing taxi location data, stored in the talend dir folder of an Azure Blob container called talend-blob. You can see your container directories on the Storage Explorer page of your Azure Storage Account.
    The CSV file in the Storage Explorer page.
  10. Click Add pipeline on the Pipelines page. Your new pipeline opens.
  11. Give the pipeline a meaningful name.

    Example

    From Azure Synapse table to Azure Blob - load table
  12. Click ADD SOURCE and, in the panel that opens, select your source dataset, Azure Synapse geography table.
  13. Click add processor to add processors to the pipeline, for example a Field selector to select specific fields and give them meaningful names, or an Aggregate processor to list and group the records.
  14. Click the ADD DESTINATION item on the pipeline to open the panel in which you select the Azure Blob file that your output data will be loaded into.
  15. Give the destination a meaningful name, for example load in Azure Blob Storage.
  16. In the Configuration tab of the destination, click Advanced and enter a prefix for the name of the blob that is created when the pipeline is executed.
  17. Click Save to save your configuration.
  18. (Optional) Click the last processor to preview the processed data.
  19. On the top toolbar of Talend Cloud Pipeline Designer, click the Run button to open the panel allowing you to select your run profile.
  20. Select your run profile in the list (for more information, see Run profiles), then click Run to run your pipeline.

Results

Your pipeline is being executed. The taxi location data stored in Azure Synapse is aggregated per city, and the output flow is sent to the Azure Blob target file you defined.
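
The data flow performed by this pipeline can be sketched outside Talend Cloud Pipeline Designer. The Python example below is only an illustration under stated assumptions, not how the product implements its processors: the sample rows, the field names (trip_id, city, latitude, longitude) and the helper functions are all hypothetical. It mimics a Field selector that keeps and renames fields, an Aggregate step that counts records per city, and the CSV serialization of the result.

```python
import csv
import io
from collections import Counter

# Hypothetical rows standing in for the Azure Synapse geography table.
# The real table schema is not described in this scenario.
SAMPLE_ROWS = [
    {"trip_id": 1, "city": "Paris", "latitude": 48.8566, "longitude": 2.3522},
    {"trip_id": 2, "city": "Lyon", "latitude": 45.7640, "longitude": 4.8357},
    {"trip_id": 3, "city": "Paris", "latitude": 48.8606, "longitude": 2.3376},
]


def select_fields(rows, mapping):
    """Keep only the fields listed in mapping and rename them (old -> new)."""
    return [{new: row[old] for old, new in mapping.items()} for row in rows]


def aggregate_per_city(rows):
    """Group the records by city and count how many fall in each group."""
    counts = Counter(row["city"] for row in rows)
    return [
        {"city": city, "record_count": count}
        for city, count in sorted(counts.items())
    ]


def to_csv(rows):
    """Serialize a non-empty list of dicts as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    selected = select_fields(SAMPLE_ROWS, {"trip_id": "trip", "city": "city"})
    print(to_csv(aggregate_per_city(selected)), end="")
```

Running the script prints a header line followed by one line per city with its record count. In the actual pipeline, the processors and the Azure Blob destination you configured perform the equivalent work.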
