Combining French air operators with French sales

A complex pipeline including three source datasets and two Join processors.

Before you begin

  • You have previously created a connection to the system storing your source data.

    Here, a Test connection and an Amazon S3 connection.

  • You have previously added the dataset holding your source data.

    Here, the first Left dataset holds aircraft data such as operators, latitudes, and longitudes, and the first Right dataset holds the airline data to be combined with it: operators and countries.

    The second Left dataset holds sales data such as countries, regions, and shipping dates, and the second Right dataset holds the result of the first join operation. The data to be combined is the country data.

  • You have also created the connection and the related dataset that will hold the processed data.

    Here, a Test connection.

Procedure

  1. Click Add pipeline on the Pipelines page. Your new pipeline opens.
  2. Give the pipeline a meaningful name.

    Example

    Join air operator and country data
  3. Click ADD SOURCE to open the panel allowing you to select your Left data, here a list of aircraft.

    Example

    Preview of a data sample about aircraft.
  4. Select your dataset and click Select in order to add it to the pipeline.
    Rename it if needed.
  5. Click Plus and add a Join processor to the pipeline. Another placeholder source appears on the canvas.
  6. Click ADD SOURCE to select your Right dataset, here a list of airlines with operator and country data.

    Example

    Preview of a data sample about airlines.
  7. Open the configuration panel of the Join processor.
  8. Give a meaningful name to the processor.

    Example

    join operators
  9. Select Inner join in the Join type list, as you want the matching records from the left and right datasets to be listed in the result set.
  10. In the Conditions area:
    1. In the Left key list, select or enter the path to the field to be compared in the left dataset (here, .Op).
    2. In the Right key list, select or enter the path to the field to be compared in the right dataset (here, .Op).

      You can use the avpath syntax in this area.

  11. Click Save to save your configuration.

    Look at the preview of the processor to compare your data before and after the join operation.

    Preview of the Join processor after applying the inner join operation.
  12. Click Plus and add a Filter processor to the pipeline. The configuration panel opens.
  13. Give a meaningful name to the processor.

    Example

    filter on FR operators
  14. In the Filters area:
    1. Select .Country in the Input list, as you want to filter operators based on this value.
    2. Select None in the Optionally select a function to apply list, as you do not want to apply a function while filtering records.
    3. Select == in the Operator list and type France in the Value list, as you want to keep only the operators from France.
  15. Click Save to save your configuration.

    Look at the preview of the processor to compare your data before and after the filter operation.

    Preview of the Filter processor after applying a filter on French operators.
  16. Click Plus and add a Join processor to the pipeline. Another placeholder source appears on the canvas.
  17. Click ADD SOURCE to select the dataset to be combined with the existing one, here a list of sales with shipping data.

    Example

    Preview of a data sample about regional sales.
  18. Open the configuration panel of the Join processor.
  19. Give a meaningful name to the processor.

    Example

    join countries
  20. Select Inner join in the Join type list, as you want the matching records from the left and right datasets to be listed in the result set.
  21. In the Conditions area:
    1. In the Left key list, select or enter the path to the field to be compared in the left dataset (here, .Country).
    2. In the Right key list, select or enter the path to the field to be compared in the right dataset (here, .Country).

      You can use the avpath syntax in this area.

  22. Click Save to save your configuration.

    Look at the preview of the processor to compare your data before and after the join operation.

    Preview of the Join processor after applying the inner join operation.
  23. Click the ADD DESTINATION item next to the Join processor and select the dataset that will hold your joined data.
    Here, a Test output dataset is added with the Log records to STDOUT option enabled.
  24. On the top toolbar of Talend Cloud Pipeline Designer, click the Run button to open the panel allowing you to select your run profile.
  25. Select your run profile in the list (for more information, see Run profiles), then click Run to run your pipeline.
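
The logic configured in the procedure above (inner join on .Op, filter on .Country == France, inner join on .Country) can be sketched outside the product in plain Python. This is only an illustration of the join and filter semantics, not how Talend Cloud Pipeline Designer implements them, and the structure of the real output records may differ. The sample records, the inner_join helper, and the field names other than Op and Country (Lat, Long, Region, ShipDate) are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List

Record = Dict[str, Any]


def inner_join(left: List[Record], right: List[Record],
               left_key: str, right_key: str) -> List[Record]:
    """Hypothetical helper: keep only left/right pairs whose key values match."""
    # Index the right dataset by its key so each left record is matched quickly.
    right_index = defaultdict(list)
    for right_record in right:
        right_index[right_record[right_key]].append(right_record)

    joined = []
    for left_record in left:
        # Left records without a match are dropped: this is what makes it an inner join.
        for right_record in right_index.get(left_record[left_key], []):
            # Merge both records; on a name clash the left value is kept (an assumption).
            joined.append({**right_record, **left_record})
    return joined


# Hypothetical sample data standing in for the three source datasets.
aircraft = [
    {"Op": "Air France", "Lat": 48.9, "Long": 2.4},
    {"Op": "Lufthansa", "Lat": 50.0, "Long": 8.6},
    {"Op": "Unlisted Air", "Lat": 40.4, "Long": -3.7},
]
airlines = [
    {"Op": "Air France", "Country": "France"},
    {"Op": "Lufthansa", "Country": "Germany"},
]
sales = [
    {"Country": "France", "Region": "Europe", "ShipDate": "2019-05-01"},
    {"Country": "Spain", "Region": "Europe", "ShipDate": "2019-05-02"},
]

# First Join processor ("join operators"): match on the Op field.
operators = inner_join(aircraft, airlines, "Op", "Op")

# Filter processor ("filter on FR operators"): keep Country == France.
french_operators = [r for r in operators if r["Country"] == "France"]

# Second Join processor ("join countries"): match on the Country field.
result = inner_join(french_operators, sales, "Country", "Country")

for record in result:
    print(record)
```

With this sample data, the operator with no matching airline and the sales record for Spain are both dropped by the inner joins, and the German operator is dropped by the filter.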

Results

Your pipeline is being executed: the French operator data is combined with the French sales data in the generated output. You can look at the logs to see the records generated after the join operations:
Pipeline logs showing the generated records after the join operations.
