Publishing a message to an Apache Pulsar topic
This scenario helps you set up and use connectors in a pipeline. Adapt it to your environment and use case.
Procedure
- Click Connections > Add connection.
- In the panel that opens, select the type of connection you want to create.
Example
data generator
- Select your engine in the Engine list.
Note:
- It is recommended to use the Remote Engine Gen2 rather than the Cloud Engine for Design for advanced processing of data.
- If no Remote Engine Gen2 has been created from Talend Management Console, or if one exists but appears as unavailable (meaning it is not up and running), you cannot select a Connection type in the list or save the new connection.
- The list of available connection types depends on the engine you have selected.
- Select the type of connection you want to create. Here, select Data generator.
- Click Add dataset and fill in the dataset properties as described in Data generator properties.
- In the Add a new dataset panel, name your dataset.
Example
customer generated data
- Fill in the properties to generate the test customer data of your choice. In this example:
- In the Rows field, type in 100 as you want to generate 100 test records.
- Click Add field, type in firstname in the Name field of the element, select First Name in the Type list and type in 0 in the Blank % field as you want to generate random first names with no empty fields.
- Click Add field, type in lastname in the Name field of the element, select Last Name in the Type list and type in 0 in the Blank % field as you want to generate random last names with no empty fields.
- Click Add field, type in age in the Name field of the element, select Age in the Type list, type in 18 in the Min field and 99 in the Max field and type in 0 in the Blank % field, as you want to generate ages between 18 and 99 with no empty fields.
- Click Connections > Add connection.
- Select the type of connection you want to create. Here, select Apache Pulsar.
- Fill in the connection properties to safely access your Apache Pulsar broker as described in Apache Pulsar properties, check the connection and click Add dataset.
- In the Add a new dataset panel, name your dataset. In this example, the customer-age topic, which is currently empty, is used to publish the processed customer data.
- Name your dataset, Customers on Pulsar for example.
- Click Validate to save your dataset.
- Click Add pipeline on the Pipelines page. Your new pipeline opens.
- Give the pipeline a meaningful name.
Example
From Data generator to Pulsar - publish msg to Pulsar
- Click ADD SOURCE and select your source dataset, customer generated data, in the panel that opens.
- Add a Type converter processor to the pipeline to change the data type of the age field so that you can perform calculations on its values. The configuration panel opens.
- Give a meaningful name to the processor.
Example
convert age data type
- In the Converters area:
- Select .age in the Field path list as you want to change the data type of the values of these specific records.
- Select Double in the Output type list as you want to change the data type from Integer to Double.
- Click Save to save your configuration.
- (Optional) Look at the preview of the processor to see the data after the type conversion.
- Add an Aggregate processor to the pipeline to calculate the average age of customers. The configuration panel opens.
- Give a meaningful name to the processor.
Example
calculate average age
- In the Operations area:
- Select .age in the Field path list as you want to calculate the average value of these specific records.
- Select Average in the Operation list.
- Enter avg_age in the Output field name field to name the newly generated field.
- Click Save to save your configuration.
- (Optional) Look at the preview of the processor to see the data after the aggregation operation.
- Click the ADD DESTINATION item on the pipeline to open the panel that lets you select the Apache Pulsar topic in which your output data will be loaded, Customers on Pulsar.
- In the Configuration tab of the destination, check the Producer name and select the topic in which the data will be loaded.
- On the top toolbar of Talend Cloud Pipeline Designer, click the Run button to open the panel allowing you to select your run profile.
- Select your run profile in the list (for more information, see Run profiles), then click Run to run your pipeline.
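To make the data flow of this procedure easier to picture, the following Python sketch reproduces the same three stages locally: generating 100 test customers with ages between 18 and 99, converting the age field to a double, and calculating the average into an avg_age field. It is only an illustration of the logic, not how Talend Cloud Pipeline Designer runs the pipeline. The name pools and the function names are hypothetical; the Data generator connector produces its own values.

```python
import random
from statistics import mean

# Hypothetical name pools; the Data generator connector uses its own values.
FIRST_NAMES = ["Ada", "Grace", "Linus", "Margaret", "Alan"]
LAST_NAMES = ["Lovelace", "Hopper", "Torvalds", "Hamilton", "Turing"]


def generate_customers(rows=100, min_age=18, max_age=99, seed=None):
    """Build test records with firstname, lastname and an integer age, no blanks."""
    rng = random.Random(seed)
    return [
        {
            "firstname": rng.choice(FIRST_NAMES),
            "lastname": rng.choice(LAST_NAMES),
            "age": rng.randint(min_age, max_age),  # both bounds are inclusive
        }
        for _ in range(rows)
    ]


def convert_age_to_double(records):
    """Mimic the Type converter step: Integer age becomes a Double (Python float)."""
    return [{**record, "age": float(record["age"])} for record in records]


def average_age(records):
    """Mimic the Aggregate step: a single output record holding avg_age."""
    return {"avg_age": mean(record["age"] for record in records)}


customers = generate_customers(seed=42)
result = average_age(convert_age_to_double(customers))
print(result)
```

The output is a single record with one avg_age field, which corresponds to the record the pipeline sends to its destination.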
Results
Your pipeline is being executed. The average age is calculated from the generated customer data and the output flow is sent to the Apache Pulsar topic you have defined.
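For comparison, publishing a single record to a Pulsar topic from code could look like the sketch below, which uses the Apache Pulsar Python client (the pulsar-client package). It rests on several assumptions: the JSON encoding of the payload, the service URL and the helper function names are illustrative and do not describe how the Apache Pulsar connector serializes or sends data.

```python
import json


def encode_record(record):
    """Serialize a record to UTF-8 JSON bytes (assumed payload format)."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def publish_record(producer, record):
    """Send one record through any object that exposes send(bytes)."""
    payload = encode_record(record)
    producer.send(payload)
    return payload


def publish_to_pulsar(service_url, topic, record, producer_name=None):
    """Publish one record with the pulsar-client package (pip install pulsar-client)."""
    import pulsar  # imported lazily so the helpers above work without the package

    client = pulsar.Client(service_url)
    try:
        producer = client.create_producer(topic, producer_name=producer_name)
        return publish_record(producer, record)
    finally:
        client.close()  # closes the producer and releases the connection
```

With a broker reachable at a hypothetical address such as pulsar://localhost:6650, calling publish_to_pulsar("pulsar://localhost:6650", "customer-age", {"avg_age": 58.5}) would send one message to the customer-age topic. The value 58.5 is only a placeholder, not a result of this scenario.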
What to do next
Once the event is published, you can consume the Pulsar message in another pipeline and use it as a source dataset.
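Outside Pipeline Designer, one way to check that a message reached the topic is to read it back with the Apache Pulsar Python client (the pulsar-client package). This is a minimal sketch under assumptions: it presumes the payload is UTF-8 JSON, which may not match the format configured in your dataset, and the service URL, subscription name and function names are hypothetical.

```python
import json


def decode_message(payload):
    """Decode UTF-8 JSON bytes back into a record (assumed payload format)."""
    return json.loads(payload.decode("utf-8"))


def consume_one(service_url, topic, subscription_name, timeout_millis=10000):
    """Receive, decode and acknowledge a single message from the topic."""
    import pulsar  # imported lazily so decode_message works without the package

    client = pulsar.Client(service_url)
    try:
        consumer = client.subscribe(topic, subscription_name)
        # Raises an exception if no message arrives within the timeout.
        message = consumer.receive(timeout_millis=timeout_millis)
        record = decode_message(message.data())
        consumer.acknowledge(message)  # mark the message as processed
        return record
    finally:
        client.close()
```

For example, consume_one("pulsar://localhost:6650", "customer-age", "check-avg-age") would return the decoded record if the broker is reachable at that hypothetical address and a message is available.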