Filling empty records with a fixed value

A pipeline with a Test source, a Data cleansing processor, and a Test destination.

Before you begin

  • You have previously created a connection to the system storing your source data.

    Here, a Test connection.

  • You have previously added the dataset holding your source data.

    Download and extract the file: type_converter-datacleansing-taxi.zip. It contains hierarchical taxi data including pickup time, dropoff time, fare, etc.

  • You have also created the connection and the related dataset that will hold the processed data.

    Here, a Test dataset.

Procedure

  1. Click Add pipeline on the Pipelines page. Your new pipeline opens.
  2. Give the pipeline a meaningful name.

    Example

    Fill empty cells with appropriate value
  3. Click ADD SOURCE to open the panel where you select your source data. Here, it is taxi data that contains a field with empty records (.store_and_fwd_flag).

    Example

    Preview of a data sample about taxi data.
  4. Select your dataset and click Select in order to add it to the pipeline.
    Rename it if needed.
  5. Click Plus and add a Data cleansing processor to the pipeline. The configuration panel opens.
  6. Give a meaningful name to the processor.

    Example

    fill empty cells with N/A value
  7. In the Configuration area:
    1. Select Fill cells with value in the Function name list, as you want to replace the empty records with a fixed value.
    2. Select .store_and_fwd_flag in the Fields to process list, as it corresponds to the field with empty records.
    3. Select Value in the Use with list and enter N/A in the Value field, as you want to replace all empty records with the value N/A (not available).
  8. Click Save to save your configuration.

    Look at the preview of the processor to compare your data before and after the cleansing operation.

    Preview of the Data cleansing processor after replacing empty records with N/A text.
  9. Click ADD DESTINATION and select the dataset that will hold your cleansed data.
    Rename it if needed.
  10. On the top toolbar of Talend Cloud Pipeline Designer, click the Run button to open the panel allowing you to select your run profile.
  11. Select your run profile in the list (for more information, see Run profiles), then click Run to run your pipeline.

Results

Your pipeline is being executed. The empty records in the selected field are replaced with the fixed value you specified, and the output flow is sent to the target system you selected.
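
To make the effect of the Fill cells with value function concrete, the following Python sketch reproduces the same transformation outside of Talend Cloud Pipeline Designer. It is an illustration only, not the processor's actual implementation: the fill_empty helper, the sample records, and the definition of "empty" (a missing field, a null value, or a blank string) are assumptions made for this example. Only the field name store_and_fwd_flag and the replacement value N/A come from the procedure above.

```python
from copy import deepcopy


def fill_empty(records, field, value):
    """Return a copy of `records` in which empty `field` entries are set to `value`.

    Hypothetical helper for illustration; it is not part of any Talend API.
    """
    filled = deepcopy(records)  # leave the input records untouched
    for record in filled:
        current = record.get(field)
        # Assumption: "empty" means missing, None, or a blank string.
        if current is None or (isinstance(current, str) and current.strip() == ""):
            record[field] = value
    return filled


# Hypothetical sample rows; the real dataset contains more fields.
sample = [
    {"fare": 12.5, "store_and_fwd_flag": "N"},
    {"fare": 7.0, "store_and_fwd_flag": ""},
    {"fare": 9.3, "store_and_fwd_flag": None},
]

cleansed = fill_empty(sample, "store_and_fwd_flag", "N/A")
for row in cleansed:
    print(row)
```

In this sketch, records that already hold a value keep it, and only the empty ones receive N/A, which is what the processor preview lets you check before running the pipeline.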
