Using context variables to use different datasets at execution time
In this scenario, context variables are added to override, at execution time, both the dataset used as source and the dataset used as destination.
Before you begin
- You have previously created a connection to the system storing your source data, here an HTTP Client connection.
  The Base URL of the connection is: https://datausa.io/
- You have previously added the dataset holding your source data, here United States public data including population statistics.
  The HTTP Client dataset properties are:
  - HTTP method: GET
  - Path: /api/data
  - Query parameters:
    - Name: drilldowns, Value: Nation
    - Name: measures, Value: Population
  - Response body format: JSON
  - Extract a sub-part of the response: .data
  - Returned content: Body
- You have also created the destination connection, here a Google BigQuery connection, and a BigQuery dataset named Nation_statistics. The corresponding BigQuery table is created at execution time and contains US statistics per year.
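To make the source dataset settings above easier to picture, the following is a minimal Python sketch of the request they describe. It is not how the HTTP Client connector is implemented. The helper names build_url and extract_sub_part are hypothetical, and the sub-part extraction only handles a simple one-level selector such as .data.

```python
from urllib.parse import urlencode, urljoin

# Values taken from the connection and dataset properties listed above.
BASE_URL = "https://datausa.io/"
PATH = "/api/data"
QUERY = {"drilldowns": "Nation", "measures": "Population"}


def build_url(base_url: str, path: str, query: dict) -> str:
    """Combine the base URL, path, and query parameters into the GET URL."""
    return urljoin(base_url, path) + "?" + urlencode(query)


def extract_sub_part(body: dict, selector: str = ".data"):
    """Hypothetical stand-in for 'Extract a sub-part of the response'.

    Only supports a one-level selector such as '.data'.
    """
    return body[selector.lstrip(".")]


url = build_url(BASE_URL, PATH, QUERY)
```

Sending a GET request to url and passing the parsed JSON body to extract_sub_part would yield the records that the pipeline reads from the source dataset.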
Procedure
Results
Your pipeline is being executed. The data is filtered and corresponds to the context variables you have assigned to the source and destination datasets:
- In the pipeline execution logs, you can see the context variables used to retrieve US State data and create the State table on BigQuery at execution time. 312 records are inserted into the new table.
- In your Google BigQuery account, you can see the newly created State_statistics table that is filled with the filtered data (only State data collected after 2015).
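The override behavior described in these results can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's mechanism: the dictionary keys, the resolve_context and keep_after helpers, and the Year field name are all hypothetical. Only the Nation and State values and the two table names come from this scenario.

```python
# Default values defined in the datasets (hypothetical key names).
DEFAULTS = {"drilldowns": "Nation", "table": "Nation_statistics"}

# Values assigned through context variables at execution time.
OVERRIDES = {"drilldowns": "State", "table": "State_statistics"}


def resolve_context(defaults: dict, overrides: dict) -> dict:
    """Return the effective settings: an override wins over a default."""
    return {**defaults, **overrides}


def keep_after(records: list, year: int, key: str = "Year") -> list:
    """Keep only records whose year is strictly greater than `year`.

    The field name `key` is an assumption about the record layout.
    """
    return [r for r in records if int(r[key]) > year]


effective = resolve_context(DEFAULTS, OVERRIDES)
```

With these overrides, effective points the source at State data and the destination at the State_statistics table, while the defaults stored in the datasets stay unchanged. Calling keep_after(records, 2015) mirrors the filter that keeps only data collected after 2015.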