The join prefix joins the loaded
table with an existing named table or the last previously created data table.
The effect of joining data is to extend the target table by an additional set of fields or attributes, namely ones not already present in the target table. Any common field names between the source data set and the target table are used to work out how to associate the new incoming records. This is commonly referred to as a “natural join”. A Qlik join operation can lead to the resulting target table having more or fewer records than it started with, depending on the uniqueness of the join association and the type of join employed.
There are four types of joins:
Left join
Left joins are the most common join type. For example, if you have a transaction data set and would like to combine it with a reference data set, you would typically use a Left Join. You would load the transaction table first, then load the reference data set while joining it via a Left Join prefix onto the already loaded transaction table. A Left Join would keep all transactions as-is and add on the supplementary reference data fields where a match is found.
Inner join
When you have two data sets where you only care about any results where there is a matching association, consider using an Inner Join. This will eliminate all records from both the source data loaded and the target table if no match is found. As a result, this may leave your target table with fewer records than before the join operation took place.
Outer join
When you need to keep both the target records and all of the incoming records, use an Outer Join. Where no match is found, each set of records is still kept while the fields from the opposite side of the join will remain unpopulated (null).
If the type keyword is omitted, the default join type is an outer join.
Right join
This join type keeps all the records about to be loaded, while reducing the records in the table targeted by the join to only those records where there is an association match in the incoming records. This is a niche join type that is sometimes used as a means of trimming down an already pre-loaded table of records to a required subset.
Information noteIf there are no field names in common between the source and target of a join operation, the join will result in a cartesian product of all rows – this is called a “cross join”.
Syntax:
[inner | outer | left | right ]Join[ (tablename )
]( loadstatement | selectstatement )
Arguments
Argument
Description
tablename
The named table to be compared to the loaded table.
loadstatement or selectstatement
The LOAD or SELECT statement for the loaded table.
These topics may help you work with this function:
The Keep load prefix is similar to the Join prefix, but it does not combine the source and target datasets. Instead, it trims each dataset according to the type of operation adopted (inner, outer, left, or right).
Example 1 - Left join: Enriching a target table with a reference data set
Overview
Open the Data load editor and add the load script below to a new tab.
The load script contains:
A dataset representing change records, which is loaded into a table named Changes. It includes a Status ID key field.
A second dataset representing change statuses, which is loaded and combined with the original change records by joining it with a left Join load prefix.
This left join ensures that the change records remain intact while adding on status attributes where a match in the incoming status records is found based on a common Status ID.
Load script
Changes:
Load * inline [
Change ID Status ID Scheduled Start Date Scheduled End Date Business Impact
10030 4 19/01/2022 23/02/2022 None
10015 3 04/01/2022 15/02/2022 Low
10103 1 02/04/2022 29/05/2022 Medium
10185 2 23/06/2022 08/09/2022 None
10323 1 08/11/2022 26/11/2022 High
10326 2 11/11/2022 05/12/2022 None
10138 2 07/05/2022 03/08/2022 None
10031 3 20/01/2022 25/03/2022 Low
10040 1 29/01/2022 22/04/2022 None
10134 1 03/05/2022 08/07/2022 Low
10334 2 19/11/2022 06/02/2023 Low
10220 2 28/07/2022 06/09/2022 None
10264 1 10/09/2022 17/10/2022 Medium
10116 1 15/04/2022 24/04/2022 None
10187 2 25/06/2022 24/08/2022 Low
] (delimiter is '\t');
Status:
Left Join (Changes)
Load * inline [
Status ID Status Sub Status
1 Open Not Started
2 Open Started
3 Closed Completed
4 Closed Cancelled
5 Closed Obsolete
] (delimiter is '\t');
Results
Open the Data model viewer and note the shape of the data model. Only one denormalized table is present. It is a combination of all the original change records, with the matching status attributes joined onto each change record.
Resulting internal data model
Changes
Change ID
Status ID
Scheduled Start Date
Scheduled End Date
Business Impact
Status
Sub Status
If you expand the preview window in the Data model viewer, you will see a portion of this full result set organized into a table:
Preview of Changes table in the Data model viewer
Change ID
Status ID
Scheduled Start Date
Scheduled End Date
Business Impact
Status
Sub Status
10030
4
19/01/2022
23/02/2022
None
Closed
Cancelled
10031
3
20/01/2022
25/03/2022
Low
Closed
Completed
10015
3
04/01/2022
15/02/2022
Low
Closed
Completed
10103
1
02/04/2022
29/05/2022
Medium
Open
Not Started
10116
1
15/04/2022
24/04/2022
None
Open
Not Started
10134
1
03/05/2022
08/07/2022
Low
Open
Not Started
10264
1
10/09/2022
17/10/2022
Medium
Open
Not Started
10040
1
29/01/2022
22/04/2022
None
Open
Not Started
10323
1
08/11/2022
26/11/2022
High
Open
Not Started
10187
2
25/06/2022
24/08/2022
Low
Open
Started
10185
2
23/06/2022
08/09/2022
None
Open
Started
10220
2
28/07/2022
06/09/2022
None
Open
Started
10326
2
11/11/2022
05/12/2022
None
Open
Started
10138
2
07/05/2022
03/08/2022
None
Open
Started
10334
2
19/11/2022
06/02/2023
Low
Open
Started
Since the fifth row in the Status table (Status ID: '5', Status: 'Closed', Sub Status: 'Obsolete') does not correspond to any of the records in the Changes table, the information in this row does not appear in the result set above.
Return to the Data load editor. Load the data and open a sheet. Create a new table and add this field as a dimension: Status.
Now you can inspect the number of Changes by Status.
Results table
Status
=Count([Change ID])
Open
12
Closed
3
Example 2 – Inner join: Combining matching records only
Overview
Open the Data load editor and add the load script below to a new tab.
The load script contains:
A dataset representing change records, which is loaded into a table named Changes.
A second dataset representing change records originating from the source system JIRA.This is loaded and combined with the original records by joining it with an Inner Join load prefix.
This Inner Join ensures that only the five change records which are found in both datasets are kept.
Load script
Changes:
Load * inline [
Change ID Status ID Scheduled Start Date Scheduled End Date Business Impact
10030 4 19/01/2022 23/02/2022 None
10015 3 04/01/2022 15/02/2022 Low
10103 1 02/04/2022 29/05/2022 Medium
10185 2 23/06/2022 08/09/2022 None
10323 1 08/11/2022 26/11/2022 High
10326 2 11/11/2022 05/12/2022 None
10138 2 07/05/2022 03/08/2022 None
10031 3 20/01/2022 25/03/2022 Low
10040 1 29/01/2022 22/04/2022 None
10134 1 03/05/2022 08/07/2022 Low
10334 2 19/11/2022 06/02/2023 Low
10220 2 28/07/2022 06/09/2022 None
10264 1 10/09/2022 17/10/2022 Medium
10116 1 15/04/2022 24/04/2022 None
10187 2 25/06/2022 24/08/2022 Low
] (delimiter is '\t');
JIRA_changes:
Inner Join (Changes)
Load
[Ticket ID] AS [Change ID],
[Source System]
inline
[
Ticket ID Source System
10000 JIRA
10030 JIRA
10323 JIRA
10134 JIRA
10334 JIRA
10220 JIRA
20000 TFS
] (delimiter is '\t');
Results
Load the data and open a sheet. Create a new table and add these fields as dimensions:
Source System
Change ID
Business Impact
Now you can inspect the five resulting records. The resultant table from an Inner Join will only include records with matching information in both datasets.
Results table
Source System
Change ID
Business Impact
JIRA
10030
None
JIRA
10134
Low
JIRA
10220
None
JIRA
10323
High
JIRA
10334
Low
Example 3 – Outer join: Combining overlapping record sets
Overview
Open the Data load editor and add the load script below to a new tab.
The load script contains:
A dataset representing change records, which is loaded into a table named Changes.
A second dataset representing change records originating from the source system JIRA, which is loaded and combined with the original records by joining it with an Outer Join load prefix.
This ensures that all the overlapping change records from both datasets are kept.
Load script
// 8 Change records
Changes:
Load * inline [
Change ID Status ID Scheduled Start Date Scheduled End Date Business Impact
10030 4 19/01/2022 23/02/2022 None
10015 3 04/01/2022 15/02/2022 Low
10138 2 07/05/2022 03/08/2022 None
10031 3 20/01/2022 25/03/2022 Low
10040 1 29/01/2022 22/04/2022 None
10134 1 03/05/2022 08/07/2022 Low
10334 2 19/11/2022 06/02/2023 Low
10220 2 28/07/2022 06/09/2022 None
] (delimiter is '\t');
// 6 Change records
JIRA_changes:
Outer Join (Changes)
Load
[Ticket ID] AS [Change ID],
[Source System]
inline
[
Ticket ID Source System
10030 JIRA
10323 JIRA
10134 JIRA
10334 JIRA
10220 JIRA
10597 JIRA
] (delimiter is '\t');
Results
Load the data and open a sheet. Create a new table and add these fields as dimensions:
Source System
Change ID
Business Impact
Now you can inspect the 10 resulting records.
Results table
Source System
Change ID
Business Impact
JIRA
10030
None
JIRA
10134
Low
JIRA
10220
None
JIRA
10323
-
JIRA
10334
Low
JIRA
10597
-
-
10015
Low
-
10031
Low
-
10040
None
-
10138
None
Example 4 – Right join: Trimming down a target table by a secondary master dataset
Overview
Open the Data load editor and add the load script below to a new tab.
The load script contains:
A dataset representing change records, which is loaded into a table named Changes.
A second dataset representing change records originating from the source system Teamwork. This is loaded and combined with the original records by joining it with a Right Join load prefix.
This ensures that only Teamwork change records are kept, while not losing any Teamwork records if the target table does not have a matching Change ID.
Load script
Changes:
Load * inline [
Change ID Status ID Scheduled Start Date Scheduled End Date Business Impact
10030 4 19/01/2022 23/02/2022 None
10015 3 04/01/2022 15/02/2022 Low
10103 1 02/04/2022 29/05/2022 Medium
10185 2 23/06/2022 08/09/2022 None
10323 1 08/11/2022 26/11/2022 High
10326 2 11/11/2022 05/12/2022 None
10138 2 07/05/2022 03/08/2022 None
10031 3 20/01/2022 25/03/2022 Low
10040 1 29/01/2022 22/04/2022 None
10134 1 03/05/2022 08/07/2022 Low
10334 2 19/11/2022 06/02/2023 Low
10220 2 28/07/2022 06/09/2022 None
10264 1 10/09/2022 17/10/2022 Medium
10116 1 15/04/2022 24/04/2022 None
10187 2 25/06/2022 24/08/2022 Low
] (delimiter is '\t');
Teamwork_changes:
Right Join (Changes)
Load
[Ticket ID] AS [Change ID],
[Source System]
inline
[
Ticket ID Source System
10040 Teamwork
10015 Teamwork
10103 Teamwork
10031 Teamwork
50231 Teamwork
] (delimiter is '\t');
Results
Load the data and open a sheet. Create a new table and add these fields as dimensions:
A field contains values, loaded from a data source. At a basic level, a field corresponds to a column in a table. Fields are used to create dimensions and measures in visualizations.
The term dataset is sometimes synonymous with table. It can refer to the original source table, the table after undergoing transformations, or the fact and dimension tables in a data mart.
It can also refer to a logical table, where there are several instance tables and views:
Current data
History, which holds previous versions of the table
A measure is a calculation base on one ore more aggregations. For example, the sum of sales is a single aggregation, while the sum of sales divided by the count of customers is a measure based on two aggregations.