Set analysis and set expressions
Set analysis offers a way of defining a set (or group) of data values that is different from the normal set defined by the current selections.
Normally, when you make a selection, aggregation functions, such as Sum, Max, Min, Avg, and Count aggregate over the selections that you have made: the current selections. Your selections automatically define the data set to aggregate over. With set analysis you can define a group that is independent of the current selections. This can be useful if you want to show a particular value, for example, the market share of a product across all regions, irrespective of the current selections.
Set analysis is also powerful when making different sorts of comparisons, such as what are the best-selling products compared with poorly-selling products, or this year against last year.
Let us imagine an example in which you start working in an app by selecting the year 2010 in a filter pane. The aggregations are then based on that selection, and the visualizations only show values for that year. When you make new selections, the visualizations are updated accordingly. The aggregations are made over the set of possible records defined by the current selections. With set analysis, you can define a set that is of interest to you and does not depend on the selections.
Creating set expressions
Before looking at the different parts of a set analysis example, there is a distinction that should be made between a set expression and set analysis:
Defining a set of field values is referred to as defining a set expression, whereas using set expressions to analyze data is referred to as set analysis. Consequently, the rest of this section focuses on the set expression and its components.
Here is a set analysis example: sum( {$<Year={2009}>} Sales ), in which {$<Year={2009}>} is a set expression.
There are two general syntax rules for a set expression:
- A set expression must be used in an aggregation function. In this example, the aggregation function is sum(Sales).
- A set expression must be enclosed by braces, {}. In the example, the set expression is: {$<Year={2009}>}.
A set expression consists of a combination of the following parts:
- Identifiers. One or more identifiers define the relationship between the set expression and what is being evaluated in the rest of the expression. A simple set expression consists of a single identifier, such as the dollar sign, {$}, which means all records in the current selection.
- Operators. If there is more than one identifier, an operator or operators are used to refine the set of data by specifying how the sets of data represented by the identifiers are combined to create a subset or superset, for example.
- Modifiers. A modifier or modifiers can be added to the set expression to change the selection. A modifier can be used on its own or to modify an identifier to filter the data set.
Examples:
Example 1:
This set expression contains an identifier $, and the modifier <Year={2009>}. This example does not include an operator. The set expression is interpreted as: "All records in the current selection that belong to the year 2009".
Example 2:
This set expression contains the identifiers $ and 1, the operator + and the modifiers <Year={2009}> and <Country={'Sweden'}>.
This set expression is designed to sum the sales for the year 2009 associated with the current selections and add the full set of data associated with the country Sweden across all years.
Identifiers, operators and modifiers are described in more detail with in the following subsections.
Identifiers
Identifiers define the relationship between the set expression and the field values or expression being evaluated.
In our example sum( {$<Year={2009}>} Sales ), the identifier is the dollar sign, $, and means that the set of records to be evaluated consists of all the records of the current selection. This set is then further filtered by the modifier part of the set expression. In a more complex set expression, two identifiers can be combined using an operator.
This table shows some common identifiers.
Identifier | Description |
---|---|
1 | Represents the full set of all the records in the application, irrespective of any selections made. |
$ | Represents the records of the current selection. The set expression {$} is thus the equivalent to not stating a set expression. |
$1 | Represents the previous selection. $2 represents the previous selection-but-one, and so on. |
$_1 | Represents the next (forward) selection. $_2 represents the next selection-but-one, and so on. |
BM01 | You can use any bookmark ID or bookmark name. |
MyAltState | You can reference the selections made in an alternate state by its state name. |
This table shows some examples with different identifiers.
Example | Result |
---|---|
sum ({1} Sales) | Returns total sales for the app, disregarding selections but not the dimension. |
sum ({$} Sales) |
Returns the sales for the current selection, that is, the same as sum(Sales). |
sum ({$1} Sales) |
Returns the sales for the previous selection. |
sum ({BM01} Sales) |
Returns the sales for the bookmark named BM01. |
Operators
Operators are used to include, exclude, or intersect parts of or whole data sets. All operators use sets as operands and return a set as result.
This table shows operators that can be used in set expressions.
Operator | Description |
---|---|
+ | Union. This binary operation returns a set consisting of the records that belong to any of the two set operands. |
- | Exclusion. This binary operation returns a set of the records that belong to the first but not the other of the two set operands. Also, when used as a unary operator, it returns the complement set. |
* | Intersection. This binary operation returns a set consisting of the records that belong to both of the two set operands. |
/ | Symmetric difference (XOR). This binary operation returns a set consisting of the records that belong to either, but not both of the two set operands. |
This table shows some examples with operators.
Examples | Results |
---|---|
sum( {1-$} Sales ) | Returns sales for everything excluded by the selection. |
sum( {$*BM01} Sales ) | Returns sales for the intersection between the selection and bookmark BM01. |
sum( {-($+BM01)} Sales ) |
Returns sales excluded by the selection and bookmark BM01. |
Sum({$<Year={2009}>+1<Country={'Sweden'}>} Sales) | Returns sales for the year 2009 associated with the current selections and add the full set of data associated with the country Sweden across all years. |
Modifiers
Modifiers are used to make additions or changes to a selection. Such modifications can be written in the set expression. A modifier consists of one or several field names, each followed by one or several selections that can be made in the field. Modifiers begin and end with angle brackets, <>.
A set modifier modifies the selection of the preceding set identifier. If no set identifier is referenced, the current selection state is implicit.
This table shows some examples with modifiers.
Example | Result |
---|---|
sum({$<OrderDate = DeliveryDate>} Sales) | Returns the sales for the current selection where OrderDate = DeliveryDate. |
sum({1<Region = {US}>} Sales) |
Returns the sales for region US, disregarding the current selection. |
sum({$<Region = >} Sales) |
Returns the sales for the selection, but with the selection in Region removed. |
sum({<Region = >} Sales) |
Returns the same as the example above. When the set to modify is omitted, $ is assumed. |
sum({$<Year={2000}, Region={“U*”}>} Sales) | Returns the sales for the current selection, but with new selections both in Year and in Region. |
See Set modifiers for a more detailed description of modifiers and how to use them in more complex set analysis.
Did this page help you?
If you find any issues with this page or its content – a typo, a missing step, or a technical error – let us know how we can improve!