Optimizing application performance
Application performance can be improved with reduced application size, simplified data models, and strategic use of set analysis. This section will help you avoid performance issues by pointing out areas where performance can be impacted and how you can evaluate and monitor application performance.
Application complexity
These are loose categories that can help diagnose issues. The most complex applications have the lowest performance.
Simple applications:
- Do not include complex set analysis or If() statements.
- Do not include large tables.
- Have a simple data model.
- Contain simple calculations.
- May have large data volumes.
Moderate applications:
- Have a data model with many tables, but follow best practices.
- Use set analysis and several If() statements.
- Have large or wide tables on sheets (15 columns or more).
Complex applications:
-
Have a very complex data model.
- Connect to large data volumes.
- Contain complex calculations, charts, and tables.
Application details
You need to consider your hardware environment in relation to application size, because it affects the performance of your Qlik Sense deployment. For example, if you do not optimize your applications, they may require more hardware resources.
Monitoring application size will help you:
- Understand current performance.
- Understand the performance impact of deploying a new application.
- Understand the performance impact of modifying an existing application.
- Resolve performance issues.
- Plan for future growth.
Qlik provides tools that can help you asses your applications. For more information, see: Performance and scalability in Qlik Sense Enterprise.
These are the basic application elements that can affect performance:
| Feature | Description |
|---|---|
| Application disk size (MB) | You can find application size in the QMC. Go to Apps, and open the Column selector on the right hand side next to Actions. Click the radio button next to File size (MB). If you are using Qlik Sense Desktop, you can find application size in Windows Explorer. The default folder is %USERPROFILE%\Documents\Qlik\Sense\Apps. The Apps folder lists all application names and file sizes. |
|
Application size in RAM (GB) |
You can determine an application's base RAM footprint by:
You can use the App Metadata Analyzer to find this metric if you are using Qlik Sense June 2018 or later. For more information, see Application Metadata Analyzer. |
|
Application total rows (M) |
You can use system fields to calculate total rows. Create a KPI with the measure Sum($Rows). For more information, see System fields. |
| Application total fields | You can use system fields to calculate total fields. Create a KPI with the measureSum($Fields). For more information, see System fields. |
| Application total tables | You can use system fields to calculate total tables. Create a KPI with the measure Count(DISTINCT $Table). For more information, see System fields. |
Monitoring your application
The Qlik Management Console (QMC) provides applications for monitoring system performance and usage on Qlik Sense Enterprise on Windows:
-
The Operations Monitor application provides information about hardware utilization, such as server memory and CPU usage, active users, and reload task activity. It also provides summary and detailed information about errors, warnings, and log activities in the Qlik Sense server environment.
-
The License Monitor application tracks license usage, and facilitates monitoring changes to license allocation.
- The Log Monitor application presents nearly all log data available and enables trend analysis and troubleshooting.
- The Sessions Monitor application shows log data about usage of applications.
- The Reloads Monitor application presents detailed information about reload data, both from the QMC and applications open in the hub.
- The Sense System Performance Analyzer application displays Qlik Sense performance across all nodes.
- The Sense Connector Logs Analyzer application provides insights into usage and errors of specific Qlik connectors.
- The App Metadata Analyzer application provides a holistic view of all your Qlik Sense applications, including granular level detail of an applications data model and its resource utilization.
For more information, see Monitoring a Qlik Sense Enterprise on Windows site.
Large data volumes
You can employ these architecture strategies when you are connecting to large data volumes.
Segmentation
You can segment QVDs by dimensions such as time frame, region or aggregation level. For example, you can have:
- A QVD that contains data from the two most recent years.
- A QVD that contains historical data further than two years.
-
A QVD that contains all data aggregated on a higher level. For example, per month instead of date, or per country instead of individual customers.
- One large QVD with all the data, which is only used by a small subset of users.
You can segment the applications in a similar way. Smaller applications will address the analytical needs of most users. This saves memory.
You can also have multiple applications focused on different regions. This way, users will not open an application with data that they are not interested in or do not have rights to access. Data that is not accessible via section access still affects memory.
On-Demand Application Generation (ODAG)
Qlik Sense on-demand applications give users aggregate views of big data stores. They can identify and load relevant subsets of the data for detailed analysis.
From a user perspective, there are two applications:
- A shopping cart with aggregated data.
- An empty template application used to display detail.
The user makes selections in the shopping cart application. Once a threshold has been met, a custom LOAD script is created which populates the template application with the requested details. For more information, see Managing big data with on-demand applications.
Application chaining
Application chaining (known as document chaining in QlikView) means that there is an aggregated application, which users consume regularly. If a user needs more detail, selections can be passed from the aggregated application to a detail application, so that they can view a lower level of granularity. This saves memory, because users are not loading unnecessary detail. Application chaining can be performed by adding button objects to a sheet. For more information, see Application chaining.
Application chaining is also supported via APIs. For example, you can use the App Integration API to create custom application chaining. For more information, see App Integration API.
Data model performance
These are indicators that can impact data model performance. Each one is a best practice that will improve application usability.
| Action | Description |
|---|---|
|
Synthetic keys removed |
Qlik Sense creates synthetic keys when two or more data tables have two or more fields in common. This may mean that there is an error in the script or the data model. To diagnose synthetic keys, see Synthetic keys. |
|
Circular references removed from data model |
Circular references occur when two fields have more than one association. Qlik Sense will attempt to resolve these by changing the connection to one of the tables. However, all circular reference warnings should be resolved, see Understanding and solving circular references. |
|
Appropriate granularity of data |
You should only load data that is necessary. For example: a group of users only need data divided by week, month, and year. You can either load in the aggregated data or aggregate the data within the load script to save memory. If a user does need to visualize data at a lower level of granularity, you can use ODAG or document chaining. |
|
QVDs used where possible |
A QVD is a file containing a table of data exported from Qlik Sense. This file format is optimized for speed when reading data from a script, but is still very compact. Reading data from a QVD file is typically 10-100 times faster than reading from other data sources. For more information, see: Working with QVD files. |
|
QVD files optimized on load |
QVD files can be read in two modes: standard (fast) and
optimized (faster). The selected mode is determined automatically by the script engine. There are some limitations regarding optimized loads. It is possible to rename fields, but any of these operations will result in a standard load:
|
|
Incremental loads leveraged |
If your application connects to a large amount of data from databases that are continuously updated, reloading the entire data set can be time consuming. Instead, you should use incremental load to retrieve new or changed records from the database. For more information, see Loading new and updated records with incremental load. |
|
Snowflake model consolidated |
If you have a snowflake data model, you may be able to reduce the number of data tables by joining some of them using the Join prefix or other mapping. This is especially important for large fact tables. A good rule of thumb is to have only one large table. For more information see To Join or Not to Join. |
|
Tables that have a small number of fields are denormalized |
If you have two tables with few fields, it may improve performance to join them. For more information, see Next steps in scripting. |
|
Denormalized lookup (leaf) tables with mapping loads |
You should not use the Join prefix if you only need to add one field from a table to another. You should use the ApplyMap lookup function, see Don't join - use ApplyMap. |
|
Time stamps removed or decoupled from date field |
Date fields can fill up space when the timestamp is present as the string representation is larger, and the number of distinct values is larger. If the precision is not necessary for your analysis, you can round the timestamp to e.g. the nearest hour using Timestamp(Floor(YourTimestamp,1/24)) or remove the time component completely using Date(Floor(YourTimestamp)). If you want the timestamp, you can decouple it from the date itself. You can use the same Floor() function, and then create a new field with the extracted time by using something along the lines of: Time(Frac(YourTimestamp)). |
|
Unnecessary fields removed from data model |
You should only load necessary fields in your data model. Avoid using Load * and SELECT. Make sure you keep:
|
|
Link tables avoided when dealing with high data volumes |
You should use link tables where possible. However, if you are dealing with large data volumes, concatenated tables can out-perform link tables. |
|
Concatenated dimensions broken to new fields |
You should break apart concatenated dimensions into separate fields. This reduces the number of unique occurrences of values in your fields. This is similar to how timestamps can be optimized. |
|
AutoNumber used where possible |
You can create an optimized load by loading your data from a QVD file first, and then using the AutoNumber statement to convert values to symbol keys. For more information, see AutoNumber. |
|
Data islands avoided |
Data islands can be useful, but they usually affect performance. If you are creating islands for selection values, use variables. |
|
QVDs are stored based on incremental timeframes |
You should store QVD in segments, such as monthly. These smaller monthly QVD can then support many different applications that might not need all of the data. |
Sheet performance
These are best practices that will improve performance of sheets and visualizations.
| Action | Description |
|---|---|
|
The If() function is avoided where possible |
If the If() function is used inside an aggregation function, it will operate at the record level and be evaluated many times. For example, if you have 1000 records in an aggregation, an If() condition will be evaluated 1000 times. This could cascade rapidly if you nest statements. You should use set analysis instead. A set analysis filter is applied before the aggregation, resulting in a faster response. These responses can also be cached via set analysis, where If() cannot. You could also consider other functions and modifications to the data model. |
| Fields from different tables inside an aggregation table are avoided where possible. |
When an aggregation is evaluated, the calculation runs through two steps:
The single-threaded part can affect performance considerably. One example is if you have multiple fields inside the aggregation, for example, Sum(Quantity*ListPrice). If Quantity is in the fact table, and ListPrice is in the master products table, the engine first needs to join the two tables to find the combinations before it can start to sum the product. The joining is the single-threaded part, and the summing is multi-threaded. If both fields are found in the same table, no join is necessary, and the aggregation is evaluated considerably faster. |
|
Aggr() and nested Aggr() functions are used minimally |
The Aggr() function greatly affects performance. Incorrect use can give inaccurate results. For example, in a table with dimensions that vary from the dimensions within the Aggr() function. For more information, see When should AGGR not be used? |
|
Set analysis is used where possible |
You can use set analysis to define a set of data values that is different from the normal set defined by the current selections. For more information, see Set analysis. |
|
String comparisons avoided where possible |
String comparisons are not as efficient as set analysis. For example, you should avoid Match(), MixMatch(), WildMatch(), and Pick(). Create flags in the script or use set analysis instead. For more information, see Conditional functions and Performance of conditional aggregations. |
|
Calculation conditions are used on objects containing intensive calculations |
You may have visualizations with many records when there are no selections. As a best practice, add calculation conditions to objects so that they only render after certain selections have been made. This stops the creation of very large hypercubes. For example: GetSelectedCount([Country])=1 OR GetPossibleCount([Country])=1. In this scenario, the visualization will not render unless the user selects a single country, or makes other selections where only a single country is possible. |
|
Measures are pre-calculated in the script where possible |
Any measure that is at the lowest level of granularity of the data model should be calculated in the script. For example, if in the same record in a table you have Sales and Cost, you could derive the margin by calculating Sales - Cost AS Margin. You can also aggregate other values in advance if you know that they will not vary based on selection, or that are bound to a different level of granularity. |
|
Tables have less than 15 columns and have calculation conditions |
A table with 15 columns could be considered wide. If your tables consist of many records, you should use calculated conditions on the table object so that it only renders after certain selections or criteria have been met. If your table is very wide, consider:
|
|
Sheets do not have an excessive number of objects |
Objects are calculated when the user navigates to the sheet. Every time a user makes a selection on that sheet, each object will be recalculated if that current state does not exist in the cache. If you have a sheet with many charts, the user will have to wait for every object to calculate on nearly every selection. This puts significant load on the engine. As a best practice, follow the Dashboard/Analysis/Reporting (DAR) concept to develop a clean and minimal application. For more information, see DAR methodology. |
|
Numeric flags are leveraged in the script for use in set analysis |
Set analysis with flags can be more efficient than using string comparisons or multiplication. |
| Master items enable drag and drop of governed metrics and guarantee that expressions will be cached. For example, Sum(Sales) is different from SUM(Sales). Expressions are cached on spelling and case, and need to match verbatim in order to be reused. |