Adding datasets
Add datasets for use from the catalog. You can upload data files directly to Qlik Cloud, where they can be used as datasets or in load scripts. You can also create datasets that reference data stored outside of Qlik Cloud. These datasets are defined using a data connection.
Adding datasets as local files stored within Qlik Cloud
You can upload data files directly to Qlik Cloud. When you upload a data file into Qlik Cloud, it is also created as a dataset containing metadata that can be accessed with catalog, impact analysis, and lineage tools. For additional considerations regarding distinctions between data files and datasets, see Distinctions between data files and datasets.
For supported file types, see Loading data from files.
Data files can be up to 100 GB. However, when uploading very large data files (over 6 GB), you might experience constraints with engine capacity. These constraints are more likely to be encountered with QVD data files due to the memory usage necessary to load QVD files into the engine. For more information about increasing the capacity available, see Large app support.
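The size thresholds above can be checked before an upload. The following is a minimal sketch, not part of Qlik Cloud: the names `classify_upload_size` and `UploadCheck` are hypothetical, and it assumes "GB" means 10**9 bytes, which the text above does not specify.

```python
from enum import Enum

# Assumption: "GB" is read as 10**9 bytes; the documentation does not say
# whether decimal or binary units are meant.
GB = 10**9
MAX_UPLOAD_BYTES = 100 * GB  # stated upper limit for a data file
LARGE_FILE_BYTES = 6 * GB    # above this, engine capacity may become a constraint


class UploadCheck(Enum):
    OK = "ok"                # at or below the 6 GB threshold
    LARGE = "large"          # allowed, but capacity constraints are possible
    TOO_LARGE = "too_large"  # exceeds the 100 GB limit


def classify_upload_size(size_bytes: int) -> UploadCheck:
    """Classify a file size (in bytes) against the documented thresholds."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes > MAX_UPLOAD_BYTES:
        return UploadCheck.TOO_LARGE
    if size_bytes > LARGE_FILE_BYTES:
        return UploadCheck.LARGE
    return UploadCheck.OK
```

For a file on disk, the size could be obtained with `os.path.getsize(path)` and passed to `classify_upload_size`.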
Do the following:
- Go to the Create page of the Analytics activity center and select Dataset.
  You can also add datasets from your mini-homes and by clicking Create new > Dataset in Catalog.
- Click Upload data file.
- Drag and drop your data files into the Add file dialog.
  Alternatively, click Browse and navigate to your data files.
- Specify the Path for the file using the drop-down menu. Start by selecting a space, and then navigate to the folder within the space where you want to store the file.
  Alternatively, type the full path manually.
- Select a destination space for the files.
- Click Upload.
  Alternatively, to create an app from your dataset immediately, click Upload and analyze.
When importing a dataset file into a Qlik Sense app or space with Data manager (drag-and-drop or other direct uploads), the maximum number of fields that can be loaded is 5000.
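The 5000-field limit can be checked on a delimited text file before uploading it. The sketch below is an illustration only. It assumes a file whose first row is a header, and the helper names (`count_header_fields`, `within_field_limit`, `fields_in_text`) are hypothetical rather than part of any Qlik tooling.

```python
import csv
import io

MAX_FIELDS = 5000  # limit stated above for Data manager uploads


def count_header_fields(text_stream, delimiter=","):
    """Return the number of fields in the first row of a delimited text stream."""
    reader = csv.reader(text_stream, delimiter=delimiter)
    header = next(reader, [])  # an empty stream counts as zero fields
    return len(header)


def within_field_limit(text_stream, delimiter=","):
    """Return True if the header row has no more than MAX_FIELDS fields."""
    return count_header_fields(text_stream, delimiter) <= MAX_FIELDS


def fields_in_text(text, delimiter=","):
    """Convenience wrapper for in-memory text, mainly useful for quick checks."""
    return count_header_fields(io.StringIO(text), delimiter)
```

For a file on disk, open it with `open(path, newline="")` as the `csv` module recommends, and pass the file object to `within_field_limit`.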
Distinctions between data files and datasets
File-based data sources stored locally within Qlik Cloud are, at their source, data files. Each data file also exists as a dataset that can be analyzed and edited using catalog, lineage, and impact analysis tools. Data files and datasets are often referenced as equivalent terms for simplicity. However, important distinctions can be made between these two terms—particularly, when using a data file in load script development.
When you store a data file directly within Qlik Cloud, the resource is created as a data file. Across Qlik Cloud, the same file will be shown in two different ways:
- In Space details > Data files, the underlying data file is shown.
- From the general overview in Catalog, as well as in Home, Favorites and Collections, the dataset is shown. Depending on your access and location in the user interface, you might also be able to view the underlying data file for a dataset.
This distinction is important when you edit data files and datasets that are stored within Qlik Cloud. Editing a dataset stored within Qlik Cloud as a file—specifically, renaming it—does not rename the underlying data file. Instead, it simply adds an alias to the dataset. Because analytics content such as apps and scripts reference the underlying data file and not the dataset, you must rename the underlying data file and not its dataset if you need the references to function properly during app and script development.
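The renaming behavior described above can be pictured with a small conceptual model. This is not how Qlik Cloud is implemented. The classes `DataFile` and `Dataset` and the file name used in the usage note are hypothetical, and the sketch only illustrates that renaming a dataset sets an alias while the file name used by scripts stays the same.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataFile:
    name: str  # the file name that apps and load scripts reference


@dataclass
class Dataset:
    file: DataFile
    alias: Optional[str] = None  # set when the dataset is renamed

    @property
    def display_name(self) -> str:
        """Name shown for the dataset: the alias if set, else the file name."""
        return self.alias or self.file.name

    def rename(self, new_name: str) -> None:
        # Renaming the dataset only sets an alias; the data file is untouched.
        self.alias = new_name
```

In this model, calling `rename("Sales 2024")` on a dataset backed by a file named `sales.qvd` changes `display_name` but leaves `file.name` as `sales.qvd`, so a script that refers to `sales.qvd` would still resolve.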
For more information, see Distinctions between data files and datasets.
Other ways to add data files
There are many other ways in which you can upload data files into Qlik Cloud Analytics. For example, it is often possible to add data as you build specific analytics resources.
Common methods include:
- When managing a space.
- When building apps and scripts. You can use Data load editor, Data manager, or the Script interface to upload data files.
  See Loading data from the data catalog.
- Using the STORE script statement during app and script development.
- As output from a data flow.
- As you add training data in an ML experiment.
- As you create a prediction configuration within an ML deployment.
- When working with Qlik Answers.
Managing data files in spaces
For more information about managing data files stored locally within Qlik Cloud, see Managing data files.
Adding datasets from existing connections
Create datasets from existing ODBC connections. When you create a dataset, you pick a database from the data source and then select tables in that database. A dataset is created for each table you select. Datasets created this way refresh their data every time the dataset is opened.
Creating datasets from connections allows you to use cataloging and lineage options with data from your external data sources.
When you create a dataset from a data connection, the dataset does not contain any underlying data file that can be managed within the space. This is because the data is stored outside of Qlik Cloud.
Datasets created from a connection must reside in the same space as the connection to that data source. If the dataset is moved to a space without that connection, only the dataset name and limited metadata are available from the dataset.
Do the following:
- Go to the Create page of the Analytics activity center and select Dataset.
  You can also add datasets from your mini-homes and by clicking Create new > Dataset in Catalog.
- Select a connection to a data source from the available connections and click Next.
- Under Database, select the database containing the tables for which you want to create datasets.
- Under Tables, select the tables to create datasets from. A dataset is created for each table you select.
- Click Next.
- Select the destination space for the dataset from Select space.
  If the space does not have access to your selected connection, select Create new connection in <space name>.
- Click Create datasets.
Adding datasets from new connections
Add a new ODBC connection and create a dataset from it.
When you create a dataset from a data connection, the dataset does not contain any underlying data file that can be managed within the space. This is because the data is stored outside of Qlik Cloud.
Datasets created from a connection must reside in the same space as the connection to that data source. If the dataset is moved to a space without that connection, only the dataset name and limited metadata are available from the dataset.
Do the following:
- Go to the Create page of the Analytics activity center and select Dataset.
  You can also add datasets from your mini-homes and by clicking Create new > Dataset in Catalog.
- Click Create connection.
- Select the destination space for the connection.
- Under Data connectors, select the data source.
- Add the details for the connection.
- Enter the connection settings for the data source.
  For information about supported connections, see Loading analytics data.
- Click Create.
- Select the connection from the available connections and click Next.
- Under Database, select the database containing the tables for which you want to create datasets.
- Under Tables, select the tables to create datasets from. A dataset is created for each table you select.
- Click Next.
- Select the destination space for the dataset from Select space.
  If the space does not have access to your selected connection, select Create new connection in <space name>.
- Click Create datasets.