Adding a new FLAT FILE entity to an existing source
Select radio button: To existing Source
This will kick off the Add Data wizard to define a new Entity within an existing Source.
Select Source Name from dropdown. This is the Source that the new Entity will be added to.
Entity Level: Specify level of data management
The default level for the Source populates the dropdown; it can be changed and the new value applied to the new Entity.
Base Directory: Pre-defined (This is the directory where data will be stored in File System)
Groups: Select the Group(s) requiring access from the dropdown options. At least one group must be added for the data to be discoverable.
Entity Name: Enter a new Entity Name. Note that the application will not allow two Entities with duplicate names under the same Source.
Entity File Path: Enter the exact file path to the originating source, ending with the file name and file format suffix. Select File Browser to explore directories for the appropriate source file. File browser functionality is available for data file selection for all file-based sources except Kafka and OpenConnector sources.
Data Format Specification*: Choose a mechanism to Generate Record Layout. When Source or Sample Files are attached, appropriate metadata is extracted and applied to the environment. These files are uploaded from the user's local machine.
FDL: File Definition Language specification document
SAMPLE FILE: This file can provide information about Header/Trailer, Field (Column) Names, Data Type, delimiters, terminators, business date, etc. (essentially the same information that is passed via FDL documents)
SOURCE FILE: The Source file (path provided in previous step) is read to define structural entity, record, and field metadata.
[No file is provided; the source is being used to generate the layout]
NONE: Use this option if an analyzer file is not available. Structural elements can be manually defined in the following steps.
Select Generate Record Layout to configure the following specifications.
Users are able to overwrite format specifications.
*Note source file limitations:
Parquet and AVRO files: Only the Source option works for metadata detection for Parquet and AVRO files. (The Sample option is only supported for text files.)
ORC files: Metadata detection/data loading is not supported from ORC source files (though ORC target file storage format is supported).
Local File Browser for discovery of Local Server directories (Select Folder icon)
In most cases, users will have uploaded a file in the previous step that will be parsed and applied to the following layout specifications.
Character Encoding: The most common character encoding is UTF_8.
Record Layout: The most common record layout is VARIABLE_CHAR_LENGTH_TERMINATED.
Byte Count or Line Count: Default is null (empty), though many tables have a header (Line Count=1).
Record information format options:
*Any character(s) can be used as the delimiter in data. The delimiter is entered as the byte sequence that the incoming file's character set uses to encode that character.
For example, if the delimiter is the broken pipe character (¦), it is encoded as follows for each incoming character set:
UTF-8 (hex): 0xC2 0xA6 (c2a6)
UTF-16 (hex): 0x00A6 (00a6)
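The hex values above can be reproduced for any delimiter character. The following is a minimal sketch using only Python's built-in string encoding; it is not part of the application, and the big-endian variant of UTF-16 is assumed so that no byte order mark is added.

```python
# Illustrative only: one delimiter character maps to different byte
# sequences depending on the character set of the incoming file.
delimiter = "\u00a6"  # broken pipe character

utf8_hex = delimiter.encode("utf-8").hex()
# Assumption: big-endian UTF-16 without a byte order mark.
utf16_hex = delimiter.encode("utf-16-be").hex()

print(utf8_hex)   # c2a6
print(utf16_hex)  # 00a6
```

Swapping in a different character for `delimiter` gives the hex value to enter for that delimiter in each character set.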
Carriage Return (optionally preceding)
(default on ingest)
Field Open Enclosure:
- Record Validation Regex: Standard Regex arguments
- DOUBLE-EMBEDDED ENCLOSURE
- Minimum Byte Count: Default=1
- Maximum Byte Count: Default=65536
'Field Open Enclosure', 'Field Close Enclosure', or 'Quote Scheme' specify that the content within the enclosure is part of a string so that other characters (like commas) within the enclosures are not processed as delimiters. This can be an important piece of metadata. If records return as 'bad' and the information messages say that records have more fields than they should, change this setting in the metadata panel and reload the data.
Default is null
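The effect of enclosures on field counts can be illustrated outside the application. The sketch below uses Python's standard csv module as a stand-in for the loader (an assumption; the application's own parser is not shown here) and hypothetical sample records with a comma delimiter and a double quote as both open and close enclosure.

```python
import csv
import io

# Hypothetical sample record: three fields, the second contains a comma.
line = '1001,"Smith, Jane",Boston\n'

# Without enclosure handling, the embedded comma is treated as a delimiter,
# so the record appears to have more fields than it should.
naive_fields = line.rstrip("\n").split(",")

# With the enclosure declared, the quoted content stays in one field.
quoted_fields = next(csv.reader(io.StringIO(line), delimiter=",", quotechar='"'))

# A doubled enclosure inside an enclosed field stands for one literal quote
# (the csv module's default doublequote behavior).
embedded_line = '1002,"12"" monitor",Austin\n'
embedded_fields = next(csv.reader(io.StringIO(embedded_line)))

print(len(naive_fields))  # 4
print(quoted_fields)      # ['1001', 'Smith, Jane', 'Boston']
print(embedded_fields)    # ['1002', '12" monitor', 'Austin']
```

A field count that is too high, as in the naive split, is the same symptom as the "more fields than they should" message described above.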
Qlik Data Catalyst Stored Format Type step is skipped for Single Server environments where Stored Format Type is always TEXT_TAB_DELIMITED.
Record Layout Preview
Once the Record Layout looks right, select Record Layout Preview
Save the Entity
When saving a FlatFile source that has a large number of columns (behavior seen at 5k columns), the Add Entity stage and the Save Entity popup display a loading spinner. This stage can be very slow (several minutes, up to 5) and may appear to be stuck.
The following fields will populate and can be edited (column order will vary based on user configuration):
- Name (required)
- Index (user-defined if manually entered or auto-populated from source doc)
- Business Description (optional)
- Technical Description (optional)
- Data Type (required)
- Required: NOT NULL; a NULL value in a required field will be marked as UGLY (Default is 'false')
- Encrypted at Source (Default is 'false')
- Key: Primary Key (Default is 'false')
- Validation Regex (Default is '' [empty])
- Foreign key (Default is 'false')
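To make the Required and Validation Regex settings concrete, here is a minimal sketch of how such field-level checks could be modeled. Everything in it is hypothetical: `FieldSpec`, `check_value`, the `REGEX_MISMATCH` label, and the choice of full-string matching are assumptions for illustration, not the application's implementation. Only the UGLY marking for a NULL in a required field comes from the list above.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldSpec:
    """Hypothetical subset of the field properties listed above."""
    name: str
    data_type: str
    required: bool = False       # Default is 'false'
    validation_regex: str = ""   # Default is empty


def check_value(spec: FieldSpec, value: Optional[str]) -> str:
    """Return an illustrative status label for one field value."""
    if value is None:
        # A NULL in a required field is marked UGLY.
        return "UGLY" if spec.required else "OK"
    # Assumption: the regex must match the whole value.
    if spec.validation_regex and re.fullmatch(spec.validation_regex, value) is None:
        return "REGEX_MISMATCH"  # hypothetical label
    return "OK"


zip_code = FieldSpec(name="zip_code", data_type="STRING",
                     required=True, validation_regex=r"\d{5}")

print(check_value(zip_code, "02134"))  # OK
print(check_value(zip_code, None))     # UGLY
print(check_value(zip_code, "2134"))   # REGEX_MISMATCH
```

Testing a candidate regex against a few sample values in this way before entering it can help avoid reloading data after a metadata change.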
Load Data to the Entity.
Once the Entity has been saved, navigate to the Entities screen in Source to load the data.
Select Load from the More dropdown.