
FILE: New source

Adding a new FLAT FILE entity to a new source

Select radio button: To new Source

This will kick off the Add Data wizard to create a new source and define a new entity within the source.

Source wizard select add data to new source

Choose Connection: The default connection for discovering local flat files is FILE_LOCALFILE_CONNECTION though any FILE CONNECTION TYPE Protocol is supported.

Add New Source Name: Configurable. This is the Source that the new Entity will be added to.

Default Entity Level: Specify level of data management

(see System Settings: Level Control)

Select Source Hierarchy: Configurable. Choose from dropdown.

Inbound Protocol: Pre-defined Inbound Protocol auto-populates for the selected Source Connection

Base Directory: Configurable. This is the directory where data will be stored in File System. [This value is specified in core_env property: localfile.base.dir.source.connection. This property limits access and browsing to subdirectories of the specified location.]
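The restriction to subdirectories of the configured location can be pictured with a small path check. This is only a sketch of the idea, not how Qlik Catalog enforces it; the function name `is_within_base` and the example paths are hypothetical, and POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath


def is_within_base(base_dir: str, candidate: str) -> bool:
    """Return True if candidate is base_dir itself or lies beneath it.

    Hypothetical helper: illustrates the kind of check implied by
    localfile.base.dir.source.connection, not the product's own code.
    """
    base = PurePosixPath(base_dir)
    path = PurePosixPath(candidate)
    # Reject relative paths and any ".." component, since PurePosixPath
    # does not resolve ".." and it could otherwise escape the base.
    if not path.is_absolute() or ".." in path.parts:
        return False
    return path == base or base in path.parents


# Hypothetical paths, for illustration only.
print(is_within_base("/data/landing", "/data/landing/sales/2024.csv"))
print(is_within_base("/data/landing", "/data/other/2024.csv"))
```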

Groups: Select Group(s) requiring access from the dropdown options. At least one group must be added for the data to be discoverable.

Click Next.

Complete source connection fields

File Browser for discovery of directories for base directory (Select folder icon)

Browse base directory subfolders

File Format

Entity Name: Enter a new Entity Name. Note that the application will not allow two Entities with duplicate names under the same Source.

Entity File Path:  Enter the exact file path to the originating source, ending with the file name and file format suffix. Select File Browser to explore directories for the appropriate source file. File browser functionality is available for data file selection for all file-based sources except Kafka and OpenConnector sources.

Data Format Specification*: Choose a mechanism to Generate Record Layout. When Source or Sample Files are attached, appropriate metadata is extracted and applied to the environment. These files are uploaded from the user's local machine.

  • FDL: File Definition Language specification document

  • SAMPLE FILE: This file can provide information about Header/Trailer, Field (Column) Names, Data Type, delimiters, terminators, business date, etc. (essentially the same information that is passed via FDL documents)

  • SOURCE FILE: The Source file (path provided in previous step) is read to define structural entity, record, and field metadata.

[No file is uploaded; the source file itself is used to generate the layout.]

  • NONE: Use this option if an analyzer file is not available. Structural elements can be defined manually in the following steps.

Generate Record Layout to configure the following specifications.

Users are able to overwrite format specifications.

Record Layout Preview: Scroll to the bottom of the dialog to generate and preview the record layout.

*Note source file limitations:

Parquet and AVRO files: Only the Source option works for metadata detection for Parquet and AVRO files. (The Sample option is only supported for text files.)

ORC files: Metadata detection/data loading is not supported from ORC source files (though ORC target file storage format is supported).

Entity ingest name, filepath, metadata method

Local file browser for discovery of local server directories (Select folder icon)

Browse directories for data file

File Information

In most cases, users will have uploaded a file in the previous step that will be parsed and applied to the following layout specifications.

Character Encoding:

The most common character encoding is UTF_8.

Options include:

  • US_ASCII
  • EBCDIC_037
  • LATIN_1
  • WINDOWS_1252
  • UTF_8
  • UTF_16LE
  • UTF_16BE
  • UTF_32LE
  • UTF_32BE
  • UTF_8_OR_LATIN_1
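To check locally which encoding a file uses before choosing an option, the names above can be mapped to Python codecs. The sketch below is an illustration only: the mapping table and the `decode` helper are hypothetical, and the fallback behavior shown for UTF_8_OR_LATIN_1 (try UTF-8 first, then Latin-1) is an assumption drawn from the option name, not documented product behavior.

```python
# Assumed mapping from the option names above to Python codec names.
PYTHON_CODECS = {
    "US_ASCII": "ascii",
    "EBCDIC_037": "cp037",
    "LATIN_1": "latin-1",
    "WINDOWS_1252": "cp1252",
    "UTF_8": "utf-8",
    "UTF_16LE": "utf-16-le",
    "UTF_16BE": "utf-16-be",
    "UTF_32LE": "utf-32-le",
    "UTF_32BE": "utf-32-be",
}


def decode(data: bytes, encoding: str) -> str:
    """Decode bytes using one of the option names listed above."""
    if encoding == "UTF_8_OR_LATIN_1":
        # Assumption: strict UTF-8 first, Latin-1 only if that fails.
        try:
            return data.decode("utf-8")
        except UnicodeDecodeError:
            return data.decode("latin-1")
    return data.decode(PYTHON_CODECS[encoding])


print(decode(b"caf\xc3\xa9", "UTF_8_OR_LATIN_1"))  # valid UTF-8 input
print(decode(b"caf\xe9", "UTF_8_OR_LATIN_1"))      # falls back to Latin-1
```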

Record Layout

The most common record layout is VARIABLE_CHAR_LENGTH_TERMINATED.

Options include:

  • FIXED_BYTE_LENGTH
  • FIXED_BYTE_LENGTH_TERMINATED
  • VARIABLE_CHAR_LENGTH_TERMINATED
  • MAINFRAME_VARIABLE_BDW_RDW
  • MAINFRAME_VARIABLE_RDW_ONLY
  • PARQUET
  • AVRO
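The difference between fixed-length and mainframe variable-length layouts can be sketched as follows. This assumes the conventional z/OS record descriptor word (a 4-byte prefix whose first two bytes hold the big-endian record length, including the prefix itself); whether Qlik Catalog reads MAINFRAME_VARIABLE_RDW_ONLY exactly this way is an assumption, and both function names are hypothetical.

```python
import struct


def split_fixed(data: bytes, record_length: int) -> list[bytes]:
    """FIXED_BYTE_LENGTH style: every record has the same byte length."""
    return [data[i:i + record_length] for i in range(0, len(data), record_length)]


def split_rdw(data: bytes) -> list[bytes]:
    """RDW style: each record starts with a 4-byte descriptor word.

    Bytes 0-1 hold the big-endian record length (prefix included);
    bytes 2-3 are reserved.
    """
    records = []
    pos = 0
    while pos < len(data):
        (length,) = struct.unpack_from(">H", data, pos)
        if length < 4:
            raise ValueError(f"invalid RDW length {length} at offset {pos}")
        records.append(data[pos + 4:pos + length])  # payload without the prefix
        pos += length
    return records


print(split_fixed(b"AAAABBBBCCCC", 4))
print(split_rdw(b"\x00\x07\x00\x00abc" + b"\x00\x06\x00\x00de"))
```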

Specify file information

Header Information

Byte Count or Line Count: Default is Null (empty) though many tables have a header (Line Count=1)
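As a rough illustration of what the two header settings describe, the sketch below removes a header either by line count or by byte count. The helper `strip_header` is hypothetical, and treating the two settings as mutually exclusive is an assumption.

```python
from typing import Optional


def strip_header(data: bytes, *, line_count: Optional[int] = None,
                 byte_count: Optional[int] = None) -> bytes:
    """Return data without its header; both settings empty means no header."""
    if line_count is not None and byte_count is not None:
        raise ValueError("specify either line_count or byte_count, not both")
    if byte_count:
        return data[byte_count:]
    if line_count:
        # Split off exactly line_count newline-terminated lines.
        parts = data.split(b"\n", line_count)
        return parts[line_count] if len(parts) > line_count else b""
    return data


sample = b"id\tname\n1\tA\n2\tB\n"
print(strip_header(sample, line_count=1))
```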

Specify header information

Record Information

Record information format options:

  • Field Delimiter*:

    • Tab (Default)

      (\t)

    • Comma

      ,

    • Pipe

      |

    • Semi-colon

      ;

    • Colon

      :

    • Ctrl+A

      (\x01)

    • Space

      (\x20)

    • Double Pipe

      ||

    • Pipe Tilde

      |~

    *Any character(s) can be used as the delimiter in data. The delimiter can be coded however the incoming file character set encodes that character.

    For example, if the delimiter is the broken pipe character: ¦, the delimiter can be encoded for the incoming character set:

    UTF-8 (hex): 0xC2 0xA6 (c2a6)

    UTF-16 (hex): 0x00A6 (00a6)

    Latin-1: \xA6

    Unicode: U+00A6

  • Record Terminator:

    • Newline

      \n

    • Carriage Return (optionally preceding)/Newline

      \r?\n

    • Return/Newline

      \r\n

    • Any Newline

      ANY_NEWLINE

      (default on ingest)

  • Field Open Enclosure:

    • Single Quote

      '

    • Double Quote

      "

    • Angular Brackets

      << 

  • Field Close Enclosure:

    • Single Quote

      '

    • Double Quote

      "

    • Angular Brackets

      >>

  • Record Validation Regex: Standard Regex arguments
  • Quote Scheme:
    • NONE
    • DOUBLE-EMBEDDED ENCLOSURE
  • Minimum Byte Count: Default=1
  • Maximum Byte Count: Default=65536
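The byte values quoted for the broken pipe delimiter can be reproduced locally, which is a handy way to find the right encoding for any other delimiter character. This is a minimal sketch using only the Python standard library; the sample record is made up for illustration.

```python
delimiter = "\u00a6"  # broken pipe character

utf8 = delimiter.encode("utf-8")
utf16 = delimiter.encode("utf-16-be")  # big-endian, no byte order mark
latin1 = delimiter.encode("latin-1")

print(utf8.hex(), utf16.hex(), latin1.hex())

# Splitting a UTF-8 encoded record on the encoded delimiter bytes.
record = "a\u00a6b\u00a6c".encode("utf-8")
fields = record.split(utf8)
print(fields)
```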
Tip note

'Field Open Enclosure', 'Field Close Enclosure', or 'Quote Scheme' specify that the content within the enclosure is part of a string so that other characters (like commas) within the enclosures are not processed as delimiters. This can be an important piece of metadata. If records return as 'bad' and the information messages indicate that records have more fields than they should, change this setting in the metadata panel and reload the data.
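The effect described in this tip can be seen with Python's `csv` module. The sketch assumes that DOUBLE-EMBEDDED ENCLOSURE means an enclosure character inside a field is written twice (the same convention as `doublequote=True` in `csv`); the sample line is invented for illustration.

```python
import csv
import io

line = '1,"Smith, Jane","said ""hi"""\n'

# Without enclosure handling, the comma inside the quotes is treated as a
# delimiter and the record appears to have too many fields.
naive = line.rstrip("\n").split(",")

# With a double-quote enclosure and doubled embedded quotes.
parsed = next(csv.reader(io.StringIO(line), delimiter=",",
                         quotechar='"', doublequote=True))

print(len(naive), naive)
print(len(parsed), parsed)
```

A record that splits into more fields than the layout defines, as in the naive case, is the symptom the tip describes.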

Specify record information

Trailer Information

Default is null

Specify trailer information

Internal File Format Type

Options include:

  • AVRO
  • ORC
  • ORC_ALL_STRING
  • PARQUET
  • PARQUET_ALL_STRING
  • TEXT_TAB_DELIMITED

The Stored Format Type step is skipped in Qlik Catalog Single Server environments, where the Stored Format Type is always TEXT_TAB_DELIMITED.

Specify stored file format

Record Layout Preview

Note that fields can be added to or removed from the layout.

Save the Entity.

Information note

When saving a flat file source that has a large number of columns (behavior seen at around 5,000 columns), the Save Entity popup in the Add Entity stage displays a loading spinner. This stage can be very slow (several minutes, up to 5) and may appear to be stuck.

The following fields will populate and can be edited (column order will vary based on user configuration):

  • Name (required)
  • Index (user-defined if manually entered or auto-populated from source doc)
  • Business Description (optional)
  • Technical Description (optional)
  • Data Type (required)
  • Required: NOT NULL where a NULL value will be marked as UGLY (Default is 'false')
  • Encrypted at Source (Default is 'false')
  • Key: Primary Key (Default is 'false')
  • Validation Regex (Default is ' ' [empty])
  • Foreign key (Default is 'false')
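For readers who think in code, the field properties and their defaults can be modeled as a small data structure. This is a hypothetical sketch: the class `FieldSpec`, the helper `violates_required`, and the data type strings are illustrative and are not part of any Qlik Catalog API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldSpec:
    """Hypothetical model of one row in the record layout grid."""
    name: str                            # required
    data_type: str                       # required
    index: Optional[int] = None          # user-defined or auto-populated
    business_description: str = ""       # optional
    technical_description: str = ""      # optional
    required: bool = False               # NOT NULL when True
    encrypted_at_source: bool = False
    key: bool = False                    # primary key
    validation_regex: str = ""           # empty by default
    foreign_key: bool = False


def violates_required(field: FieldSpec, value: Optional[str]) -> bool:
    """True when a NULL value arrives for a field marked Required."""
    return field.required and value is None


# "INTEGER" is an illustrative type name, not taken from the product.
customer_id = FieldSpec(name="customer_id", data_type="INTEGER",
                        index=1, required=True, key=True)
print(violates_required(customer_id, None))
```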

Record Layout Preview

Record layout preview allows field removal or additional fields to be added

Load data to the entity. Once the entity has been saved, it opens in the entities grid, with the option to load data available from the More dropdown.

Select Load from the More dropdown.

Load data to the entity
