Source connection wizard
Source connections are defined as independent top-level objects within the application and are associated with sources as a prerequisite of source configuration. Analysts can apply an existing source connection to new sources or create a new source connection that can be reused. This connection object model enables flexibility: entities inherit a default source connection from their source, but that connection can be switched (overwritten) at the entity level. Consider the following use case: a user may want to change an entity source connection from JDBC to Sqoop so that the source remains a JDBC source but the data transfers via Sqoop (a faster, more efficient data transport utility). Or a user may want to switch from a JDBC source to FILE and reroute the data directly from HDFS.
Note that QVD source connections show up in source connection lists as READ-ONLY connections; QVD source connections can only be defined and edited by administrators through QVD Import configuration in the administration module.
The wizard supports all connection types (JDBC, FILE, XML, MAINFRAME, SQOOP) and applicable Protocols (ADLS, KAFKA, FTP, SFTP, S3, HDFS, LOCALFILE, OPENCONNECTOR, WASB, WASBS). Source connections can be applied at entity level in bulk with the option to overwrite properties for individual entities.
Users will find five default source connections in a newly installed environment. From the Sources grid, select the Source Connections button to display the connections available for that environment.
| Source Connection name | Connection type | Communication protocol | Connection URL (example) | Description |
|---|---|---|---|---|
| PODIUM_METADATA_CONNECTION | JDBC | JDBC | jdbc:postgresql://localhost:5432/md_4.0 | Metadata repository connection. Use this connection to ingest, test, and explore metadata tables. |
| FILE_LOCALFILE_CONNECTION | FILE | LOCALFILE | file:/ | Default server connection for flat file ingest |
| MAINFRAME_LOCALFILE_CONNECTION | MAINFRAME | LOCALFILE | file:/ | Default server connection for Mainframe ingest |
| XML_LOCALFILE_CONNECTION | XML | LOCALFILE | file:/ | Default server connection for XML ingest |
| JSON_LOCALFILE_CONNECTION | FILE | LOCALFILE | file:/ | Default server connection for JSON ingest |
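The PODIUM_METADATA_CONNECTION URL above points at the PostgreSQL metadata repository. As an illustration only (not a product feature), that same database can be reached directly with any PostgreSQL client to confirm it is up; the credentials below are placeholders, and the host, port, and database name come from the example URL.

```python
# Illustrative sketch, not part of Qlik Catalog: reach the metadata repository
# described by jdbc:postgresql://localhost:5432/md_4.0 with a plain PostgreSQL client.
import psycopg2

conn = psycopg2.connect(
    host="localhost",             # from the example connection URL
    port=5432,
    dbname="md_4.0",
    user="metadata_user",         # placeholder credentials
    password="metadata_password",
)
cur = conn.cursor()
cur.execute("SELECT 1")           # trivial query: the repository is reachable
print(cur.fetchone())
cur.close()
conn.close()
```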
Add a new source connection
From the Sources screen, select Source Connections.
From the Source Connections page, select the New Source Connection button above the grid to add a new source connection, select Delete on a connection row to delete that connection, or select View/Edit on a connection row to modify an existing connection.
Source connection functionality can also be accessed and edited from the actions dropdown on individual entity rows (select More, then Source Connection).
1. Adding a source connection
After selecting the New Source Connection button, users must enter the following information to initiate a new connection. Connection Name, Description, and Groups access are common to all connection types upon creation.
Connection Name: User defined
Description: User defined
Groups: Associate group access by selecting Group names from dropdown
Connection Type: Options {FILE | JDBC | JSON | MAINFRAME | SQOOP | XML}
If the connection type is JDBC, a JDBC Source Type dropdown displays to the right. Registered JDBC databases for the application instance populate this dropdown.
If the connection type is FILE, JSON, XML, or MAINFRAME, a Protocol options dropdown displays: ADLS, FTP, HDFS, KAFKA, LOCALFILE, OPENCONNECTOR, S3, SFTP, WASB, and WASBS.
If the Connection Type is Sqoop, no dropdown appears, because Sqoop is also a transport protocol.
2. Connection type
This selection drives which connection-specific fields are required in Connection Information. Enter the required fields and select Test Connection.
Creation and View/Edit screens are identical, with the exception of the Create Connection | Update Connection actions in the lower right of the screen.
Connection type: JDBC
Once the connection type is selected, a dropdown for the database (MySQL, Oracle, Teradata, etc.) displays to the right, and the JDBC connection information fields appear below.
Connection Information fields
Connection String:
Example:
jdbc:oracle:thin:@<HOST>:<PORT>:<DATABASE_NAME>
Example:
jdbc:db2://172.24.0.20:50000/sample
User Name: Optional, user defined (can be changed upon edit of source connection)
Password: Optional, user defined (can be changed upon edit of source connection)
Select the Test Connection button (note that Sqoop does not provide a test connection feature); a standalone connectivity check is sketched below for comparison.
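As a minimal sketch of what Test Connection verifies, a JDBC connection string in the format shown above can also be sanity-checked outside the application with a generic JDBC bridge such as jaydebeapi. The driver class, driver jar path, host, port, SID, and credentials below are placeholders, not product defaults.

```python
# Standalone JDBC connectivity check (illustrative only, outside Qlik Catalog).
import jaydebeapi

conn = jaydebeapi.connect(
    "oracle.jdbc.OracleDriver",              # driver class for the chosen database
    "jdbc:oracle:thin:@db-host:1521:ORCL",   # same URL format as the example above
    ["db_user", "db_password"],              # optional user name / password
    "/path/to/ojdbc8.jar",                   # JDBC driver jar on the local machine
)
curs = conn.cursor()
curs.execute("SELECT 1 FROM dual")           # trivial query to confirm connectivity
print(curs.fetchall())
curs.close()
conn.close()
```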
The Properties tab, to the right of the Connection Information tab, provides the option to add (Add Property) additional key/value metadata upon source connection creation or update, if source-specific connection properties have been defined for that connection type. Once the source connection is created, several of these properties and values will be populated.
Be sure to select Create Connection when required fields have been configured.
After successful creation of source connection, additional properties and credentials populate and display: Connection Source Type, User Name, Password (encrypted), Connection String.
Connection Types
Connection type: File
Protocols: Options {ADLS | FTP | HDFS | KAFKA | LOCALFILE | OPENCONNECTOR | S3* | SFTP | WASB | WASBS}
Connection Information:
Connection String: file:/ or hdfs:// or s3a:// or adl:// (ADLS Gen1) or abfs[s]:// (ADLS Gen2), etc., depending on protocol
User Name: Optional, user defined (can be changed upon edit of source connection)
Password: Optional, user defined (can be changed upon edit of source connection)
*AWS S3 scheme: Scheme s3 should be used when Qlik Catalog is run in EMR (where it is maintained by AWS and is up-to-date). Scheme s3a should be used otherwise. Scheme s3n is obsolete and should not be used.
OPENCONNECTOR protocol is used only to land flat files into HDFS LoadingDock for ingest.
| Field | Value |
|---|---|
| Connection Type | FILE |
| Protocol | LOCALFILE |
| Connection Information | |
| Connection String | file:/ |
| User Name | (optional) |
| Password | (optional) |
Connection Type: File, Protocol: LOCALFILE (screenshot example)
| Field | Value |
|---|---|
| Connection Type | FILE |
| Protocol | ADLS |
| Connection Information | |
| Connection String | Syntax: abfss://<container>@<storage_account_name>.dfs.core.windows.net/ Example: abfss://abfstest@exadlsgen2test1.dfs.core.windows.net/ |
| User Name | <storage_account_name> |
| Password | access key |
ADLS Gen2 notes
- Note that the <storage_account_name> value used to build the connection string is the User Name entry.
- The storage account in MS Azure must be created with specific options. On the Advanced tab in storage account settings: in the Security section, Enable storage account key access must be checked; and in the Data Lake Storage Gen2 section, Enable hierarchical namespace must be checked.
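The connection string is assembled from exactly the two values called out in the notes above. The following sketch is illustrative only; it uses the example container and storage account names from the table to show that composition.

```python
# Illustrative helper: compose an ADLS Gen2 (ABFSS) connection string from the
# container name and the storage account name (which is also the User Name entry).
def adls_gen2_connection_string(container: str, storage_account: str) -> str:
    return f"abfss://{container}@{storage_account}.dfs.core.windows.net/"

# Using the example values from the table above:
print(adls_gen2_connection_string("abfstest", "exadlsgen2test1"))
# abfss://abfstest@exadlsgen2test1.dfs.core.windows.net/
```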
| Field | Value |
|---|---|
| Connection Type | FILE |
| Protocol | FTP |
| Connection Information | |
| Connection String | ftp://192.168.126.xx:xx |
| User Name | ftpuser |
| Password | xxxxxxx |
Connection Type: FILE, Protocol: FTP (screenshot example)
| Field | Value |
|---|---|
| Connection Type | FILE |
| Protocol | KAFKA |
| Connection Information | |
| Connection String | kafka://hostname.corp.namedata.com:9092 |
| User Name | (optional) |
| Password | (optional) |
Connection Type: FILE, Protocol: Kafka (screenshot example).
Kafka notes
- Qlik Catalog uses batch processing for Kafka loads, like all other load protocols. The application reads directly from Kafka during the receiving step; the loading dock step is skipped.
- Kafka cannot be used for new sources; it can only be applied for loads of source environments that are already in the application.
- Users can specify a Kafka host, topic, offset, and optional partition number. These properties apply at the source connection and entity level.
- Kafka is the only load type that never consumes a file in the entire load process. Data starts on Kafka and gets loaded into memory.
- Kafka is only supported for FILE, JSON, and XML source types (not mainframe or JDBC).
- Archiving is not supported for Kafka, even if enabled via the enable.archiving property. Kafka already resides on the cluster and serves as the record of original data.
Setup for Kafka ingest involves establishing the source connection, then establishing the entity and specifying the Kafka Topic Name in the Add Data wizard. The URI field or connection.string (below) is required and should be an established Kafka broker. Once the entity is created, Kafka entity-level properties (shown below) can be set or reset. Kafka properties are also detailed in Source > Source Properties in the webhelp. A minimal consumer sketch, for comparison, follows the properties table below.
Kafka source connection default properties populate upon entry of a qualified Kafka URI:
- kafka.max.total.records=1000 (default value). Maximum number of records to consume from Kafka in one load.
- kafka.poll.timeout.ms=10000 (default value). Timeout for one poll of the Kafka queue. Each poll tries to consume up to 100 records; several polls may be made per load.
- kafka.record.buffer.size=1000 (default value). Size, in records, of the buffer while pulling data from Kafka.
- kafka.start.offset=END (default value). Can be one of END, BEGINNING, or a long integer value. Starting position for the next Kafka load.
Kafka properties in Source Connection
| Property | Value |
|---|---|
| conn.user.password | #0(poirxxxxxxxxxxxxxxxx) |
| connection.string | kafka://hostname.corp.namedata.com:9092 |
| kafka.max.total.records | 1000 |
| kafka.poll.timeout.ms | 10000 |
| kafka.record.buffer.size | 1000 |
| kafka.start.offset | END |
Protocol: Kafka (properties screenshot example)
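For comparison only, the sketch below shows how the kafka.* connection properties above roughly correspond to consumer settings when the same broker is read directly with the open-source kafka-python client; it is not how Qlik Catalog implements the load. The topic name and partition number are placeholders.

```python
# Illustrative kafka-python consumer loosely mirroring the connection properties above.
from kafka import KafkaConsumer, TopicPartition

consumer = KafkaConsumer(
    bootstrap_servers="hostname.corp.namedata.com:9092",  # host:port from connection.string
    enable_auto_commit=False,
)

partition = TopicPartition("example_topic", 0)   # placeholder topic / optional partition number
consumer.assign([partition])
consumer.seek_to_end(partition)                  # comparable to kafka.start.offset=END

records = []
while len(records) < 1000:                       # cap comparable to kafka.max.total.records
    batch = consumer.poll(timeout_ms=10000)      # timeout comparable to kafka.poll.timeout.ms
    if not batch:
        break                                    # nothing more to read for this load
    for messages in batch.values():
        records.extend(messages)

consumer.close()
print(f"consumed {len(records)} records")
```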
Connection type: XML
Protocols: Options {ADLS | FTP | HDFS | KAFKA | LOCALFILE | S3 | SFTP | WASB | WASBS}
Connection Information:
Connection String: file:/ or hdfs:// or s3a:// or adl:// (ADLS Gen1) or abfs[s]:// (ADLS Gen2), etc., depending on protocol
User Name: Optional, user defined (can be changed upon edit of source connection)
Password: Optional, user defined (can be changed upon edit of source connection)
| Field | Value |
|---|---|
| Connection Type | XML |
| Protocol | HDFS |
| Connection Information | |
| Connection String | hdfs://hostname-nameservice:8020/ |
| User Name | (optional) |
| Password | (optional) |
Connection Type: XML, Protocol: HDFS (screenshot example)
Connection type: JSON
Protocols: Options {ADLS | FTP | HDFS | KAFKA | LOCALFILE | S3 | SFTP | WASB | WASBS}
Connection Information:
Connection String: file:/ or hdfs:// or s3a:// or adl:// (ADLS Gen1) or abfs[s]:// (ADLS Gen2), etc., depending on protocol
User Name: Optional, user defined (can be changed upon edit of source connection)
Password: Optional, user defined (can be changed upon edit of source connection)
| Field | Value |
|---|---|
| Connection Type | JSON |
| Protocol | SFTP |
| Connection Information | |
| Connection String | sftp://192.168.xxx.xx |
| User Name | ftpuser |
| Password | xxxxxxxxxx |
Connection type: JSON, Protocol: SFTP (screenshot example)
Connection type: MAINFRAME
Protocols: Options {ADLS | FTP | HDFS | LOCALFILE | OPENCONNECTOR | S3 | SFTP | WASB | WASBS}
Connection Information:
Connection String: For example: s3a://test.nnnvpn.com/
User Name: Optional, user defined (can be changed upon edit of source connection)
Password: Optional, user defined (can be changed upon edit of source connection)
| Field | Value |
|---|---|
| Connection Type | MAINFRAME |
| Protocol | S3 |
| Connection Information | |
| Connection String | s3a://test.nnnvpn.com |
| User Name | (optional) |
| Password | (optional) |
Connection type: Mainframe, Protocol: S3 (screenshot example)
Connection type: SQOOP
(No protocol—Sqoop is the transport utility)
Connection Information:
Connection String: jdbc:<db_type>://<HOST>:<PORT>;<databaseName>
User Name: Optional
Password: Optional
SQOOP Connection information (Note that Sqoop does not provide a Test Connection feature)
| Field | Value |
|---|---|
| Connection Type | SQOOP |
| Protocol | [SQOOP] |
| Connection Information | |
| Connection String | jdbc:oracle:thin:@xxx.xxx.xxx.xx.xxxx:XE |
| User Name | (optional) |
| Password | (optional) |
Connection type: Sqoop, Protocol: (Sqoop) (screenshot example)
SQOOP Connection Properties (see the command-line sketch after the example table below):
- conn.sqoop.mappers.count: User defined (default = '4')
- conn.sqoop.schema.name: Name of the source schema, manually defined by the user. Some databases will not require (or allow) a schema name, so this value will be null for some databases.
- conn.sqoop.table.name: Users can enter <$EntityName> if the source connection applies to multiple entities. A table name can be entered if the source connection will always be applied to the same table.
- conn.user.name: Entered/populated upon creation of the source connection
- conn.user.password: Entered/populated upon creation of the source connection
- connection.string: Location of the source table
Example values for Sqoop connection properties
| Property | Value |
|---|---|
| conn.sqoop.mappers.count | 4 |
| conn.sqoop.schema.name | Value |
| conn.sqoop.table.name | $EntityName |
| sqoop.enclosed-by | " |
| sqoop.null-non-string | " " |
| sqoop.null-string | " " |
Protocol: Sqoop (properties screenshot example)
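To make the meaning of these properties concrete, the sketch below shows how they roughly map onto a standalone `sqoop import` invocation outside the application. This is illustrative only; the connection string, user, and table name are placeholders, and Qlik Catalog builds its own Sqoop job from the properties.

```python
# Illustrative only: assemble an equivalent sqoop import command from the
# connection properties above (placeholder values; the command is printed, not run).
props = {
    "connection.string": "jdbc:oracle:thin:@db-host:1521:XE",
    "conn.user.name": "db_user",
    "conn.sqoop.table.name": "EMPLOYEES",        # or the table resolved from $EntityName
    "conn.sqoop.mappers.count": "4",
    "sqoop.enclosed-by": '"',
    "sqoop.null-string": "",
    "sqoop.null-non-string": "",
}

command = [
    "sqoop", "import",
    "--connect", props["connection.string"],
    "--username", props["conn.user.name"],
    "-P",                                        # prompt for the password rather than embed it
    "--table", props["conn.sqoop.table.name"],
    "--num-mappers", props["conn.sqoop.mappers.count"],
    "--enclosed-by", props["sqoop.enclosed-by"],
    "--null-string", props["sqoop.null-string"],
    "--null-non-string", props["sqoop.null-non-string"],
]
print(" ".join(command))
```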
Apply new ingest source connection to an entity
Users can configure source connections so that one connection interrogates the environment to establish the technical metadata, and a second Ingest Source Connection is then applied to entities for data ingest. The following example details a case where a source is discovered and an entity created by browsing Hive via a JDBC source connection to establish the metadata, and an ingest connection then routes data directly from HDFS.
In this example, an entity has been created by staging through the JDBC source creation wizard after applying a source connection where Connection Type is JDBC and Source Type is Hive (the Hive connection value is found in About: Settings: DISTRIBUTION: distribution.uri). This process imports the technical metadata.
| Field | Value |
|---|---|
| Connection Type | JDBC |
| Protocol | HIVE |
| Connection Information | |
| Connection String | jdbc:hive2://hostname-cloudera.corp.domaindata.com:10000/default;principal=hive/hostname-cloudera.corp.domaindata.com@HOSTNAME.REALM |
| User Name | myname |
| Password | xxxxxx |
Connection Type: JDBC, Protocol: Hive (screenshot example)
Using this source connection, a user can browse with the JDBC wizard, then search for and create an entity under that source.
BEFORE ingesting data to that new entity, apply the new source connection. For this example, a new source connection is created with Source Type: File and Communication Protocol: HDFS for ingest from HDFS.
The HDFS connection string is found in About: Settings: LOADING DOCK: loadingdock.uri.
This connection will access the HDFS directory. (Note that the entity-level src.file.glob property will have to be updated.)
Select the entity in the source, select General Information, and at the bottom of that modal select Use Different Source Connection for Ingest.
Apply the new Source Connection (example name: 'FILE_HDFS') for ingest and Save.
Entity information will then show the Source Connection as JDBC_Hive and the Ingest Source Connection as FILE_HDFS.
Update and save the entity property src.file.glob filepath value, for example: /temp/qdc/nemo/receiving/employees/departments_profile/20200409145757/good/good-m-00000
Data can now be loaded to the entity from the specified path. A quick way to confirm the path before loading is sketched below.
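As a hedged way to confirm the path outside the application, an HDFS client such as pyarrow can check that the file referenced by src.file.glob exists; the namenode host and port are placeholders taken from the earlier HDFS example, and a local Hadoop client installation is assumed.

```python
# Illustrative check that the src.file.glob target exists in HDFS (requires libhdfs /
# a local Hadoop client); host, port and path below come from the examples above.
from pyarrow import fs

hdfs = fs.HadoopFileSystem("hostname-nameservice", port=8020)
info = hdfs.get_file_info(
    "/temp/qdc/nemo/receiving/employees/departments_profile/20200409145757/good/good-m-00000"
)
print(info.type)   # FileType.File when the ingest file is present
```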
Apply new Source Connection to multiple entities
Qlik Catalog allows the application of a source connection to one or more entities. Multi-select entities and then select the Source Connection button from the entities grid.
Managing Entities with new Source Connections: Properties update
When a new source connection is applied to an entity, mandatory property updates may be required. The following table summarizes the property modifications required for each scenario.
Many of these changes occur automatically; this table provides information about those changes.
| Properties affected/added | Required action/guidelines |
|---|---|
| src.file.glob | Manually add filepath |
| header.validation.pattern | Should be blank (unless header.validation.pattern.is.regex is set to True) |
| header.validation.pattern.is.regex | False (default) |
| record.open.quotes | <blank> (default). Default is blank but depends on the datafile (in src.file.glob); if the datafile contains records in open/close quotes, the user must specify them for this property. |
| record.close.quotes | <blank> (default). Default is blank but depends on the datafile (in src.file.glob); if the datafile contains records in open/close quotes, the user must specify them for this property. |
| default.field.embedded.enclosure.scheme | NONE |
| header.line.count | 0 (default). Tip: if the file loads with one ugly record, it might be because the file has a header, in which case try setting this value to 1. |
| record.record.terminator | ANY_NEWLINE |
| src.file.glob (OPENCONNECTOR protocol only) | <script> (manually add) |
| src.file.glob (KAFKA protocol only) | User must manually enter the topic name (delete any pre-existing value/query) |
| entity.custom.script.args (OPENCONNECTOR protocol only) | Manually add this property at the entity level. This is a required property for OPENCONNECTOR protocol when using DistCp or SCP/SSH scripts. Example (SCP): /root/custom/podium/put_file_hdfs.sh %prop.sfile %prop.starget %loadingDockLocation |
| Script Properties/Values (OPENCONNECTOR protocol only), examples: p1, starget | OPENCONNECTOR script arguments. Example (SCP): [(sfile, '/datasets/m1_receipt/*jan*'), (starget, '/staging/m_1receipt.jan')] |