OPENCONNECTOR usage and tips

Qlik Catalog supports the use of OPENCONNECTOR scripts to land flat files into HDFS or the File System LoadingDock. This is especially useful for using native Hadoop connectors to an RDBMS through scripting mechanisms such as DistCp or Secure Copy (SCP). SQOOP transports are supported through Source Connection creation in Source.

  • DistCp transfers data files that already reside on a cluster to the File System (LoadingDock) on the same or a different cluster.

  • SCP is a protocol based on Secure Shell (SSH) used for securely transferring files between a local host and remote host or between two remote hosts.

  • Scripts require the creation of properties. See Creating a property through API call: PUT /propDef/v1/save for details on creating these properties. Note the details for example payloads 3 and 4.

Example: DistCp script:

/usr/local/podium/misc/usedistcp.sh %prop.p1 %loadingDockLocation
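
As an illustration of what such a script might contain, the following is a minimal sketch of a DistCp wrapper. The script body, the function name, and the DRY_RUN variable are assumptions made for illustration and are not the contents of any shipped usedistcp.sh. Only the argument order (the value of %prop.p1 first, then %loadingDockLocation) follows the example above.

```shell
#!/usr/bin/env bash
# Hypothetical DistCp wrapper (illustrative; not the shipped usedistcp.sh).
# $1 = source path on the cluster (value of %prop.p1)
# $2 = LoadingDock path generated by the application (%loadingDockLocation)

distcp_to_loadingdock() {
  local src="$1" dest="$2"
  if [ -z "$src" ] || [ -z "$dest" ]; then
    echo "usage: usedistcp.sh <source-path> <loadingDockLocation>" >&2
    return 2
  fi
  # DRY_RUN=1 (an assumption of this sketch) prints the command instead of running it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "hadoop distcp $src $dest"
  else
    hadoop distcp "$src" "$dest"
  fi
}

# Run only when arguments are supplied, for example by the OPENCONNECTOR call.
if [ "$#" -gt 0 ]; then
  distcp_to_loadingdock "$@"
fi
```

Setting DRY_RUN=1 before a manual run is a convenient way to confirm which paths the script received before any data is copied.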

Example: Secure Copy (SCP/SSH) script:

/root/custom/podium/put_file_hdfs.sh %prop.sfile %prop.starget %loadingDockLocation
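
A script of this kind could look roughly like the sketch below. It assumes that %prop.sfile holds a remote file specification (such as user@host:/path/file.csv) and that %prop.starget holds a local staging directory. Those meanings, the function names, and the DRY_RUN variable are assumptions for illustration only, since the actual contents of put_file_hdfs.sh are not shown here.

```shell
#!/usr/bin/env bash
# Hypothetical SCP-then-put wrapper (illustrative; not the shipped put_file_hdfs.sh).
# $1 = remote file, e.g. user@host:/path/file.csv (value of %prop.sfile; assumed meaning)
# $2 = local staging directory (value of %prop.starget; assumed meaning)
# $3 = LoadingDock path generated by the application (%loadingDockLocation)

# Print the command when DRY_RUN=1, otherwise execute it.
# DRY_RUN is an assumption of this sketch.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

scp_to_loadingdock() {
  if [ "$#" -ne 3 ]; then
    echo "usage: put_file_hdfs.sh <remote-file> <staging-dir> <loadingDockLocation>" >&2
    return 2
  fi
  local sfile="$1" starget="$2" dest="$3"
  # File name = text after the last ':' or '/' of the remote specification.
  local name="${sfile##*[:/]}"
  # Step 1: copy the remote file to the local staging directory.
  run scp "$sfile" "$starget/$name" || return 1
  # Step 2: put the staged file into the LoadingDock location.
  run hdfs dfs -put "$starget/$name" "$dest"
}

# Run only when arguments are supplied, for example by the OPENCONNECTOR call.
if [ "$#" -gt 0 ]; then
  scp_to_loadingdock "$@"
fi
```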

 

OPENCONNECTOR attributes and properties
Attributes Properties
Source Level

Source Type: FILE (will always be file)

Communication Protocol: OPENCONNECTOR

Entity Level

entity.custom.script.args

This property holds the path to the script plus any arguments, such as the username, password, and other parameters. It can be set after the data source objects are created through the wizard as an OPENCONNECTOR. If you define a source and entity through JDBC, for example, enter this property manually after the source and entity are created (%attr.source, %attr.entity).

Definitions

Core property: entity.custom.script.args

OPENCONNECTOR script (example): /usr/local/podium/bin/usedistcptest.sh %prop.p1 %loadingDockLocation

%loadingDockLocation – Required argument that every OPENCONNECTOR script must take. It is the PATH that the application creates in LoadingDock, and its value is generated automatically by the application. For example, this script using DistCp copies the data file into this location.

%loadingDockUri – Optional argument for an OPENCONNECTOR script that provides a fully qualified path or initiates a script launch to a destination on S3. The argument is also used in QVD Import to provide a full URI mount point from which to launch the OPENCONNECTOR script. It can also provide a temporary folder when the load appends data in S3 (the application returns an error if it detects that the destination folder already exists). In this scenario, the temporary root directory must be in the same bucket as the target directory. For example: s3a://example-bucket/temporary-rootdir or s3a://example-bucket/target-directory/temporary-rootdir.

Example: %loadingDockUri argument:

entity.custom.script.args=/usr/local/qdc/migrationdir/put_file_fs.sh --temporary-rootdir s3a://dev-landing/temporary-rootdir --target-dir s3a://dev-landing/datacatalyst/loadingdock/DW_PR/INCREMENT/20190722124834/
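
Because the temporary root directory and the target directory must share a bucket, a script may want to verify this before copying anything. The following is a minimal sketch of such a check. The helper names s3_bucket and same_bucket are hypothetical and are not part of Qlik Catalog.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (illustrative; not part of Qlik Catalog).

# Print the bucket name of a URI such as s3a://bucket/path.
s3_bucket() {
  local rest="${1#*://}"         # drop the scheme, e.g. "s3a://"
  printf '%s\n' "${rest%%/*}"    # keep the text before the first slash
}

# Succeed only when both URIs point at the same bucket.
same_bucket() {
  [ "$(s3_bucket "$1")" = "$(s3_bucket "$2")" ]
}
```

A script could then stop early when the check fails, for example: same_bucket "$tmp_root" "$target_dir" || exit 1, where tmp_root and target_dir are whatever variables the script uses for the two URIs.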

%prop.p1 – First argument the script will take

%attr.source.username – Refers to an attribute of the source. For example, the attribute following %attr.source can be username.

%attr.source.name

%attr.entity.name

%attr.source.username

%attr.entity.username

/usr/local/podium/putfilehdfs.sh %prop.sfile %loadingDockLocation

/usr/local/podium/putfilehdfs.sh – (example) script to run

%prop.sfile – First argument the script will take. The %prop prefix tells Qlik Catalog to use the value set in the property named sfile.

sfile can be any value the user defines. In this example, it specifies the input path.

%loadingDockLocation – Required argument that every custom script must take. It is the PATH that the application creates in LoadingDock. For example, a script using DistCp copies the data file into this location.
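
To make the breakdown above concrete, the sketch below mimics how the placeholders in this example resolve into the final command line. It is only an illustration of the end result and is not how Qlik Catalog performs the substitution internally. The function name expand_script_args and the sample values are assumptions.

```shell
#!/usr/bin/env bash
# Illustration only: shows the command line that results once the placeholders
# are replaced. Not Qlik Catalog's actual substitution logic.
# $1 = template, $2 = value of property sfile, $3 = generated LoadingDock path.
# Values must not contain '|' or '&', which are special in the sed expressions.
expand_script_args() {
  printf '%s\n' "$1" |
    sed -e "s|%prop\.sfile|$2|g" \
        -e "s|%loadingDockLocation|$3|g"
}

# Example call with made-up values for the property and the generated path.
expand_script_args \
  '/usr/local/podium/putfilehdfs.sh %prop.sfile %loadingDockLocation' \
  '/data/in/orders.csv' \
  '/loadingdock/src/ent/20190722'
```

With these sample values, the script receives /data/in/orders.csv as its first argument and /loadingdock/src/ent/20190722 as its second.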

Information note

Properties should start with either %prop or %attr.

%prop should be followed by the name of a property of either the source or the entity. For example, %prop.username should return the connection username.

%attr should be followed by "source" or "entity", then the desired attribute. There are two such attributes: name and username. Name is the name of either the source or the entity, while username is just a different way to access the connection username. Example: %attr.entity.name is the name of the entity.

 
