Component-specific settings for tHBaseInput

The following table describes the Job script functions and parameters that you can define in the setSettings {} function of the component.

Function/parameter Description Mandatory?

USE_EXISTING_CONNECTION

Set this parameter to true and specify the name of the relevant connection component using the CONNECTION parameter to reuse the connection details you already defined.

No
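As an illustration of connection reuse, the fragment below is a minimal sketch. The component name tHBaseConnection_1 is hypothetical, and the quoting of values and the `//` annotation lines are assumptions about Job script syntax that this page does not confirm; remove the annotations if your editor rejects them.

```
setSettings {
	// Assumption: tHBaseConnection_1 is a connection component already defined in the same Job.
	USE_EXISTING_CONNECTION: "true",
	CONNECTION: "tHBaseConnection_1"
}
```

With this setting, the connection details come from the referenced connection component and are not repeated in this component.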

DISTRIBUTION

Specify a cluster distribution. Acceptable values:

  • APACHE
  • CLOUDERA
  • HORTONWORKS
  • MAPR
  • PIVOTAL_HD
  • CUSTOM

If you do not provide this parameter, the default cluster distribution is Amazon EMR.

No

HBASE_VERSION

Specify the version of the Hadoop distribution you are using. Acceptable values include:

  • For Amazon EMR:

    • EMR_5_5_0
    • EMR_5_0_0
    • EMR_4_6_0
  • For Apache:

    • APACHE_1_0_0
  • For Cloudera:

    • Cloudera_CDH5_10
    • Cloudera_CDH5_8
    • Cloudera_CDH5_7
    • Cloudera_CDH5_6
    • Cloudera_CDH5_5
  • For HortonWorks:

    • HDP_2_6
    • HDP_2_5
    • HDP_2_4
  • For MapR:

    • MAPR520
    • MAPR510
    • MAPR500
  • For Pivotal HD:

    • PIVOTAL_HD_2_0
    • PIVOTAL_HD_1_0_1

The default value is EMR_5_5_0.

No
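To show how DISTRIBUTION and HBASE_VERSION pair up, here is a minimal sketch for a Cloudera cluster. The two values are taken from the lists above; the quoting style and the `//` annotation are assumptions about Job script syntax.

```
setSettings {
	// Assumption: list values are written as plain quoted strings.
	DISTRIBUTION: "CLOUDERA",
	HBASE_VERSION: "Cloudera_CDH5_10"
}
```

If both parameters are omitted, the defaults described above apply (Amazon EMR with EMR_5_5_0).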

HADOOP_CUSTOM_VERSION

If you are using a custom cluster, use this parameter to specify the Hadoop version of that custom cluster, which is either HADOOP_1 (default) or HADOOP_2.

No
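For a custom cluster, the selection could look like the following sketch. Both values come from the descriptions above; the quoting style and the `//` annotation are assumptions.

```
setSettings {
	// HADOOP_2 overrides the default HADOOP_1 for the custom cluster.
	DISTRIBUTION: "CUSTOM",
	HADOOP_CUSTOM_VERSION: "HADOOP_2"
}
```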

ZOOKEEPER_QUORUM

Specify the name or the URL of the ZooKeeper service that coordinates transactions between your Studio and your database.

Note that when you configure ZooKeeper, you may need to explicitly define the path to the root znode that contains all the znodes created and used by your database. To do so, use the SET_ZNODE_PARENT and ZNODE_PARENT parameters.

Yes

ZOOKEEPER_CLIENT_PORT

Specify the number of the client listening port of the ZooKeeper service you are using.

Yes

SET_ZNODE_PARENT

When needed, set this parameter to true and specify the path to the root znode using the ZNODE_PARENT parameter.

No
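The three ZooKeeper-related settings can be sketched together as follows. The host name zk1.example.com and the znode path /hbase-unsecure are hypothetical; 2181 is the port ZooKeeper listens on by default. The escaped inner quotes assume that string values are written as quoted Java-style strings, and the `//` lines are reader annotations; neither is confirmed by this page.

```
setSettings {
	// Hypothetical host; replace with your own ZooKeeper quorum.
	ZOOKEEPER_QUORUM: "\"zk1.example.com\"",
	ZOOKEEPER_CLIENT_PORT: "\"2181\"",
	// Only needed when the root znode differs from the one your client expects.
	SET_ZNODE_PARENT: "true",
	ZNODE_PARENT: "\"/hbase-unsecure\""
}
```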

USE_KRB

If the database to be used is running with Kerberos security, set this parameter to true and then specify the principal names using the HBASE_MASTER_PRINCIPAL and HBASE_REGIONSERVER_PRINCIPAL parameters.

No

USE_KEYTAB

If you need to use a Kerberos keytab file to log in, set this parameter to true and specify the principal using the PRINCIPAL parameter and the access path to the keytab file using the KEYTAB_PATH parameter.

No
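A Kerberos setup with keytab login might be sketched as below. All principal names, the realm EXAMPLE.COM, and the keytab path are hypothetical placeholders. The region server parameter is spelled HBASE_REGIONSERVER_PRINCIPAL here by analogy with HBASE_MASTER_PRINCIPAL, which is an assumption, as are the value quoting and the `//` annotations.

```
setSettings {
	USE_KRB: "true",
	// Hypothetical principals; use the ones defined for your cluster.
	HBASE_MASTER_PRINCIPAL: "\"hbase/_HOST@EXAMPLE.COM\"",
	HBASE_REGIONSERVER_PRINCIPAL: "\"hbase/_HOST@EXAMPLE.COM\"",
	// Keytab login: the principal and the keytab file must match.
	USE_KEYTAB: "true",
	PRINCIPAL: "\"etl_user@EXAMPLE.COM\"",
	KEYTAB_PATH: "\"/path/to/etl_user.keytab\""
}
```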

USE_MAPRTICKET

If this cluster is a MapR cluster of version 4.0.1 or later, you may need to configure MapR ticket authentication. To do so, set this parameter to true and provide the relevant information using the MAPRTICKET_CLUSTER, MAPRTICKET_DURATION, USERNAME, and MAPRTICKET_PASSWORD parameters. For more information, see the section about connecting to a security-enabled MapR cluster in MapR.

No

TABLE

Specify the name of the table from which you need to extract columns.

Yes

SET_TABLE_NS_MAPPING

If needed, set this parameter to true and use the TABLE_NS_MAPPING parameter to provide the string to be used to construct the mapping between an Apache HBase table and a MapR table.

No

DEFINE_ROW_SELECTION

Set this parameter to true and then use the START_ROW and END_ROW parameters to provide the corresponding row keys to specify the range of the rows you want the current component to extract.

No
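Restricting a read to a row-key range could look like the sketch below. The table name customers and the row keys are hypothetical, and the value quoting and `//` annotations are assumptions about Job script syntax.

```
setSettings {
	TABLE: "\"customers\"",
	DEFINE_ROW_SELECTION: "true",
	// Hypothetical row keys delimiting the range to extract.
	START_ROW: "\"row_0100\"",
	END_ROW: "\"row_0200\""
}
```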

IS_BY_FILTER

Set this parameter to true to use filters to perform fine-grained data selection from your database, and then use the LOGICAL_OP parameter to define the logical relation between filters. Acceptable values are:

  • MUST_PASS_ONE: at least one of the defined filtering conditions must be satisfied.
  • MUST_PASS_ALL: every defined filtering condition must be satisfied.

No

FILTER {}

Use this function and one or more sets of the following parameters to define one or more filters:

  • FILTER_TYPE: enter the type of filter you need to use. Acceptable values:
    • SingleColumnValueFilter
    • FamilyFilter
    • QualifierFilter
    • ColumnPrefixFilter
    • MultipleColumnPrefixFilter
    • ColumnRangeFilter
    • RowFilter
    • ValueFilter
  • FILTER_COLUMN: enter the column qualifier on which you need to apply the active filter.
  • FILTER_FAMILY: enter the column family on which you need to apply the active filter.
  • FILTER_OPERATOR: enter the operation to be used for the active filter. Acceptable values:
    • NO_OP (default)
    • EQUAL
    • NOT_EQUAL
    • GREATER
    • GREATER_OR_EQUAL
    • LESS
    • LESS_OR_EQUAL
  • FILTER_VALUE: enter the value on which you want to use the specified operator.
  • FILTER_COMPARATOR_TYPE: specify the type of the comparator to be combined with the filter you are using. Acceptable values:
    • BinaryComparator
    • RegexStringComparator
    • SubstringComparator

No
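The filter parameters above can be combined as in the following sketch, which defines two filters that must both pass. The column family info, the qualifiers age and city, and the filter values are hypothetical. Writing the two parameter sets one after the other inside a single FILTER {} block, the value quoting, and the `//` annotations are all assumptions that this page does not spell out.

```
setSettings {
	IS_BY_FILTER: "true",
	LOGICAL_OP: "MUST_PASS_ALL",
	FILTER {
		// First set: keep rows where info:age is greater than or equal to 18.
		FILTER_TYPE: "SingleColumnValueFilter",
		FILTER_COLUMN: "\"age\"",
		FILTER_FAMILY: "\"info\"",
		FILTER_OPERATOR: "GREATER_OR_EQUAL",
		FILTER_VALUE: "\"18\"",
		FILTER_COMPARATOR_TYPE: "BinaryComparator",
		// Second set: keep rows where info:city contains the substring "York".
		FILTER_TYPE: "SingleColumnValueFilter",
		FILTER_COLUMN: "\"city\"",
		FILTER_FAMILY: "\"info\"",
		FILTER_OPERATOR: "EQUAL",
		FILTER_VALUE: "\"York\"",
		FILTER_COMPARATOR_TYPE: "SubstringComparator"
	}
}
```

Switching LOGICAL_OP to MUST_PASS_ONE would keep rows that satisfy either condition. Note that BinaryComparator compares raw bytes, so a numeric comparison on string-encoded values follows byte order and not numeric order.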

SET_MAPR_HOME_DIR

If the MapR configuration files have been moved elsewhere in the cluster, that is, the MapR Home directory has been changed, set this parameter to true and use the MAPR_HOME_DIR parameter to provide the new home directory.

No

SET_HADOOP_LOGIN

If the login module to be used in the mapr.login.conf file has been changed, set this parameter to true and use the HADOOP_LOGIN parameter to provide the module to be called from the mapr.login.conf file.

No

TSTATCATCHER_STATS

Set this parameter to true to gather the processing metadata at the Job level as well as at each component level.

By default, this parameter is set to false.

No

LABEL

Use this parameter to specify a text label for the component.

No
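Putting the mandatory parameters together, a minimal setSettings {} block might look like the sketch below. The host name, table name, and label text are hypothetical; the value quoting and `//` annotations are assumptions about Job script syntax.

```
setSettings {
	// The three parameters marked as mandatory in the table above.
	ZOOKEEPER_QUORUM: "\"zk1.example.com\"",
	ZOOKEEPER_CLIENT_PORT: "\"2181\"",
	TABLE: "\"customers\"",
	// Optional: collect processing metadata and label the component.
	TSTATCATCHER_STATS: "true",
	LABEL: "read_customers"
}
```

Every other parameter falls back to its default, so this block reads from the default Amazon EMR distribution unless DISTRIBUTION is set.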
