Prerequisites

Before you use a Hadoop cluster as a data source in Qlik Replicate, make sure that the following prerequisites have been met:

  • General:

    • The Hadoop WebHDFS service must be accessible from the Qlik Replicate machine.
    • The Hadoop Data Nodes must be accessible from the Qlik Replicate machine.
    • The Hadoop WebHDFS service must be running.
    • To access Hive using WebHCat, the Hadoop WebHCat service must be running. Other methods for accessing Hive are described later in this chapter.
    • The user specified in the Qlik Replicate Hadoop source settings must have access to HiveServer2.
  • SSL: To use SSL, first perform the following tasks:

    • Configure each NameNode and each DataNode with an SSL certificate (issued by the same CA).
    • Place the CA certificate on the Replicate Server machine. The certificate should be a base64-encoded PEM (OpenSSL) file.
  • Permissions: The user specified in the Hadoop source settings must have read permission for the HDFS directories that contain the data files.
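
Several of these prerequisites can be spot-checked from the Qlik Replicate machine before you configure the endpoint. The following Python script is a minimal sketch and is not part of Qlik Replicate. It assumes that the cluster uses simple (non-Kerberos) authentication, so the user is passed in the WebHDFS user.name parameter. The function names, host name, port, path, user name, and certificate location are hypothetical placeholders. The script requests a directory listing through the WebHDFS REST API (op=LISTSTATUS); a successful listing suggests that WebHDFS is running and reachable and that the user can read the directory.

```python
import json
import ssl
import urllib.error
import urllib.parse
import urllib.request


def webhdfs_url(host, port, path, user, op="LISTSTATUS", secure=False):
    """Build a WebHDFS REST URL for an HDFS path.

    Assumes simple authentication, where the user is sent as 'user.name'.
    """
    scheme = "https" if secure else "http"
    quoted_path = urllib.parse.quote(path)  # "/" separators are left intact
    query = urllib.parse.urlencode({"op": op, "user.name": user})
    return f"{scheme}://{host}:{port}/webhdfs/v1{quoted_path}?{query}"


def can_list_directory(url, ca_file=None, timeout=10):
    """Return True if WebHDFS answers the request with a directory listing.

    ca_file is the path to the PEM-encoded CA certificate (SSL only).
    Any connection, HTTP, certificate, or parsing error yields False.
    """
    context = ssl.create_default_context(cafile=ca_file) if ca_file else None
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            body = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return False
    # A successful LISTSTATUS response is a JSON object keyed by "FileStatuses".
    return isinstance(body, dict) and "FileStatuses" in body
```

For example, can_list_directory(webhdfs_url("namenode.example.com", 9870, "/data/sales", "replicate")) would check an unencrypted cluster. For SSL, pass secure=True to webhdfs_url together with the HTTPS port of your NameNode, and pass the location of the CA certificate as ca_file. This check only contacts the NameNode; reading file contents (op=OPEN) is redirected to a DataNode, which is why the DataNodes must also be reachable.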
