Adding a group
To add a group, first choose a group domain type (such as local, Active Directory, or Qlik_Sense) from the domain dropdown, then select Add Group. This opens the group administration panel.
Note that Active Directory groups can only be imported into Qlik Data Catalyst (not added through the UI).
STEP 1: Add a New Group, Select Sources
- Enter the new Group Name
- Select Sources (available sources and datasets are listed) or check Select All.
STEP 2: Select and add entities. All available entities for the selected sources and datasets are visible. Select entities individually or choose Select All.
STEP 3: Select and add existing groups to the new group. This step provides the ability to assign personnel to projects by functional groups or on a per project basis.
As groups are added, they appear as sub-groups.
STEP 4: Associate users and define roles
- Select users to be associated with the new group.
- Define each user's access level for the group. Select User and Select Role with the appropriate access level.
STEP 5: Add Source Connections
- Select all source connections that will be associated with and accessible to the group. The left-hand list shows all source connections in the account, grouped by source connection type (File, JDBC, mainframe, Sqoop, XML). As the admin user selects source connections, the right-hand list populates.
- Save the group.
Administrator settings
Controlling permissions for files and directories under Qlik Data Catalyst control
Permissions and ownerships for the files and directories that Qlik Data Catalyst creates under the base directory can optionally be set explicitly. The corresponding properties (listed below) can be set at either source or entity level. Entities inherit from their source by default, but the values can be overridden at entity level.
posix.directory.* will be applied to all directories starting at: ./receiving/<source>/<entity>/<load-date>
posix.file.* will be applied to all files starting at: ./receiving/<source>/<entity>/<load-date>
Note that for all POSIX permissions and ownerships to work:
- On multi-node cluster environments the user running the job should have the permissions to change ownership of the directory.
- On single node (single server) environments, the application should be running as root, because no other user is allowed to "chown" the directories to change the ownership.
Note that when changing file system privileges, all data files should remain readable by the PostgreSQL service's OS user, postgres; otherwise no user will be able to read distribution tables.
File system permissions and ownerships look like this:
In the above examples, drwxr-xr-x are the permissions, podium is the user and supergroup is the group. All of these settings can be applied to a whole directory or individual files.
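For reference, a permission string such as drwxr-xr-x follows standard POSIX symbolic notation: one file-type character followed by three rwx triplets (owner, group, other). The minimal Python sketch below converts such a string to its octal mode, which may help when comparing values with chmod-style settings. The helper name symbolic_to_octal is hypothetical and is not part of Qlik Data Catalyst.

```python
import stat

def symbolic_to_octal(symbolic: str) -> int:
    """Convert a 10-character string such as 'drwxr-xr-x' to its permission bits."""
    if len(symbolic) != 10:
        raise ValueError("expected 10 characters, e.g. 'drwxr-xr-x'")
    bits = 0
    # Skip the first character (file type); the remaining nine are rwx triplets.
    for i, ch in enumerate(symbolic[1:]):
        expected = "rwx"[i % 3]
        if ch == expected:
            bits |= 1 << (8 - i)
        elif ch != "-":
            raise ValueError(f"unexpected character {ch!r} at position {i + 1}")
    return bits

mode = symbolic_to_octal("drwxr-xr-x")
print(oct(mode))                            # 0o755
print(stat.filemode(stat.S_IFDIR | mode))   # drwxr-xr-x
```

The round trip through stat.filemode is only a sanity check that the conversion matches the conventional notation.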
The corresponding settings in Qlik Data Catalyst applicable at entity and source-levels are:
posix.directory.user.identifier: User identifier for receiving/source/entity/partition directory, set for directories during ingestion of entity data. (e.g., podium)
posix.directory.group.identifier: Group identifier for receiving/source/entity/partition directory, set for directories during ingestion of entity data. (e.g., supergroup)
posix.directory.permissions: Directory permissions for receiving/source/entity/partition directory. (e.g., drwxr-xr-x)
posix.file.user.identifier: User identifier for all files in receiving/source/entity/partition directory. (e.g., podium)
posix.file.group.identifier: Group identifier for all files in receiving/source/entity/partition directory. (e.g., supergroup)
posix.file.permissions: File permissions for all files in receiving/source/entity/partition directory. (e.g., drwxr-xr-x)
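For illustration, the six properties could be collected as key/value pairs as in the sketch below. The values are the examples given in the property descriptions. The validate_posix_properties helper is a hypothetical sanity check written for this sketch, not a Qlik Data Catalyst API, and how the properties are actually entered at source or entity level is not shown here.

```python
import re

# Example values taken from the property descriptions; adjust to your environment.
posix_properties = {
    "posix.directory.user.identifier": "podium",
    "posix.directory.group.identifier": "supergroup",
    # Name assumed to mirror posix.file.permissions.
    "posix.directory.permissions": "drwxr-xr-x",
    "posix.file.user.identifier": "podium",
    "posix.file.group.identifier": "supergroup",
    "posix.file.permissions": "drwxr-xr-x",
}

# Hypothetical check: a type character followed by three rwx triplets.
PERMISSION_PATTERN = re.compile(r"^[d-]([r-][w-][x-]){3}$")

def validate_posix_properties(props: dict) -> list:
    """Return a list of problems found; an empty list means the values look plausible."""
    problems = []
    for key, value in props.items():
        if not key.startswith(("posix.directory.", "posix.file.")):
            problems.append(f"unexpected property name: {key}")
        elif key.endswith(".permissions") and not PERMISSION_PATTERN.match(value):
            problems.append(f"{key}: malformed permission string {value!r}")
        elif not value:
            problems.append(f"{key}: empty value")
    return problems

print(validate_posix_properties(posix_properties))  # []
```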
These properties can be applied even when impersonation is disabled. However, in multi-node environments, the podium service user must have HDFS admin rights in order to change user and group ownerships. Group identifiers must exist as UNIX groups on the Hadoop NameNode. In single node environments, user and group ownerships can only be changed if Qlik Data Catalyst is running as the root user. Running Qlik Data Catalyst as root is not required when changing only the permissions (e.g., drwxr-xr-x).
It is not mandatory to set these properties in either single node or multi-node environments. However, in multi-node Hadoop environments, when impersonation is enabled, the file and directory ownerships are set automatically to the logged-in user. On single node environments, this does not happen, and the file and directory ownerships are set to the Qlik Data Catalyst service user ('podium' or 'QDC'). Therefore, on single node environments, if ownership must be changed, it has to be done explicitly by setting these properties on every source or entity and running Qlik Data Catalyst as the root user. On single node environments, the PostgreSQL service user postgres must always have read permissions on all files; otherwise the distribution tables will not work.
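Because unreadable files break distribution tables on single node environments, a quick audit can be useful after changing permissions. The sketch below is an illustration under stated assumptions: it walks a local directory and reports files that lack the "other" read bit. This is only a conservative proxy for "readable by postgres", since it ignores file ownership, group membership, ACLs, and execute bits on parent directories. The function name and the directory passed to it are hypothetical.

```python
import os
import stat

def files_not_world_readable(root: str) -> list:
    """Return the paths under root whose mode lacks the 'other' read bit."""
    flagged = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mode = os.stat(path).st_mode
            if not mode & stat.S_IROTH:
                flagged.append(path)
    return sorted(flagged)
```

Calling files_not_world_readable on the receiving directory of a source would return the files worth inspecting; an empty list means every file is at least world-readable.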
Set extended ACL permissions for directories and files upon ingestion
To provide extended support for HDFS ACL permissions, Qlik Data Catalyst supports the following properties:
- posix.directory.acl
- posix.file.acl
These properties allow one to set extended ACL permissions for directories and files (respectively) under the Qlik Data Catalyst root directory in HDFS.
Setting posix.directory.acl is the same as running the following Hadoop command:
Setting posix.file.acl is the same as running the following Hadoop command:
The syntax of the posix.(directory|file).acl settings is the same as that required by the Hadoop -setfacl command.
Documentation for the exact syntax, plus a broader discussion of ACLs, can be found in the Apache Hadoop documentation (the HDFS Permissions Guide and the File System Shell guide).
Example:
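The following is an illustrative sketch rather than the product's own example. It assumes an ACL specification in the standard setfacl form, a comma-separated list of [default:]type:name:permissions entries, and builds the hdfs dfs -setfacl command line that such a value would roughly correspond to. The user name analyst, the path, the helper names, and the choice of -m (modify) are assumptions made for illustration; the exact flags Qlik Data Catalyst uses are not stated here.

```python
import re
import shlex

# One ACL entry: optional "default:" prefix, a type, an optional name, and rwx permissions.
ACL_ENTRY = re.compile(r"^(default:)?(user|group|mask|other):([^:,]*):([r-][w-][x-])$")

def parse_acl_spec(spec: str) -> list:
    """Split a setfacl-style spec into (is_default, type, name, perms) tuples."""
    entries = []
    for raw in spec.split(","):
        match = ACL_ENTRY.match(raw.strip())
        if match is None:
            raise ValueError(f"malformed ACL entry: {raw!r}")
        default, kind, name, perms = match.groups()
        entries.append((default is not None, kind, name, perms))
    return entries

def setfacl_command(spec: str, path: str) -> str:
    """Validate the spec, then build a roughly equivalent command line (assumes -m)."""
    parse_acl_spec(spec)
    return " ".join(["hdfs", "dfs", "-setfacl", "-m", shlex.quote(spec), shlex.quote(path)])

# Hypothetical value for posix.directory.acl and a hypothetical target directory.
acl = "user:analyst:r-x,group::r--"
print(parse_acl_spec(acl))
print(setfacl_command(acl, "/receiving/src/ent"))
```

Validating the value before it is saved as a property catches malformed entries early, before an ingestion job tries to apply them.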