Configure a Groundplex for Cloudera Data Hub (CDH) with Kerberos authentication

Kerberos authentication for Groundplexes

Configuring a Groundplex for CDH with Kerberos authentication sets up secure communication between the Groundplex nodes and the CDH cluster using the Kerberos protocol. The process involves installing Kerberos packages, generating and deploying keytab files, and testing the configuration to confirm that authentication works.

Set up Kerberos on Groundplex nodes

In this first step, you install the Kerberos packages, copy the Kerberos configuration file, and generate and deploy the keytab file on each Groundplex node. The later sections depend on this setup. To set up Kerberos on Groundplex nodes:

  1. Install Kerberos packages on the Groundplex nodes.
    $ sudo yum install krb5-workstation krb5-libs krb5-auth-dialog
  2. Copy the file /etc/krb5.conf from one of the target cluster nodes to /etc/krb5.conf on each Groundplex node.
  3. Install the JCE extension on each Groundplex node.
    1. Download the JCE extension zip file: http://www.oracle.com/technetwork/java/javase/downloads/jce8-download-2133166.html
    2. Copy the JCE extension zip file onto each Groundplex node and install the JCE extension with the following command. Restart the node after the installation.
      $ unzip -o -j -q jce_policy-8.zip -d /opt/snaplogic/pkgs/jre1.8.0_45/lib/security/
    3. To check that the JCE extension was installed correctly, run the following command. The output should include the CryptoAllPermission line shown below the command.
      $ zipgrep CryptoAllPermission /opt/snaplogic/pkgs/jre1.8.0_45/lib/security/local_policy.jar
      default_local.policy: permission javax.crypto.CryptoAllPermission;
  4. Generate the keytab file for the Kerberos user, put it on each Groundplex node, and give snapuser access to the keytab file.
    $ sudo cp /path/to/keytab/file /home/snapuser/<keytab_file_name>
    $ sudo chown snapuser:snapuser /home/snapuser/<keytab_file_name>
    $ sudo chmod 400 /home/snapuser/<keytab_file_name>
  5. To provide additional Hadoop configuration details, pass them to the JCC as a configuration option. For more details, see the Configuration page. Update the global properties to add the following option, where the value of jcc.jvm_options points to the HDFS configuration directory:
    jcc.jvm_options=-DHADOOP_CLIENT_CONF_DIR=<PATH_TO_HDFS_CONF_DIRECTORY>
    For example:
    jcc.jvm_options=-DHADOOP_CLIENT_CONF_DIR=/home/snapuser/remote-hadoop/conf
  6. To provide additional Hadoop configuration details to the JCC from the UI, edit the Snaplex properties in Admin Manager. Go to the Snaplex Node Properties and, under Global Properties, add a key named jvm_options with the value:
    -DHADOOP_CLIENT_CONF_DIR=<PATH_TO_HDFS_CONF_DIRECTORY>
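
The keytab and configuration-directory steps above can be sanity-checked with a short script before moving on. The following is a minimal sketch, not part of the SnapLogic tooling. The function names check_keytab and check_hadoop_conf_dir are hypothetical, the script assumes GNU stat (as found on the yum-based systems used above), and it assumes that the Hadoop client configuration directory contains core-site.xml and hdfs-site.xml, which is typical for an HDFS client but is not stated in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and checks are illustrative assumptions.

# Succeeds if the keytab exists, is owned by the given user,
# and has mode 400 (owner read-only), as set in step 4.
check_keytab() {
    local keytab="$1" owner="$2"
    [ -f "$keytab" ] || { echo "missing keytab: $keytab" >&2; return 1; }
    # GNU stat: %U = owner name, %a = octal permission bits
    [ "$(stat -c '%U' "$keytab")" = "$owner" ] \
        || { echo "keytab is not owned by $owner" >&2; return 1; }
    [ "$(stat -c '%a' "$keytab")" = "400" ] \
        || { echo "keytab mode is not 400" >&2; return 1; }
}

# Succeeds if the directory holds the usual HDFS client files (an assumption),
# then prints the jvm_options value used in steps 5 and 6.
check_hadoop_conf_dir() {
    local dir="$1" f
    for f in core-site.xml hdfs-site.xml; do
        [ -r "$dir/$f" ] || { echo "missing $dir/$f" >&2; return 1; }
    done
    echo "-DHADOOP_CLIENT_CONF_DIR=$dir"
}
```

For example, after sourcing the script you might run check_keytab /home/snapuser/<keytab_file_name> snapuser, followed by check_hadoop_conf_dir /home/snapuser/remote-hadoop/conf. Each function returns a non-zero status and prints the reason if a check fails.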

Set up edge node with Kerberos configuration on Groundplex nodes

Once Kerberos is set up on the Groundplex nodes, the next step is to configure the edge node, which acts as the interface between the Hadoop cluster and the external network. The edge node must be configured to handle Kerberos authentication whether it is part of the Hadoop cluster or separate from it.

The edge node can be part of the Hadoop cluster or outside it. SnapLogic suggests making the edge node part of the HDFS cluster and running the Groundplex on the edge node.

Different Hadoop distributions follow different steps to configure a node as an edge node. Refer to the edge node setup instructions for your distribution:
  • Cloudera
  • HortonWorks

Test Kerberos configuration on Groundplex nodes

After the Groundplex nodes and the edge node are configured with Kerberos, test the configuration to verify that authentication works. The test obtains a Kerberos ticket from the keytab file and then lists the contents of the ticket cache.

The following commands can be used to test Kerberos configuration on the Groundplex nodes:
$ kinit -k -t /path/to/keytab/file <principal_name>
$ klist
For example, with a keytab file for the principal snaplogic/[email protected], you should be able to initialize the ticket cache and see output like this:
$ kinit -k -t /home/snapuser/snaplogic.keytab snaplogic/[email protected]
$ klist
Ticket cache: FILE:/tmp/krb5cc_5112
Default principal: snaplogic/[email protected]
Valid starting       Expires              Service principal
03/27/2017 18:39:59  03/28/2017 18:39:59  krbtgt/[email protected]
renew until 04/03/2017 18:39:59
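
The same test can be scripted so that it reports success or failure through its exit status, which is convenient when several Groundplex nodes need checking. This is a minimal sketch under stated assumptions: the function name verify_kerberos_login is hypothetical, and it relies on the -s option of MIT Kerberos klist, which prints nothing and signals through its exit status whether the cache holds a valid ticket.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the function name is an illustrative assumption.

# Obtains a ticket from the keytab, then confirms that the credential
# cache holds a valid (unexpired) ticket. Returns non-zero on failure.
verify_kerberos_login() {
    local keytab="$1" principal="$2"
    if ! kinit -k -t "$keytab" "$principal"; then
        echo "kinit failed for $principal" >&2
        return 1
    fi
    # klist -s is silent and reports ticket validity via its exit status
    if ! klist -s; then
        echo "no valid ticket in the credential cache" >&2
        return 1
    fi
    echo "Kerberos login OK for $principal"
}
```

After sourcing the script, call verify_kerberos_login /home/snapuser/snaplogic.keytab <principal_name> as snapuser, since that account owns the keytab file.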