Error handling with Kerberos in the Hadoop ecosystem

Using Kerberos for secure authentication in the Hadoop ecosystem involves setting up and managing users for various components like Hive and HDFS. Proper configuration is essential to avoid common errors and ensure seamless operations.

Create a user for Hive with Kerberos

When setting up a new user for Hive with Kerberos, follow these steps to ensure correct configuration and minimize authentication errors:

  1. Create a Linux user on each CDH cluster node: Add a new Linux user on all nodes within the CDH cluster.
  2. Create a home directory on the CDH cluster's HDFS for the user: Set up a home directory in the Hadoop Distributed File System (HDFS) for the newly created user. Ensure the user has the space and permissions needed for Hive operations.
  3. Create a Kerberos principal for the user: Generate a Kerberos principal to securely authenticate the new user.
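
The three steps above can be sketched as a small helper that only builds the commands, so they can be reviewed before anyone runs them. This is a minimal sketch under stated assumptions, not the documented procedure: the function name, the user hiveuser, the realm EXAMPLE.COM, and the /user/<name> home directory layout are all hypothetical, and step 3 assumes an MIT Kerberos KDC where kadmin.local is available.

```python
import shlex


def hive_user_setup_commands(user: str, realm: str) -> list:
    """Return the shell commands for the three setup steps, in order.

    Nothing is executed here. Review each command and run it with the
    appropriate privileges (root, the HDFS superuser, a Kerberos admin).
    """
    quoted_user = shlex.quote(user)
    home = shlex.quote(f"/user/{user}")  # assumed HDFS home directory layout
    principal = f"{user}@{realm}"
    return [
        # Step 1: run on every node of the CDH cluster.
        f"useradd {quoted_user}",
        # Step 2: run once as the HDFS superuser.
        f"hdfs dfs -mkdir -p {home}",
        f"hdfs dfs -chown {quoted_user} {home}",
        # Step 3: run on the KDC host (MIT Kerberos assumed).
        "kadmin.local -q " + shlex.quote(f"addprinc {principal}"),
    ]


if __name__ == "__main__":
    # Hypothetical user and realm, for illustration only.
    for command in hive_user_setup_commands("hiveuser", "EXAMPLE.COM"):
        print(command)
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; the useradd command still has to be repeated on each cluster node.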

Error Handling
  • Error: [Cloudera] HiveJDBCDriver ERROR processing query/statement.
    • Error Code: ERROR_STATE, SQL state: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask, Query: ......
    • Cause: This error indicates a problem with processing a query or statement in Hive.
    • Resolution: Check which authentication type is selected in the Snap's account settings, then verify the matching setup described in Configure a Groundplex for CDH with Kerberos authentication:
      • Kerberos Authentication: Ensure all setup steps are correctly followed.
      • User ID Authentication: Ensure the first two setup steps are completed.
      • No Authentication: Create a home directory for an anonymous user in HDFS.
  • Error: Unable to authenticate the client to the KDC using the provided credentials
    • Cause: This error suggests that the Kerberos configuration on the Groundplex nodes is incorrect.
    • Resolution: Refer to the "Setting Up Kerberos on Groundplex Nodes" section and confirm that every step of the Kerberos setup has been completed.
  • Error: Keytab file does not exist or is not readable
    • Cause: This error occurs when the keytab file is missing or the snapuser does not have the necessary permissions to access it.
    • Resolution: Check step 5 in the "Setting Up Kerberos on Groundplex Nodes" section to ensure the keytab file is correctly placed and accessible.
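
Before revisiting the setup steps, it can help to confirm what the keytab error is actually reporting. The following is a minimal sketch, assuming it is run as the snapuser account on the Groundplex node; the function name and the example path are hypothetical, and the check only covers existence, read permission, and a non-zero size, not whether the keys inside are valid.

```python
import os


def check_keytab(path: str) -> str:
    """Return "ok" or a short reason why the keytab cannot be used."""
    if not os.path.isfile(path):
        return "missing"
    # os.access reports on the user running this script, so run it as the
    # same account that runs the Snaplex process.
    if not os.access(path, os.R_OK):
        return "not readable"
    if os.path.getsize(path) == 0:
        return "empty"
    return "ok"


if __name__ == "__main__":
    # Hypothetical location; substitute the path used in your setup.
    print(check_keytab("/path/to/snap.keytab"))
```

A result of "missing" or "not readable" corresponds directly to the two causes listed above.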

HDFS Cluster Authentication with Kerberos

Ensuring proper Kerberos authentication in an HDFS cluster is crucial to prevent errors and maintain secure access.

Error Handling
  • Error: SIMPLE authentication Not enabled.
    • Error Message: The following exception is displayed as the error message for HDFS Snaps that have Kerberos authentication enabled.

      java.util.concurrent.ExecutionException: java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS];

    • Cause: Kerberos authentication requires all hosts to have synchronized internal clocks. If the clocks differ too much (known as clock skew), client requests are rejected.
    • Resolution: It's important to keep the clocks on the Key Distribution Centers (KDCs) and Kerberos clients in sync. Use Network Time Protocol (NTP) software to synchronize them.
  • Error: Server Has Invalid Kerberos Principal
    • Error Message: The following exception is displayed as the error message.

      Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: hdfs/[email protected]; Host Details : local host is: "<LOCALHOST>/127.0.0.1"; destination host is: "cdh2-1.devsnaplogic.com":8020;

    • Cause: When Kerberos is configured on the HDFS server, the following properties are added:
      <property>
        <name>dfs.namenode.kerberos.principal</name>
        <value>hdfs/_HOST@YOUR-REALM.COM</value>
      </property>

      <property>
        <name>dfs.datanode.kerberos.principal</name>
        <value>hdfs/_HOST@YOUR-REALM.COM</value>
      </property>

      The string _HOST in the properties is replaced at runtime by the fully-qualified domain name (FQDN) of the host machine where the daemon is running. For this replacement to work correctly, reverse DNS must be properly configured on all hosts. This ensures that each host can resolve its IP address back to its FQDN.

    • Resolution: This error can occur when the host has multiple hostnames, so the hostname in the Service Principal of the Snap's Kerberos configuration does not match the hostname resolved on the NameNode or DataNode.

      The error message displays the server's service principal. Provide that same service principal in the Snap's Kerberos configuration.
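
The two checks discussed in this section, clock skew and principal matching, can be illustrated with a short sketch. It is illustrative only: the function names are hypothetical, the 300-second limit is the MIT Kerberos default and may differ on your KDC, and lowercasing the hostname during _HOST substitution is an assumption about how the server expands the pattern.

```python
import socket
from datetime import datetime, timedelta, timezone

# 300 seconds is the default tolerance in MIT Kerberos (krb5.conf "clockskew").
# The value configured on your KDC may differ.
MAX_SKEW = timedelta(seconds=300)


def skew_exceeded(client_time: datetime, kdc_time: datetime,
                  max_skew: timedelta = MAX_SKEW) -> bool:
    """Return True if the two clocks differ by more than the allowed skew."""
    return abs(client_time - kdc_time) > max_skew


def expand_host(principal_pattern: str, fqdn: str) -> str:
    """Replace _HOST with the host's FQDN (lowercased, by assumption)."""
    return principal_pattern.replace("_HOST", fqdn.lower())


def principal_matches(configured: str, reported_by_server: str) -> bool:
    """Compare the Snap's service principal with the one in the error message."""
    return configured.strip() == reported_by_server.strip()


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(skew_exceeded(now, now + timedelta(minutes=10)))
    # socket.getfqdn() depends on local DNS and /etc/hosts, which is the same
    # dependency on name resolution described for _HOST above.
    print(expand_host("hdfs/_HOST@EXAMPLE.COM", socket.getfqdn()))
```

Comparing the output of expand_host on the NameNode or DataNode host with the principal entered in the Snap is a quick way to spot a hostname mismatch.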