Cannot connect locally to Kerberized HDFS cluster using IntelliJ


Question


I am trying to connect to HDFS locally via IntelliJ installed on my laptop. The cluster I am trying to connect to is Kerberized, with an edge node. I generated a keytab for the edge node and configured it in the code below. I am now able to log in to the edge node, but when I try to access the HDFS data on the namenode it throws an error. Below is the Scala code that tries to connect to HDFS:

import org.apache.spark.sql.SparkSession
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.security.UserGroupInformation
import java.security.PrivilegedExceptionAction
import java.io.PrintWriter

object DataframeEx {
  def main(args: Array[String]) {
    val spark = SparkSession
      .builder()
      .master(master="local")
      .appName("Spark SQL basic example")
      .config("spark.some.config.option", "some-value")
      .getOrCreate()

    runHdfsConnect(spark)

    spark.stop()
  }

  def runHdfsConnect(spark: SparkSession): Unit = {

    System.setProperty("HADOOP_USER_NAME", "m12345")
    val path = new Path("/data/interim/modeled/abcdef")

    // Point the client at the namenode and enable Kerberos authentication
    val conf = new Configuration()
    conf.set("fs.defaultFS", "hdfs://namenodename.hugh.com:8020")
    conf.set("hadoop.security.authentication", "kerberos")
    conf.set("dfs.namenode.kerberos.principal.pattern", "hdfs/_HOST@HUGH.COM")

    // Log in from the keytab and get a UGI to run privileged actions as
    UserGroupInformation.setConfiguration(conf)
    val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
      "m12345@HUGH.COM", "C:\\Users\\m12345\\Downloads\\m12345.keytab")

    println(UserGroupInformation.isSecurityEnabled())

    // Perform the HDFS write as the Kerberos-authenticated user
    ugi.doAs(new PrivilegedExceptionAction[String] {
      override def run(): String = {
        val fs = FileSystem.get(conf)
        val output = fs.create(path)
        val writer = new PrintWriter(output)
        try {
          writer.write("this is a test")
          writer.write("\n")
        } finally {
          writer.close()
          println("Closed!")
        }
        "done"
      }
    })
  }
}

I am able to log in to the edge node, but when I try to write to HDFS (the doAs method) it throws the following error:

WARN Client: Exception encountered while connecting to the server : java.lang.IllegalArgumentException: Server has invalid Kerberos principal: hdfs/namenodename.hugh.com@HUGH.COM
18/06/11 12:12:01 ERROR UserGroupInformation: PriviledgedActionException m12345@HUGH.COM (auth:KERBEROS) cause:java.io.IOException: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: hdfs/namenodename.hugh.com@HUGH.COM
18/06/11 12:12:01 ERROR UserGroupInformation: PriviledgedActionException as:m12345@HUGH.COM (auth:KERBEROS) cause:java.io.IOException: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: hdfs/namenodename.hugh.com@HUGH.COM; Host Details : local host is: "INMBP-m12345/172.29.155.52"; destination host is: "namenodename.hugh.com":8020; 
Exception in thread "main" java.io.IOException: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: hdfs/namenodename.hugh.com@HUGH.COM; Host Details : local host is: "INMBP-m12345/172.29.155.52"; destination host is: "namenodename.hugh.com":8020

If I log in to the edge node, run kinit, and then access HDFS, it works fine. So why can I not reach the HDFS namenode when I am able to log in to the edge node?
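For anyone debugging the same mismatch, a minimal sketch that prints the Kerberos-related client settings before the doAs call can help compare the local Configuration against the cluster's core-site.xml/hdfs-site.xml (the key names are standard Hadoop keys; the conf object is the one built in the code above):

// Debugging sketch: dump the security settings the client will actually use,
// so they can be compared with the values configured on the cluster side.
val debugKeys = Seq(
  "fs.defaultFS",
  "hadoop.security.authentication",
  "hadoop.rpc.protection",
  "dfs.namenode.kerberos.principal",
  "dfs.namenode.kerberos.principal.pattern")
debugKeys.foreach(k => println(s"$k = ${conf.get(k)}"))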

Let me know if any more details are needed from my side.


Answer 1:


The Hadoop Configuration object was set incorrectly. Below is what worked for me:

val conf = new Configuration()
conf.set("fs.defaultFS", "hdfs://namenodename.hugh.com:8020")
conf.set("hadoop.security.authentication", "kerberos")
conf.set("hadoop.rpc.protection", "privacy")                       // was missing this parameter
conf.set("dfs.namenode.kerberos.principal", "hdfs/_HOST@HUGH.COM") // was initially wrongly set as dfs.namenode.kerberos.principal.pattern


Source: https://stackoverflow.com/questions/50951656/cannot-connect-locally-to-hdfs-kerberized-cluster-using-intellij
