hive spark yarn-cluster job fails with: “ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory”

Happy的楠姐 2021-01-24 09:50

I'm attempting to run a pyspark script on BigInsights on Cloud 4.2 Enterprise that accesses a Hive table.

First I create the hive table:

[biadmin@bi4c-x         


        
1 Answer
  •  青春惊慌失措 2021-01-24 10:22

    The solution to this error was to pass the DataNucleus jars via `--jars` (a single comma-separated list, with no spaces):

    [biadmin@bi4c-xxxxxx-mastermanager ~]$ spark-submit \
        --master yarn-cluster \
        --deploy-mode cluster \
        --jars /usr/iop/4.2.0.0/hive/lib/datanucleus-api-jdo-3.2.6.jar,/usr/iop/4.2.0.0/hive/lib/datanucleus-core-3.2.10.jar,/usr/iop/4.2.0.0/hive/lib/datanucleus-rdbms-3.2.9.jar \
        test_pokes.py
    
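    A detail worth calling out: `--jars` expects one comma-separated value. If the list is split across backslash-continued lines with spaces after the commas, the shell passes the later paths as separate arguments and spark-submit only sees the first jar. A minimal sketch of building the argument list programmatically, which sidesteps that quoting pitfall (jar paths taken from the command above):

    ```python
    # Build the spark-submit argument list so the --jars value stays a
    # single, space-free, comma-separated string.
    jars = [
        "/usr/iop/4.2.0.0/hive/lib/datanucleus-api-jdo-3.2.6.jar",
        "/usr/iop/4.2.0.0/hive/lib/datanucleus-core-3.2.10.jar",
        "/usr/iop/4.2.0.0/hive/lib/datanucleus-rdbms-3.2.9.jar",
    ]
    jars_arg = ",".join(jars)  # one argument: path1,path2,path3

    argv = [
        "spark-submit",
        "--master", "yarn-cluster",
        "--jars", jars_arg,
        "test_pokes.py",
    ]

    # If the joined value contained whitespace, spark-submit would
    # silently receive only the first jar.
    assert " " not in jars_arg
    ```

    Passing `argv` to `subprocess.call(argv)` then launches the job without any shell-quoting surprises.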

    However, I then get a different error:

    pyspark.sql.utils.AnalysisException: u'Table not found: pokes; line 1 pos 14'
    

    I've asked that as a separate question here: Spark Hive reporting pyspark.sql.utils.AnalysisException: u'Table not found: XXX' when run on yarn cluster

    The final solution is captured here: https://stackoverflow.com/a/41272260/1033422
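    For reference, the original `test_pokes.py` isn't shown in the question. A minimal sketch of what such a script might look like, assuming the classic Hive tutorial `pokes (foo INT, bar STRING)` table and the Spark 1.x `HiveContext` API that BigInsights 4.2 ships with (names and query are illustrative):

    ```python
    # Hypothetical sketch of test_pokes.py: query a Hive table from pyspark.
    from __future__ import print_function

    # Assumes the standard Hive tutorial table; adjust to your schema.
    QUERY = "SELECT foo, bar FROM pokes LIMIT 10"

    def main():
        # Imported lazily so the module can be inspected without a Spark install.
        from pyspark import SparkContext
        from pyspark.sql import HiveContext

        sc = SparkContext(appName="test_pokes")
        # HiveContext needs the DataNucleus jars (and hive-site.xml) on the
        # cluster, which is exactly what the spark-submit flags above provide.
        hc = HiveContext(sc)
        for row in hc.sql(QUERY).collect():
            print(row)
        sc.stop()

    if __name__ == "__main__":
        main()
    ```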
