I\'m attempting to run a pyspark script on BigInsights on Cloud 4.2 Enterprise that accesses a Hive table.
First I create the hive table:
[biadmin@bi4c-x
The solution to this error was to add the jars:
[biadmin@bi4c-xxxxxx-mastermanager ~]$ spark-submit \
--master yarn-cluster \
--deploy-mode cluster \
--jars /usr/iop/4.2.0.0/hive/lib/datanucleus-api-jdo-3.2.6.jar, \
/usr/iop/4.2.0.0/hive/lib/datanucleus-core-3.2.10.jar, \
/usr/iop/4.2.0.0/hive/lib/datanucleus-rdbms-3.2.9.jar \
test_pokes.py
However, I then get a different error:
pyspark.sql.utils.AnalysisException: u'Table not found: pokes; line 1 pos 14'
I've added the other question here: Spark Hive reporting pyspark.sql.utils.AnalysisException: u'Table not found: XXX' when run on yarn cluster
The final solution is captured here: https://stackoverflow.com/a/41272260/1033422