Use an external library in a PySpark job in a Spark cluster from google-dataproc

我在风中等你 2020-12-09 06:36

I have a Spark cluster I created via Google Dataproc. I want to be able to use the CSV library from Databricks (see https://github.com/databricks/spark-csv). So I f

2 Answers
  •  臣服心动
    2020-12-09 07:13

    In addition to @Dennis's answer:

    Note that if you need to load multiple external packages, you need to specify a custom escape character like so:

    --properties ^#^spark.jars.packages=org.elasticsearch:elasticsearch-spark_2.10:2.3.2,com.databricks:spark-avro_2.10:2.0.1
    

    Note the ^#^ right before the package list. See gcloud topic escaping for more details.
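    For example, a complete job submission might look like the following. This is a minimal sketch: the job file my_job.py, the cluster name my-cluster, and the region are placeholders you would replace with your own values.

    # Submit a PySpark job, pulling in two external packages via spark.jars.packages
    gcloud dataproc jobs submit pyspark my_job.py \
        --cluster=my-cluster \
        --region=us-central1 \
        --properties ^#^spark.jars.packages=org.elasticsearch:elasticsearch-spark_2.10:2.3.2,com.databricks:spark-avro_2.10:2.0.1

    The ^#^ prefix tells gcloud to split the --properties list on # instead of the default comma, so the comma-separated package list inside the spark.jars.packages value is passed through to Spark intact.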
