Setting Spark classpaths on EC2: spark.driver.extraClassPath and spark.executor.extraClassPath

佛祖请我去吃肉 2021-02-20 12:25

Reducing the size of the application jar by providing a Spark classpath for the Maven dependencies:

My cluster has 3 EC2 instances on which Hadoop and Spark i

2 answers
  •  不思量自难忘°
    2021-02-20 13:22

    I was finally able to solve the problem. I built the application jar with "mvn package" instead of "mvn clean compile assembly:single", so the Maven dependencies are not bundled into the jar. The result is a much smaller jar, because it only references its dependencies, but those dependency jars then have to be provided at run time.

    Then I added the following two parameters to spark-defaults.conf on each node:

    spark.driver.extraClassPath     /home/spark/.m2/repository/com/datastax/cassandra/cassandra-driver-core/2.1.7/cassandra-driver-core-2.1.7.jar:/home/spark/.m2/repository/com/googlecode/json-simple/json-simple/1.1/json-simple-1.1.jar:/home/spark/.m2/repository/com/google/code/gson/gson/2.3.1/gson-2.3.1.jar:/home/spark/.m2/repository/com/google/guava/guava/16.0.1/guava-16.0.1.jar
    
    spark.executor.extraClassPath     /home/spark/.m2/repository/com/datastax/cassandra/cassandra-driver-core/2.1.7/cassandra-driver-core-2.1.7.jar:/home/spark/.m2/repository/com/googlecode/json-simple/json-simple/1.1/json-simple-1.1.jar:/home/spark/.m2/repository/com/google/code/gson/gson/2.3.1/gson-2.3.1.jar:/home/spark/.m2/repository/com/google/guava/guava/16.0.1/guava-16.0.1.jar
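
    Both settings take the same colon-separated list of jar paths, which is easy to get wrong by hand. As a rough sketch (not part of my original setup), the list can be generated from Maven coordinates, assuming the standard local repository layout of group/artifact/version/artifact-version.jar. The helper names jar_path, build_classpath and spark_defaults_lines are made up for illustration; the coordinates are read off the paths above.

```python
from pathlib import PurePosixPath

# Maven coordinates (group, artifact, version), read off the jar paths above.
DEPENDENCIES = [
    ("com.datastax.cassandra", "cassandra-driver-core", "2.1.7"),
    ("com.googlecode.json-simple", "json-simple", "1.1"),
    ("com.google.code.gson", "gson", "2.3.1"),
    ("com.google.guava", "guava", "16.0.1"),
]


def jar_path(repo, group, artifact, version):
    """Path of one jar inside a local Maven repository (hypothetical helper).

    Assumes the standard layout: the group id is split on dots into
    directories, followed by artifact/version/artifact-version.jar.
    """
    return str(
        PurePosixPath(repo, *group.split("."), artifact, version,
                      f"{artifact}-{version}.jar")
    )


def build_classpath(repo, dependencies):
    """Join all jar paths with ':' as used in the two settings above."""
    return ":".join(jar_path(repo, *dep) for dep in dependencies)


def spark_defaults_lines(classpath):
    """The two spark-defaults.conf lines, both using the same classpath."""
    return [
        f"spark.driver.extraClassPath {classpath}",
        f"spark.executor.extraClassPath {classpath}",
    ]


if __name__ == "__main__":
    cp = build_classpath("/home/spark/.m2/repository", DEPENDENCIES)
    print("\n".join(spark_defaults_lines(cp)))
```

    Running the script prints two lines that can be pasted into spark-defaults.conf. The repository path /home/spark/.m2/repository has to match on every node.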
    

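    That raises the question: how does the application jar get its Maven dependencies (the required jars) at run time?

    For that, I downloaded all the required dependencies to each node in advance by running mvn clean compile assembly:single there, which pulls them into the local Maven repository (~/.m2/repository) that the classpath entries above point to.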
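
    Since every node needs the same jars at the same paths, it can help to check each node before submitting a job. The following is only a sketch under the assumption that every classpath entry is a plain jar file, as in the settings above (directories or wildcard entries would need different handling); missing_entries is a hypothetical helper name.

```python
import os
import sys


def missing_entries(classpath):
    """Return the classpath entries that are not regular files on this node.

    Assumes ':'-separated entries that each name a single jar file.
    Empty entries (for example from a trailing ':') are ignored.
    """
    return [
        entry
        for entry in classpath.split(":")
        if entry and not os.path.isfile(entry)
    ]


if __name__ == "__main__":
    # Usage: python check_classpath.py "<value of spark.executor.extraClassPath>"
    classpath = sys.argv[1] if len(sys.argv) > 1 else ""
    missing = missing_entries(classpath)
    for entry in missing:
        print(f"missing: {entry}")
    # A non-zero exit code makes the check usable from a deployment script.
    sys.exit(1 if missing else 0)
```

    Run it on each node with the classpath value from spark-defaults.conf. Any path it reports has to be downloaded on that node before the job can load the class.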
