How to access external property file in spark-submit job?

有刺的猬 2021-01-24 02:37

I am using Spark 2.4.1 and Java 8. I am trying to load an external properties file when submitting my Spark job with spark-submit.

As I am using below TypeSafe to lo

2 answers
  •  情深已故
    2021-01-24 03:24

    For --files, --jars and similar arguments, list the files separated by commas with no spaces in between. This is crucial: the exception about an invalid main class occurs precisely because a space splits the list, and spark-submit treats the next path as a separate argument:

    --files /local/apps/log4j.properties,/local/apps/applicationNew.properties
    

    If the file names themselves contain spaces, wrap the whole list in quotes so the shell passes it as one argument:

    --files "/some/path with/spaces.properties,/another path with/spaces.properties"
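
    The comma rule above can be sketched in a POSIX shell. This is only an illustration: join_by_comma is a hypothetical helper name, not part of Spark, and the paths are the ones from this answer.

```shell
# join_by_comma: hypothetical helper that joins its arguments with commas
# and no spaces, which is the form --files and --jars expect.
join_by_comma() {
    result=""
    for item in "$@"; do
        if [ -z "$result" ]; then
            result="$item"
        else
            result="$result,$item"
        fi
    done
    printf '%s\n' "$result"
}

# Paths without spaces.
FILES=$(join_by_comma /local/apps/log4j.properties /local/apps/applicationNew.properties)

# Paths with spaces: quote each path when calling the helper.
SPACED_FILES=$(join_by_comma "/some/path with/spaces.properties" "/another path with/spaces.properties")

echo "--files $FILES"
echo "--files \"$SPACED_FILES\""
```

    When passing the result to spark-submit, keep the variable quoted (--files "$FILES") so that paths containing spaces stay within a single argument.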
    

    Another issue is that you specify the same property twice:

    ...
    --conf spark.driver.extraJavaOptions=-Dconfig.file=./applicationNew.properties \
    ...
    --conf spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties \
    ...
    

    spark-submit has no way to merge these values, so only one of them is used. This is why you see null for the config.file system property: the second --conf argument takes priority and overrides extraJavaOptions with a value that contains only the log4j setting. The correct way is to specify all these values in one property:

    --conf spark.driver.extraJavaOptions="-Dlog4j.configuration=file:./log4j.properties -Dconfig.file=./applicationNew.properties"
    

    Note that because of the quotes, the entire spark.driver.extraJavaOptions="..." is a single command-line argument rather than several. This matters: spark-submit can only pass these options to the driver/executor JVM correctly if they arrive as one argument.
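
    To see why the quotes matter, here is a small sketch, assuming a POSIX shell (sh or bash). count_args is a hypothetical helper that just reports how many arguments it received; it stands in for spark-submit and does not launch anything.

```shell
# count_args: hypothetical helper that prints the number of arguments it was given.
count_args() {
    echo "$#"
}

DRIVER_OPTS="-Dlog4j.configuration=file:./log4j.properties -Dconfig.file=./applicationNew.properties"

# Quoted: the property name and both -D options travel as one argument after --conf.
QUOTED=$(count_args --conf "spark.driver.extraJavaOptions=$DRIVER_OPTS")

# Unquoted: the shell splits on the space, so the second -D option
# becomes a separate argument that no longer belongs to --conf.
UNQUOTED=$(count_args --conf spark.driver.extraJavaOptions=$DRIVER_OPTS)

echo "quoted: $QUOTED arguments, unquoted: $UNQUOTED arguments"
```

    In the unquoted form the stray -Dconfig.file=... argument is no longer part of the property value, which is the same kind of splitting that breaks a --files list containing spaces.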

    (I also changed the log4j.configuration value to a proper URI, file:./log4j.properties, instead of a bare file name. As far as I recall, it might not work unless this value is a URI, but you can try both ways to check.)
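
    Putting both fixes together, the command could be assembled roughly as follows. This is a dry-run sketch under assumptions: the main class com.example.MyApp and the jar my-app.jar are hypothetical placeholders, and any other options your job needs (master, deploy mode and so on) are left out.

```shell
# Hypothetical placeholders: replace with your own main class and application jar.
MAIN_CLASS="com.example.MyApp"
APP_JAR="my-app.jar"

# Comma-separated, no spaces.
FILES="/local/apps/log4j.properties,/local/apps/applicationNew.properties"

# All driver JVM options in one value.
DRIVER_OPTS="-Dlog4j.configuration=file:./log4j.properties -Dconfig.file=./applicationNew.properties"

# Load the command into the positional parameters so each argument stays intact.
set -- spark-submit \
    --class "$MAIN_CLASS" \
    --files "$FILES" \
    --conf "spark.driver.extraJavaOptions=$DRIVER_OPTS" \
    "$APP_JAR"

# Dry run: print one argument per line instead of launching the job.
printf '%s\n' "$@"
```

    Each printed line corresponds to exactly one argument, so a value that was split by mistake shows up as an extra line. Once the output looks right, replacing the printf line with "$@" would run the command.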
