I am developing a Spark application in Java that listens to a Kafka stream. I use kafka_2.10-0.10.2.1. I have set various parameters for the Kafka properties.
You can use the maven-shade-plugin to generate a fat jar, if you prefer that over the `--packages` approach Jacek proposed:
```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.2.3</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
        </execution>
    </executions>
</plugin>
```
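Once the shaded (fat) jar is built, you submit it directly and nothing needs to be resolved at submit time. A minimal sketch, assuming a hypothetical main class and jar name (substitute your own):

```bash
# Submit the shaded fat jar directly; no --packages resolution happens at submit time.
# The main class and jar name below are placeholders for your own build.
spark-submit \
  --class com.example.MyKafkaStreamingApp \
  --master yarn \
  --deploy-mode cluster \
  target/my-spark-kafka-app-1.0-shaded.jar
```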
You can also use the maven-dependency-plugin to fetch some of the dependencies, put them in a lib directory inside your assembly, and later supply them to Spark:
```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-dependency-plugin</artifactId>
    <version>3.1.2</version>
    <executions>
        <execution>
            <id>copy</id>
            <phase>initialize</phase>
            <goals>
                <goal>copy</goal>
            </goals>
            <configuration>
                <artifactItems>
                    <artifactItem>
                        <groupId>org.apache.logging.log4j</groupId>
                        <artifactId>log4j-core</artifactId>
                        <version>${log4j2.version}</version>
                        <type>jar</type>
                        <overWrite>true</overWrite>
                        <outputDirectory>${project.build.directory}/log4j-v2-jars</outputDirectory>
                        <destFileName>log4j-v2-core.jar</destFileName>
                    </artifactItem>
                    <artifactItem>
                        <groupId>org.apache.logging.log4j</groupId>
                        <artifactId>log4j-api</artifactId>
                        <version>${log4j2.version}</version>
                        <type>jar</type>
                        <overWrite>true</overWrite>
                        <outputDirectory>${project.build.directory}/log4j-v2-jars</outputDirectory>
                        <destFileName>log4j-v2-api.jar</destFileName>
                    </artifactItem>
                    <artifactItem>
                        <groupId>org.apache.logging.log4j</groupId>
                        <artifactId>log4j-1.2-api</artifactId>
                        <version>${log4j2.version}</version>
                        <type>jar</type>
                        <overWrite>true</overWrite>
                        <outputDirectory>${project.build.directory}/log4j-v2-jars</outputDirectory>
                        <destFileName>log4j-v2-1.2-api.jar</destFileName>
                    </artifactItem>
                    <artifactItem>
                        <groupId>org.apache.logging.log4j</groupId>
                        <artifactId>log4j-slf4j-impl</artifactId>
                        <version>${log4j2.version}</version>
                        <type>jar</type>
                        <overWrite>true</overWrite>
                        <outputDirectory>${project.build.directory}/log4j-v2-jars</outputDirectory>
                        <destFileName>log4j-v2-slf4j-impl.jar</destFileName>
                    </artifactItem>
                </artifactItems>
                <outputDirectory>${project.build.directory}/wars</outputDirectory>
                <overWriteReleases>false</overWriteReleases>
                <overWriteSnapshots>true</overWriteSnapshots>
            </configuration>
        </execution>
    </executions>
</plugin>
```
The reason I am proposing this is that, as was the case with my work, your cluster may sit behind a very strict firewall where Spark is not allowed to talk to Nexus to resolve packages at submit time. In that case you really do need to handle this during artifact preparation, and either of these approaches might help you.
In my maven-dependency-plugin example I fetch the Log4j 2 jars and pass them to Spark 2.3 in order to get Log4j 2 log output (you can list your own dependencies instead).
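For completeness, here is a rough sketch of how the copied jars could then be handed to spark-submit. The main class, jar name, and classpath settings are illustrative assumptions, not part of the original setup, and the exact classpath handling depends on your cluster manager, deploy mode, and Spark version:

```bash
# Ship the Log4j 2 jars copied by maven-dependency-plugin alongside the application jar.
# Paths, main class, and classpath entries are illustrative; adjust to your deployment.
LOG4J_DIR=target/log4j-v2-jars
spark-submit \
  --class com.example.MyKafkaStreamingApp \
  --master yarn \
  --deploy-mode cluster \
  --jars "$LOG4J_DIR/log4j-v2-core.jar,$LOG4J_DIR/log4j-v2-api.jar,$LOG4J_DIR/log4j-v2-1.2-api.jar,$LOG4J_DIR/log4j-v2-slf4j-impl.jar" \
  --conf spark.driver.extraClassPath="log4j-v2-core.jar:log4j-v2-api.jar:log4j-v2-1.2-api.jar:log4j-v2-slf4j-impl.jar" \
  --conf spark.executor.extraClassPath="log4j-v2-core.jar:log4j-v2-api.jar:log4j-v2-1.2-api.jar:log4j-v2-slf4j-impl.jar" \
  target/my-spark-kafka-app-1.0.jar
```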