How to resolve Guava dependency issue while submitting Uber Jar to Google Dataproc

傲寒 2020-12-12 02:25

I am using the Maven Shade plugin to build an uber jar to submit as a job to a Google Dataproc cluster. Google has installed Apache Spark 2.0.2 and Apache Hadoop 2.7.3 on their clusters.

1 Answer
  • 2020-12-12 02:51

    Edited: See https://cloud.google.com/blog/products/data-analytics/managing-java-dependencies-apache-spark-applications-cloud-dataproc for a fully worked example for Maven and SBT.

    Original answer: When I build uber jars to run on Hadoop / Spark / Dataproc, I often use whichever version of Guava suits my needs and then add a shade relocation, which allows the different versions to co-exist without issue:

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>2.3</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <artifactSet>
              <includes>
                <include>com.google.guava:*</include>
              </includes>
            </artifactSet>
            <minimizeJar>false</minimizeJar>
            <relocations>
              <relocation>
                <pattern>com.google.common</pattern>
                <shadedPattern>repackaged.com.google.common</shadedPattern>
              </relocation>
            </relocations>
            <shadedArtifactAttached>true</shadedArtifactAttached>
          </configuration>
        </execution>
    </executions>
    </plugin>
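
    If you build with sbt-assembly instead of Maven, the same relocation can be sketched with a shade rule. This is a minimal illustrative fragment, not a full build definition; the plugin version shown is an assumption:

    ```scala
    // build.sbt — a sketch of the equivalent Guava relocation with sbt-assembly.
    // The plugin itself goes in project/plugins.sbt, e.g. (version is illustrative):
    //   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

    // Rename com.google.common classes inside the assembled jar so the Guava
    // bundled with your job cannot collide with the Guava that Hadoop/Spark
    // already provide on the Dataproc cluster.
    assemblyShadeRules in assembly := Seq(
      ShadeRule.rename("com.google.common.**" -> "repackaged.com.google.common.@1").inAll
    )
    ```

    Either way, you can spot-check that the relocation took effect by listing the jar contents and looking for the `repackaged` prefix, e.g. `jar tf your-assembly.jar | grep repackaged/com/google/common` (the jar name here is hypothetical).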
    