Spark error with google/guava library: java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.refreshAfterWrite

人盡茶涼 submitted on 2021-02-08 03:11:10

Question


I have a simple Spark project. Its pom.xml lists only the basic Scala, ScalaTest/JUnit, and Spark dependencies:

<dependencies>
    <dependency>
        <groupId>net.alchim31.maven</groupId>
        <artifactId>scala-maven-plugin</artifactId>
        <version>3.2.0</version>
    </dependency>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>${scala.version}</version>
    </dependency>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-compiler</artifactId>
        <version>${scala.version}</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.11</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.scalatest</groupId>
        <artifactId>scalatest_${scala.binary.version}</artifactId>
        <version>3.0.1</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
        <scope>compile</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-mllib_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
        <scope>compile</scope>
    </dependency>
</dependencies>

When I try to run a basic Spark program, the SparkSession initialization fails on this line:

 SparkSession.builder.master(master).appName("sparkApp").getOrCreate

Here is the output / error:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/04/07 18:06:15 INFO SparkContext: Running Spark version 2.2.1
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.refreshAfterWrite(JLjava/util/concurrent/TimeUnit;)Lcom/google/common/cache/CacheBuilder;
    at org.apache.hadoop.security.Groups.<init>(Groups.java:96)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:73)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:293)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:283)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:260)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:789)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:774)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2424)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2424)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2424)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:295)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2516)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:918)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:910)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:910)

I have run Spark locally dozens of times on other projects. What might be wrong with this simple one? Is there a dependency on the $HADOOP_HOME environment variable or similar?

Update: By downgrading the Spark version to 2.0.1 I was able to compile. That does not fix the problem (we need a newer version), but it helps point to the source of the problem.

Another update: In a different project the hack of downgrading to 2.0.1 does help, i.e. execution proceeds further, but a similar exception then occurs when writing out to Parquet.

18/05/07 11:26:11 ERROR Executor: Exception in task 0.0 in stage 2741.0 (TID 2618)
java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.build(Lcom/google/common/cache/CacheLoader;)Lcom/google/common/cache/LoadingCache;
    at org.apache.hadoop.io.compress.CodecPool.createCache(CodecPool.java:62)
    at org.apache.hadoop.io.compress.CodecPool.<clinit>(CodecPool.java:74)
    at org.apache.parquet.hadoop.CodecFactory$BytesCompressor.<init>(CodecFactory.java:92)
    at org.apache.parquet.hadoop.CodecFactory.getCompressor(CodecFactory.java:169)
    at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:303)
    at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:262)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetFileFormat.scala:562)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anon$1.newInstance(ParquetFileFormat.scala:139)
    at org.apache.spark.sql.execution.datasources.BaseWriterContainer.newOutputWriter(WriterContainer.scala:131)
    at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:247)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
    at org.apache.spark.scheduler.Task.run(Task.scala:86)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

Answer 1:


This error occurs because of a version mismatch between Google's Guava library and Spark. Spark shades Guava, but many other libraries also use Guava, so a different Guava version can end up on the classpath. You can try shading the Guava dependencies as per this post on the Apache-Spark-User-List.
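
Before changing the build, it can help to confirm which Guava jar the JVM actually loads. The following is a minimal sketch rather than part of the original answer: `GuavaProbe`, `jarOf`, and `hasMethod` are hypothetical names, and it uses only standard JVM reflection. It assumes you run it with the same classpath as the failing Spark application.

```scala
import java.util.concurrent.TimeUnit
import scala.util.Try

// Hypothetical helper (not from the original answer): reports where a class
// was loaded from and whether it exposes a given public method.
object GuavaProbe {

  /** Jar or directory a class was loaded from, if the JVM reports one.
    * Returns None when the class is missing or has no code source
    * (for example, classes loaded by the bootstrap class loader). */
  def jarOf(className: String): Option[String] =
    for {
      cls    <- Try(Class.forName(className)).toOption
      domain <- Option(cls.getProtectionDomain)
      source <- Option(domain.getCodeSource)
      loc    <- Option(source.getLocation)
    } yield loc.toString

  /** True if the class can be loaded and has a public method with this
    * name and exactly these parameter types. */
  def hasMethod(className: String, method: String, params: Class[_]*): Boolean =
    Try(Class.forName(className).getMethod(method, params: _*)).isSuccess

  def main(args: Array[String]): Unit = {
    val builder = "com.google.common.cache.CacheBuilder"
    println(s"$builder loaded from: ${jarOf(builder).getOrElse("<not found or unknown>")}")
    // refreshAfterWrite(long, TimeUnit) is the method named in the NoSuchMethodError.
    val ok = hasMethod(builder, "refreshAfterWrite", java.lang.Long.TYPE, classOf[TimeUnit])
    println(s"has refreshAfterWrite(long, TimeUnit): $ok")
  }
}
```

If the printed location is an unexpectedly old Guava jar and the method check is false, that jar is the likely culprit, and its origin can be traced in the output of `mvn dependency:tree`.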




Answer 2:


I am getting this error in Spring Boot: java.lang.TypeNotPresentException: Type com.google.common.cache.CacheBuilderSpec com.google.common.cache.CacheBuilder.build()Lcom/google/common/cache/Cache

The issue is caused by the "com.google.guava:guava" library. In Spring Boot this library is pulled in transitively by some other dependency, possibly "spring-boot-starter-web", "springfox-swagger2", or spring-data-mongodb, so we first need to exclude Guava from the springfox-swagger2 jar and then add an updated version of Guava.

Solution:

1. Add the guava dependency at the top of the dependency list so that Spring Boot picks up the latest version:

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>19.0</version>
</dependency>

2. Find the Spring Boot dependency that includes the "guava" artifact, exclude "guava" from that dependency, and then add the guava dependency as above.
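
The exclusion described in the last step could look like the fragment below. This is a sketch under assumptions: it uses springfox-swagger2 (one of the candidates named above) as the dependency that drags in Guava, and `${springfox.version}` is a placeholder property, not something from the original answer. Substitute whichever dependency `mvn dependency:tree -Dincludes=com.google.guava` reports in your project.

```xml
<dependency>
    <groupId>io.springfox</groupId>
    <artifactId>springfox-swagger2</artifactId>
    <!-- placeholder property; use the version your project already declares -->
    <version>${springfox.version}</version>
    <exclusions>
        <!-- drop the transitive Guava so the explicitly declared one wins -->
        <exclusion>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```

An exclusion needs no version: it removes the artifact from that dependency's transitive graph regardless of which version was requested.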



Answer 3:


Adding the shade plugin to your pom file and relocating the Google package can resolve this issue.

More information can be found here and here.

  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.2.1</version>
    <executions>
      <execution>
        <phase>package</phase>
        <goals>
          <goal>shade</goal>
        </goals>
        <configuration>
          <relocations>
            <relocation>
              <pattern>com.google.common</pattern>
              <shadedPattern>shade.com.google.common</shadedPattern>
            </relocation>
          </relocations>
        </configuration>
      </execution>
    </executions>
  </plugin>

If that does not help either, adding the Guava library at version 15.0 works nicely. The workaround relies on dependencyManagement, which lets you pin the version Maven uses for a transitive dependency. The nice SO answer is here.

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>15.0</version>
        </dependency>
    </dependencies>
</dependencyManagement>


Source: https://stackoverflow.com/questions/49713485/spark-error-with-google-guava-library-java-lang-nosuchmethoderror-com-google-c
