How to resolve the conflict between Guava 11.0.2 and 16.0 when using YARN, Spark and the spark-cassandra-connector?

Submitted by 試著忘記壹切 on 2019-12-14 03:56:01

Question


My YARN version is hadoop-2.4.0.x, my Spark is spark-1.5.1-bin-hadoop2.4, and the connector is spark-cassandra-connector_2.10-1.5.0-M2. I executed the following command:

bin/spark-shell --driver-class-path $(echo lib/*.jar | sed 's/ /:/g') --master yarn-client \
--deploy-mode client --conf spark.cassandra.connection.host=192.21.0.209 \
--conf spark.cassandra.auth.username=username --conf spark.cassandra.auth.password=password --conf spark.sql.dialect=sql \
--jars lib/guava-16.0.jar,spark-cassandra-connector_2.10-1.5.0-M2.jar,lib/cassandra-driver-core-2.2.0-rc3.jar

After the shell started, I entered the following Scala at the prompt:

import org.apache.spark.sql.cassandra.CassandraSQLContext
import org.apache.spark.sql.{DataFrame, SaveMode}
import org.apache.spark.{Logging, SparkConf, SparkContext}
import org.joda.time.{DateTime, Days, LocalDate}
val cc = new CassandraSQLContext(sc)

val rdd: DataFrame = cc.sql("select user_id,tag_models,dmp_province," +
"zp_gender,zp_age,zp_edu,stg_stage,zp_income,type " +
"from user_center.users_test") 

I got the classic error:

Caused by: java.lang.NoSuchMethodError:  
com.google.common.util.concurrent.Futures.withFallback
(Lcom/google/common/util/concurrent/ListenableFuture;
Lcom/google/common/util/concurrent/FutureFallback;
Ljava/util/concurrent/Executor;)
Lcom/google/common/util/concurrent/ListenableFuture;

After searching for this error on Google and Stack Overflow, I learned that it is caused by a conflict between different versions of Guava: Hadoop 2.4 uses guava-11.0.2, but spark-cassandra-connector_2.10-1.5.0-M2 uses guava-16.0.1.
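A quick diagnostic, not part of the original post, that you can paste into spark-shell to confirm which jar actually supplied the Guava classes at runtime:

// Ask the JVM where the conflicting class was loaded from.
// Futures.withFallback does not exist in guava-11.0.2, so if this prints a
// Hadoop lib path, the old Guava won the classpath race.
val guavaJar = classOf[com.google.common.util.concurrent.Futures]
  .getProtectionDomain.getCodeSource.getLocation
println(guavaJar)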

How can I resolve this kind of error? Any advice will be appreciated!

UPDATE

I am sorry for taking so long to test!

Now, for spark-submit, I tested the resolution from Making Hadoop 2.6 + Spark-Cassandra Driver Play Nice Together and it worked on my test YARN cluster.
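For reference, a sketch of the classpath-prepending form of that fix: spark.driver.extraClassPath and spark.executor.extraClassPath are standard Spark properties that prepend entries to the respective classpaths, but the guava jar path and the application class/jar below are placeholders, not values from my cluster:

# Placeholder paths and app name; prepend the newer Guava on driver and executors.
bin/spark-submit --master yarn-client \
  --conf spark.driver.extraClassPath=/opt/libs/guava-16.0.1.jar \
  --conf spark.executor.extraClassPath=/opt/libs/guava-16.0.1.jar \
  --jars lib/spark-cassandra-connector_2.10-1.5.0-M2.jar,lib/cassandra-driver-core-2.2.0-rc3.jar \
  --class com.example.MyApp my-app.jar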


Answer 1:


In your Hadoop configuration, add the following property to hadoop-env.sh:

export HADOOP_USER_CLASSPATH_FIRST=true

In the Spark configuration there is also a property you can set to true, spark.driver.userClassPathFirst, but it is still experimental and applies only in cluster mode (have a look at the Spark documentation). Personally I have not tried this property, but since it appears in the documentation I thought it was worth mentioning.
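If you want to try it, a hedged sketch of what that would look like; the two userClassPathFirst keys are real Spark configuration properties, while the application class and jar are placeholders:

# Driver-side flag only takes effect in cluster mode, hence yarn-cluster here
# rather than the yarn-client mode from the question.
bin/spark-submit --master yarn-cluster \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  --jars lib/guava-16.0.jar,lib/spark-cassandra-connector_2.10-1.5.0-M2.jar \
  --class com.example.MyApp my-app.jar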



Source: https://stackoverflow.com/questions/36004072/how-to-resolve-the-conflict-between-11-0-2-and-16-0-of-guava-when-using-yarn-sp
