Question
I am using cassandra-connector-assembly-2.0.0, built from the GitHub project, with Scala 2.11.8 and cassandra-driver-core-3.1.0. The job:
sc.cassandraTable("mykeyspace", "mytable")
  .select("something")
  .where("key=?", key)
  .mapPartitions(par => par.map(row => (row.getString("something"), 1)))
  .reduceByKey(_ + _)
  .collect()
  .foreach(println)
The same job works fine when reading a smaller amount of data, but on a large dataset it fails with the following exception:
java.lang.NoSuchMethodError: com.datastax.driver.core.ResultSet.fetchMoreResults()Lshade/com/datastax/spark/connector/google/common/util/concurrent/ListenableFuture;
at com.datastax.spark.connector.rdd.reader.PrefetchingResultSetIterator.maybePrefetch(PrefetchingResultSetIterator.scala:26)
at com.datastax.spark.connector.rdd.reader.PrefetchingResultSetIterator.next(PrefetchingResultSetIterator.scala:39)
at com.datastax.spark.connector.rdd.reader.PrefetchingResultSetIterator.next(PrefetchingResultSetIterator.scala:17)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.Iterator$$anon$12.next(Iterator.scala:444)
at com.datastax.spark.connector.util.CountingIterator.next(CountingIterator.scala:16)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:194)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
at org.apache.spark.scheduler.Task.run(Task.scala:85)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Can anyone point out the issue and suggest a possible solution?
Answer 1:
It is a conflict with the Cassandra driver-core that
libraryDependencies += "com.datastax.spark" % "spark-cassandra-connector_2.11" % "2.0.0-M3"
brings in.
If you go into ~/.ivy2/cache/com.datastax.spark/spark-cassandra-connector_2.11, you will find a file called ivy-2.0.0-M3.xml.
In that file the dependency is declared as:
org="com.datastax.cassandra" name="cassandra-driver-core" rev="3.0.2" force="true"
Note that it is the 3.0.2 version of the Cassandra driver core, and it gets overridden by the more recent one on your classpath.
It just so happens that the latest source on GitHub does not show an implementation for fetchMoreResults, which is inherited from the interface PagingIterable.
If you roll back to the 3.0.x version on GitHub, you'll find
public ListenableFuture<ResultSet> fetchMoreResults();
So it looks like the newest Cassandra core drivers were rushed out the door incomplete. Or I might be missing something. Hope this helps.
tl;dr: Remove the latest driver and use the one embedded in the spark-cassandra-connector.
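In sbt terms, that advice looks roughly like the following (a minimal sketch; the connector coordinates are the ones from this answer, while "other.org" % "other-artifact" is a placeholder for whatever dependency drags in its own driver, not a real coordinate):

```scala
// build.sbt — minimal sketch. Let the connector supply its own (shaded) driver;
// do NOT declare cassandra-driver-core yourself.
libraryDependencies += "com.datastax.spark" % "spark-cassandra-connector_2.11" % "2.0.0-M3"

// If some other dependency pulls in its own cassandra-driver-core, exclude it there.
// ("other.org" % "other-artifact" is a hypothetical placeholder.)
libraryDependencies += ("other.org" % "other-artifact" % "1.0")
  .exclude("com.datastax.cassandra", "cassandra-driver-core")
```

With the explicit driver gone, the connector's forced 3.0.2 dependency is the only ResultSet on the classpath, so the shaded signature matches.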
Answer 2:
I had the same problem. There were two dependencies in the project which both pulled in cassandra-driver-core:
spark-cassandra-connector_2.11-2.0.0-M3
job-server-api_2.10-0.8.0-SNAPSHOT
spark-cassandra-connector expected ResultSet.fetchMoreResults to have a different return type, due to its shading of Guava:
expected: shade.com.datastax.spark.connector.google.common.util.concurrent.ListenableFuture
found: com.google.common.util.concurrent.ListenableFuture
I switched to an unshaded version of the cassandra-connector to correct the issue.
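A quick way to check which fetchMoreResults signature is actually on your classpath is plain Java reflection (a diagnostic sketch; CheckDriver and returnTypes are names I made up, and the Cassandra class in the comment only resolves when a driver jar is present):

```scala
// Diagnostic sketch: list the declared return-type names of all methods
// with a given name on a class resolved from the current classpath.
object CheckDriver {
  def returnTypes(className: String, methodName: String): Seq[String] =
    Class.forName(className).getMethods.toSeq
      .filter(_.getName == methodName)
      .map(_.getReturnType.getName)
}

// With the shaded connector jar on the classpath you would expect the shaded name:
//   CheckDriver.returnTypes("com.datastax.driver.core.ResultSet", "fetchMoreResults")
// whereas an unshaded 3.1.0 driver reports com.google.common.util.concurrent.ListenableFuture.
```

If the printed return type does not match the one the connector was compiled against, you get exactly the NoSuchMethodError above.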
Answer 3:
The problem is resolved by removing cassandra-driver-core-3.1.0-shaded.jar from spark/jars/. This is a typical Java duplicated-class conflict: confirm all the jars that are included and check whether any duplicates are involved. The solution above covers only one such case.
Answer 4:
For all problems of this kind, run the command below and check whether any overlapping dependencies exist:
mvn dependency:tree
Source: https://stackoverflow.com/questions/39034538/what-happens-nosuchmethoderror-com-datastax-driver-core-resultset-fetchmorere