Spark-Cassandra Connector : Failed to open native connection to Cassandra

You did not specify spark.cassandra.connection.host; by default, Spark assumes the Cassandra host is the same as the Spark master node.

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._ // brings sc.cassandraTable into scope

val conf = new SparkConf()
  .setAppName("Cassandra Demo")
  .setMaster(master)
  .set("spark.cassandra.connection.host", "192.168.101.11")
val sc = new SparkContext(conf)

val rdd = sc.cassandraTable("test", "words")
rdd.collect.foreach(println) // collect, rather than the deprecated toArray

It should work if you have properly set the seed nodes in cassandra.yaml.
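
For reference, a minimal sketch of the relevant cassandra.yaml section, assuming a single seed node at the example address used above:

seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      # Comma-separated list of seed node addresses (placeholder IP).
      - seeds: "192.168.101.11"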

I struggled with this issue overnight and finally got a combination that works. I am writing it down for those who may run into a similar issue.

First of all, this is a version issue with the cassandra-driver-core dependency, but tracking down the exact combination that works took me quite a bit of time.

Secondly, this is the combination that works for me.

  1. Spark 1.6.2 with Hadoop 2.6, Cassandra 2.1.5 (Ubuntu 14.04, Java 1.8);
  2. in build.sbt (sbt assembly, scalaVersion := "2.10.5"), use

"com.datastax.spark" %% "spark-cassandra-connector" % "1.4.0", "com.datastax.cassandra" % "cassandra-driver-core" % "2.1.5"

Thirdly, let me clarify my frustrations. With spark-cassandra-connector 1.5.0, I can run the assembly via spark-submit with --master "local[2]" on the same machine, connecting to a remote Cassandra, without any problem. Any combination of connector 1.5.0 or 1.6.0 with Cassandra 2.0, 2.1, 2.2, or 3.4 works well. But if I try to submit the job to a cluster from the same machine (a NodeManager) with --master yarn --deploy-mode cluster, then I always run into the problem: Failed to open native connection to Cassandra at {192.168.122.12}:9042
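
For reference, a submission along these lines reproduces the failure; the main class and jar name below are placeholders:

# Submit the fat jar built with sbt assembly to YARN in cluster mode.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.CassandraDemo \
  my-app-assembly.jar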

What is going on here? Can anyone from DataStax take a look at this issue? I can only guess it has something to do with "cqlversion", which should match the version of the Cassandra cluster.

Does anybody know a better solution? [cassandra], [apache-spark]

The issue is resolved. It was due to a mix-up with the dependencies: I built a jar with dependencies (a fat jar) and passed it to spark-submit, instead of specifying the dependent jars separately.
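
If it helps, building such a fat jar with sbt typically just needs the sbt-assembly plugin enabled; a sketch, assuming sbt-assembly (the version shown is only an example):

// project/plugins.sbt -- enables the `sbt assembly` task that produces a jar with dependencies
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")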

It finally worked. Steps (a cassandra.yaml sketch follows the list):

  1. Set listen_address to the private IP of the EC2 instance.
  2. Do not set any broadcast_address.
  3. Set rpc_address to 0.0.0.0.
  4. Set broadcast_rpc_address to the public IP of the EC2 instance.
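
Put together, the relevant cassandra.yaml settings would look like this; the IPs are placeholders for your instance's actual addresses:

listen_address: 172.31.0.10         # private IP of the EC2 instance (placeholder)
# broadcast_address:                # intentionally left unset
rpc_address: 0.0.0.0                # accept client connections on all interfaces
broadcast_rpc_address: 54.210.1.2   # public IP of the EC2 instance (placeholder)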

This is an issue with the version of the cassandra-driver-core jar dependency.

The provided Cassandra version is 2.0.
The provided cassandra-driver-core jar version is 2.1.5.

The jar version should match the version of Cassandra that is running.

In this case, the included jar file should be cassandra-driver-core-2.0.0.jar
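
In sbt terms, that means pinning the driver explicitly; a sketch, assuming your build declares the driver directly:

// Pin the driver to the 2.0.x line so it matches the Cassandra 2.0 cluster.
libraryDependencies += "com.datastax.cassandra" % "cassandra-driver-core" % "2.0.0"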