How to load Spark Cassandra Connector in the shell?

长情又很酷 · 2020-12-07 16:16

I am trying to use the Spark Cassandra Connector in Spark 1.1.0.

I have successfully built the jar file from the master branch on GitHub and have gotten the included demos working. However, I can't figure out how to load the connector's classes into the spark-shell.

6 Answers
  •  野趣味 (OP)
     2020-12-07 16:40

    Edit: Things are a bit easier now

    For in-depth instructions, check out the project documentation: https://github.com/datastax/spark-cassandra-connector/blob/master/doc/13_spark_shell.md

    Or feel free to use Spark-Packages to load the library (not all versions are published): http://spark-packages.org/package/datastax/spark-cassandra-connector

    > $SPARK_HOME/bin/spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.10:1.4.0-M3-s_2.10
    
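    For example, once the shell starts with the package loaded, you can point it at a cluster and read a table right away. This is a minimal sketch, assuming a Cassandra node is reachable at 127.0.0.1 and that the keyspace and table below (placeholder names) already exist:

    > $SPARK_HOME/bin/spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.10:1.4.0-M3-s_2.10 --conf spark.cassandra.connection.host=127.0.0.1

    scala> import com.datastax.spark.connector._
    scala> val rdd = sc.cassandraTable("my_keyspace", "my_table") // placeholder keyspace/table
    scala> rdd.count
    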

    The following assumes you are running with OSS Apache C*

    You'll want to start the shell with --driver-class-path set to include all your connector libs.

    I'll quote a blog post from the illustrious Amy Tobey:

    The easiest way I’ve found is to set the classpath and then restart the context in the REPL, with the necessary classes imported, to make sc.cassandraTable() visible. The newly loaded methods will not show up in tab completion. I don’t know why.

      /opt/spark/bin/spark-shell --driver-class-path $(echo /path/to/connector/*.jar |sed 's/ /:/g')
    

    It will print a bunch of log information and then present a scala> prompt. (The sed in the command above turns the space-separated list of connector jars into a colon-separated classpath.)

    scala> sc.stop
    

    Now that the context is stopped, it’s time to import the connector.

    scala> import org.apache.spark.{SparkConf, SparkContext}
    scala> import com.datastax.spark.connector._
    scala> val conf = new SparkConf()
    scala> conf.set("spark.cassandra.connection.host", "node1.pc.datastax.com") // property name used by connector 1.x
    scala> val sc = new SparkContext("local[2]", "Cassandra Connector Test", conf)
    scala> val table = sc.cassandraTable("keyspace", "table")
    scala> table.count
    
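    The same import also brings the write API into scope. As a minimal sketch, assuming a table test.words (word text PRIMARY KEY, count int) already exists, you can save an RDD of tuples back to Cassandra:

    scala> val words = sc.parallelize(Seq(("cat", 30), ("fox", 40))) // hypothetical sample data
    scala> words.saveToCassandra("test", "words", SomeColumns("word", "count"))
    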

    If you are running with DSE < 4.5.1

    There is a slight issue with the DSE class loader and previous package-naming conventions that will prevent you from finding the new spark-connector libraries. You should be able to get around this by removing the line specifying the DSE class loader in the script that starts spark-shell.
