datastax-enterprise

Can I use regular Spark Core application library to talk to DSE Spark?

Submitted by 梦想与她 on 2020-01-25 21:09:13

Question: A couple of days ago we upgraded from DSE 4.6 to DSE 4.7. We run our Spark jobs from Java, so I upgraded the spark-core_2.10 Maven dependency from 0.9.1 to 1.2.2 to match the newer Spark 1.2.2 that DSE bundles. However, when I now submit jobs to the master it logs ERROR [sparkMaster-akka.actor.default-dispatcher-2] 2015-11-17 17:50:42,442 Slf4jLogger.scala:66 - org.apache.spark.deploy.ApplicationDescription; local class incompatible: stream classdesc serialVersionUID =

How do I disable autocompaction in `cassandra.yaml`?

Submitted by 给你一囗甜甜゛ on 2020-01-25 09:41:08

Question: https://stackoverflow.com/a/47837940/260805 hints that it should be possible. I would like to disable it for a longer period of time (~2 days) while enabling incremental repairs. Answer 1: (Disclaimer: I'm a ScyllaDB employee) As far as I know you can disable autocompaction in the following ways: For a column family (table), by setting its strategy to NullCompactionStrategy (I think this one is supported only in Scylla, not in Cassandra). Using nodetool: $ nodetool <options>
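The answer above covers the per-table strategy and nodetool routes. As an additional illustration (not part of the original answer), plain Cassandra also lets you switch compaction off per table through the 'enabled' sub-option of the compaction settings; the sketch below issues that CQL through the 3.x-era DataStax Java driver, since the other examples on this page are Java. The contact point, keyspace and table names are placeholders.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class DisablePerTableCompaction {
    public static void main(String[] args) {
        // Contact point, keyspace and table names are placeholders.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // Keep the existing strategy class but switch compaction off for this
            // table; setting 'enabled' back to 'true' re-enables it later.
            session.execute("ALTER TABLE my_keyspace.my_table WITH compaction = "
                    + "{'class': 'SizeTieredCompactionStrategy', 'enabled': 'false'}");
        }
    }
}

The node-wide counterpart is nodetool disableautocompaction (and enableautocompaction to turn it back on), which fits the ~2-day window mentioned in the question.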

How to control processing of spark-stream while there is no data in Kafka topic

Submitted by 此生再无相见时 on 2020-01-25 06:48:50

Question: I am using spark-sql 2.4.1, spark-cassandra-connector_2.11-2.4.1.jar and Java 8. I have a Cassandra table like this: CREATE TABLE company(company_id int, start_date date, company_name text, PRIMARY KEY (company_id, start_date)) WITH CLUSTERING ORDER BY (start_date DESC); The field start_date is a derived field, calculated in the business logic. I have spark-sql streaming code in which I call the mapFunction below. public static MapFunction<Company, CompanyTransformed> mapFunInsertCompany =
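The excerpt is cut off before the mapFunction body, but the title's actual concern, avoiding work when the Kafka topic is idle, can be handled at the sink. Below is a minimal sketch under that assumption (not the poster's code): it uses foreachBatch, available from Spark 2.4, to inspect each micro-batch and skip the empty ones. The broker, topic, keyspace and table names are placeholders.

import org.apache.spark.api.java.function.VoidFunction2;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SkipEmptyBatches {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder().appName("skip-empty-batches").getOrCreate();

        // Hypothetical Kafka source; broker and topic names are placeholders.
        Dataset<Row> stream = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("subscribe", "company_topic")
                .load();

        // foreachBatch (Spark 2.4+) hands each micro-batch over as a plain Dataset,
        // so an idle Kafka topic simply yields empty batches that can be skipped.
        VoidFunction2<Dataset<Row>, Long> handler = (batch, batchId) -> {
            if (batch.isEmpty()) {
                return; // nothing arrived in this trigger interval
            }
            // ...derive fields here (e.g. the computed start_date) before writing...
            batch.write()
                 .format("org.apache.spark.sql.cassandra")
                 .option("keyspace", "my_keyspace")  // placeholder
                 .option("table", "company")         // placeholder
                 .mode("append")
                 .save();
        };

        stream.writeStream().foreachBatch(handler).start().awaitTermination();
    }
}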

What is best approach to join data in spark streaming application?

Submitted by 我怕爱的太早我们不能终老 on 2020-01-23 17:19:37

Question: Essentially, rather than running a join against the C* table for each streaming record, is there any way to run a join per micro-batch of records in Spark Streaming? We have almost finalized on spark-sql 2.4.x and the datastax-spark-cassandra-connector for Cassandra 3.x, but have one fundamental question regarding efficiency in the scenario below. For the streaming data records (i.e. streamingDataSet), I need to look up existing
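One way to get exactly this per-micro-batch behaviour is a stream-static join: read the C* table as a static Dataset through the connector and join it with the streaming Dataset, and Spark evaluates the join once per micro-batch rather than per record. The sketch below is an illustration under that assumption, not the poster's code; keyspace, table, topic and column names are placeholders.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class StreamStaticJoin {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder().appName("stream-static-join").getOrCreate();

        // Static side: the existing C* table, read through the connector.
        // Keyspace and table names are placeholders.
        Dataset<Row> companies = spark.read()
                .format("org.apache.spark.sql.cassandra")
                .option("keyspace", "my_keyspace")
                .option("table", "company")
                .load();

        // Streaming side: hypothetical Kafka source, reduced here to a single
        // company_id column so there is something to join on.
        Dataset<Row> streamingDataSet = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("subscribe", "company_topic")
                .load()
                .selectExpr("CAST(CAST(key AS STRING) AS INT) AS company_id",
                            "CAST(value AS STRING) AS payload");

        // Stream-static join: evaluated once per micro-batch, not once per record.
        Dataset<Row> joined = streamingDataSet.join(
                companies,
                streamingDataSet.col("company_id").equalTo(companies.col("company_id")),
                "left_outer");

        joined.writeStream().format("console").start().awaitTermination();
    }
}

Note that the static Cassandra side is re-read on each micro-batch, so for large lookup tables it may be worth caching it or using the connector's joinWithCassandraTable at the RDD level instead.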

Solr query issue with Faceting and Stats in DSE

Submitted by 独自空忆成欢 on 2020-01-16 18:02:06

Question: Query: http://localhost:8983/solr/trackfleet_db.location/select?q=*:*&facet=true&facet.pivot={!stats=piv1}date,latitude,longitude&stats=true&stats.field={!tag=piv1}gpsdt When I execute this query against a standalone Solr instance (not part of DSE) it works fine, but against DSE (I am now using the Solr built into DSE) it does not return anything. And when I execute the query with curl it gives the following error: <?xml version="1.0" encoding="UTF

Stress Cassandra instance on EC2 from local

Submitted by 时光怂恿深爱的人放手 on 2020-01-14 05:55:07

Question: I would appreciate some help on how to stress a Cassandra instance running on EC2 from my local machine (using the cassandra-stress utility). Cluster on EC2: five nodes running DSE 4.6. Local machine: cassandra-stress as included in Cassandra 2.1.2. After changing the Security Group, the stress utility invoked from my local machine is able to connect to the given instance on EC2. I allowed inbound TCP connections on ports 9160 and 9042 from my local machine's IP. sh cassandra-stress write -node 54.xxx

Handling Exceptions for Deferred Tasks in Cassandra

Submitted by 久未见 on 2020-01-07 06:40:57

Question: I've looked at numerous posts on task exception handling and I'm not clear how to resolve this issue. My application crashes any time there is an exception in one of my tasks. I am unable to catch the exception and my application is left in an inoperable state. UPDATE: This only seems to happen when calling the ExecuteAsync method of the DataStax Cassandra C# driver, which leads me to believe it's an issue in the driver itself. When I create my own task and throw an exception, it works fine. Most
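The post is about the C# driver, but since the other examples on this page are Java, here is the equivalent pattern with the 3.x-era DataStax Java driver as a point of comparison (an analogue, not the poster's code): the failure of an asynchronous execution is observed by attaching a callback to the returned future rather than letting the exception go unhandled.

import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.MoreExecutors;

public class AsyncErrorHandling {

    // Assumes an already-connected Session from a 3.x-era Java driver.
    static void queryAsync(Session session) {
        ResultSetFuture future = session.executeAsync("SELECT release_version FROM system.local");

        // Attaching a callback observes the failure, so a failed query surfaces
        // here instead of propagating as an unhandled exception elsewhere.
        Futures.addCallback(future, new FutureCallback<ResultSet>() {
            @Override
            public void onSuccess(ResultSet rs) {
                System.out.println(rs.one().getString("release_version"));
            }

            @Override
            public void onFailure(Throwable t) {
                System.err.println("Query failed: " + t.getMessage());
            }
        }, MoreExecutors.directExecutor());
    }
}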

dse-driver connection refused

Submitted by ◇◆丶佛笑我妖孽 on 2020-01-07 06:14:15

Question: I am trying to connect to my DataStax Enterprise Cassandra installation on a server. When I try to connect I receive an error: Cassandra connection error { [Error: All host(s) tried for query failed. First host tried, XX.XX.XX.XX:9042: Error: connect ECONNREFUSED XX.XX.XX.XX:9042. See innerErrors.] innerErrors: { 'XX.XX.XX.XX:9042': { Error: connect ECONNREFUSED XX.XX.XX.XX:9042 at Object.exports._errnoException (util.js:896:11) at exports._exceptionWithHostPort (util.js:919:20) at TCPConnectWrap

Exception in thread “main” java.lang.NoClassDefFoundError: com/twitter/chill/KryoBase

Submitted by 戏子无情 on 2020-01-05 07:58:08

Question: I am writing a simple spark-cassandra program in Java with DataStax Cassandra, but I am getting the exception below: Exception in thread "main" java.lang.NoClassDefFoundError: com/twitter/chill/KryoBase Caused by: java.lang.ClassNotFoundException: com.twitter.chill.KryoBase pom.xml <dependency> <groupId>com.datastax.dse</groupId> <artifactId>dse-spark-dependencies</artifactId> <version>5.1.1</version> <exclusions> <exclusion> <groupId>com.datastax.dse</groupId> <artifactId>dse-java-driver-core<

All masters are unresponsive!? Spark master is not responding with DataStax architecture

Submitted by 核能气质少年 on 2020-01-03 04:35:10

Question: I tried using both spark-shell and spark-submit and am getting this exception: Initializing SparkContext with MASTER: spark://1.2.3.4:7077 ERROR 2015-06-11 14:08:29 org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up. WARN 2015-06-11 14:08:29 org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend: Application ID is not initialized yet. ERROR 2015-06-11 14:08:30 org.apache.spark.scheduler.TaskSchedulerImpl
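That error normally means the driver registered with a master URL that does not exactly match what the DSE Spark master advertises, or that the client's Spark version differs from the cluster's. The sketch below only illustrates pinning the master explicitly; the address is a placeholder, and on DSE the safer route is to launch through DSE's own launcher (dse spark, and dse spark-submit in later versions), which fills in the master automatically.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class DseSparkSmokeTest {
    public static void main(String[] args) {
        // Placeholder master URL: it must match exactly what the DSE Spark master
        // advertises, and the client's Spark version must match the cluster's,
        // otherwise registration is dropped and the driver eventually reports
        // "All masters are unresponsive! Giving up."
        SparkConf conf = new SparkConf()
                .setAppName("dse-spark-smoke-test")
                .setMaster("spark://1.2.3.4:7077");

        JavaSparkContext sc = new JavaSparkContext(conf);
        System.out.println("Connected to Spark " + sc.version());
        sc.stop();
    }
}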