datastax-enterprise

OpsCenter backup to S3 location fails

故事扮演 submitted on 2019-12-08 03:30:06
Question: Using OpsCenter 5.1.1 and DataStax Enterprise 4.5.1 on a 3-node cluster in AWS, I set up a scheduled backup to the local server and also to a bucket in S3. The On Server backup finished successfully on all 3 nodes. The S3 backup runs slowly and fails on all 3 nodes. Some keyspaces are backed up and files are created in the S3 bucket, but it appears that not all tables are backed up. Looking at /var/log/opscenter/opscenterd.log, I see an OOM error. Why should there be an out-of-memory error when writing to S3?

Cqlsh with client-to-node SSL encryption

拥有回忆 submitted on 2019-12-08 03:18:55
Question: I am trying to enable client-to-node SSL encryption on my DSE server. My cqlshrc file looks like below:
[connection]
hostname = 127.0.0.1
port = 9160
factory = cqlshlib.ssl.ssl_transport_factory
[ssl]
certfile = /path/to/dse_node0.cer
validate = true ;; Optional, true by default.
[certfiles] ;; Optional section, overrides the default certfile in the [ssl] section.
1.2.3.4 = /path/to/dse_node0.cer
When I try to log in to the cqlsh shell, I get the below error: Connection error: Could not

DSE - Cassandra: Commit Log Disk Impact on Performance

不羁的心 submitted on 2019-12-08 03:06:28
I'm running a DSE 4.6.5 cluster (Cassandra 2.0.14.352). Following DataStax's guidelines, on every machine I separated the data directory from the commit log / saved caches directories: data is on blazing-fast drives, while the commit log and saved caches are on the system drives (2 HDD in RAID 1). Monitoring disks with OpsCenter while performing intensive writes, I see no issue with the former; however, I see the queue size of the latter (commit log) averaging around 300 to 400, with spikes up to 700 requests. Of course the latency is also fairly high on these drives... Is this affecting the performance of my

Performance of token range based queries on partition keys?

南笙酒味 submitted on 2019-12-07 16:29:21
Question: I am selecting all records from the Cassandra nodes based on the token ranges of my partition key. Below is the code:
public static synchronized List<Object[]> getTokenRanges(final Session session) {
    if (cluster == null) {
        cluster = session.getCluster();
    }
    Metadata metadata = cluster.getMetadata();
    return unwrapTokenRanges(metadata.getTokenRanges());
}
private static List<Object[]> unwrapTokenRanges(Set<TokenRange> wrappedRanges) {
    final int tokensSize = 2;
    List<Object[]> tokenRanges = new ArrayList<
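For context, a minimal sketch (not taken from the question) of the kind of per-range scan this approach typically drives, assuming a hypothetical table ks.users with partition key id and example Murmur3 token boundaries:
-- one bounded query per unwrapped token range; the union of all ranges covers the full ring
SELECT id, name
FROM ks.users
WHERE token(id) > -9223372036854775808
  AND token(id) <= -3074457345618258603;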

Cassandra ALLOW FILTERING

倾然丶 夕夏残阳落幕 submitted on 2019-12-07 02:23:13
Question: I have a table as below:
CREATE TABLE test (
    day int,
    id varchar,
    start int,
    action varchar,
    PRIMARY KEY ((day), start, id)
);
I want to run this query:
SELECT * FROM test WHERE day = 1 AND start > 1475485412 AND start < 1485785654 AND action = 'accept' ALLOW FILTERING;
Is this ALLOW FILTERING efficient? I am expecting that Cassandra will filter in this order: 1. by the partitioning column (day); 2. by the range column (start) on the result of 1; 3. by the action column on the result of 2. So the allow filtering will not
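A common alternative that avoids ALLOW FILTERING for this access pattern is to include action in the partition key; a minimal sketch (the table name test_by_action is an assumption for illustration, not from the question):
CREATE TABLE test_by_action (
    day int,
    action varchar,
    start int,
    id varchar,
    PRIMARY KEY ((day, action), start, id)
);
-- served by a single partition plus a clustering-range slice, so no post-read filtering is needed
SELECT * FROM test_by_action
WHERE day = 1 AND action = 'accept'
  AND start > 1475485412 AND start < 1485785654;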

Data modeling in Cassandra with columns that can be text or numbers

百般思念 submitted on 2019-12-06 21:25:27
I have a table with 5 columns:
1. ID - a number, but it can be stored as text or number
2. name - text
3. date - a date value, but it can be stored as date or text
4. time - a number, but it can be stored as text or number
5. rating - a number, but it can be stored as text or number
I want to find which data types will make my table faster for writes. How can I find out? Is there any cassandra-stress YAML for this?
Brice: Regarding the answer that @BryceAtNetwork23 provided, it will be the same with Cassandra 2.1 or Cassandra 2.2 (but Cassandra 3.0 will probably be a different story, as the team is currently rewriting the storage
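To make the write-performance comparison concrete, a minimal CQL sketch of the two schema variants being weighed (table names and the exact native types chosen are assumptions for illustration); either definition can then be exercised from a cassandra-stress user profile:
-- Variant A: native types
CREATE TABLE perf_test_native (
    id bigint PRIMARY KEY,
    name text,
    date timestamp,
    time bigint,
    rating int
);
-- Variant B: everything stored as text
CREATE TABLE perf_test_text (
    id text PRIMARY KEY,
    name text,
    date text,
    time text,
    rating text
);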

Normal Query on Cassandra using DataStax Enterprise works, but not solr_query

£可爱£侵袭症+ submitted on 2019-12-06 16:42:25
I am having a strange issue while using the solr_query handler to run queries against Cassandra from my terminal. When I perform normal queries on my table I have no issues, but when I use solr_query I get the following error: Unable to complete request: one or more nodes were unavailable. Other people who have experienced this problem seem unable to run any queries on their data whatsoever, whether or not they use solr_query; my problem only occurs while using that handler. Can anyone suggest what the issue may be with my Solr node? ALSO -- I can do queries off
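For reference, a minimal sketch of the kind of solr_query statement involved (the keyspace, table, and field names are assumptions for illustration):
SELECT * FROM ks.my_table
WHERE solr_query = 'name:*'
LIMIT 10;
solr_query is the pseudo-column DSE Search uses to route a CQL read through Solr, which is why it can fail even when plain CQL reads against the same table succeed.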

Two-node DSE Spark cluster error when setting up the second node. Why?

醉酒当歌 submitted on 2019-12-06 15:33:15
I have a DSE Spark cluster with 2 nodes. One DSE Analytics node with Spark cannot start after I install it; without Spark it starts just fine. But on my other node Spark is enabled and it starts and works just fine. Why is that, and how can I solve it? Thanks. Here is my error log:
ERROR [main] 2016-02-27 20:35:43,353 CassandraDaemon.java:294 - Fatal exception during initialization
org.apache.cassandra.exceptions.ConfigurationException: Cannot start node if snitch's data center (Analytics) differs from previous data center (Cassandra). Please fix the snitch configuration, decommission and

Solr docValues usage

无人久伴 submitted on 2019-12-06 12:58:20
Question: I am planning to try Solr's docValues to hopefully improve facet and sort performance. I have some questions about this feature: If I enable docValues, will Solr create a forward index (for faceting) in addition to a separate inverted index (for searching)? Or will Solr create a forward index ONLY (thus gaining facet performance in exchange for a loss in search performance)? If I want to both facet and search on a single field, what is the best practice? Should I set

How to test a Spark SQL Query without Scala

萝らか妹 submitted on 2019-12-06 10:23:36
I am trying to figure out how to test Spark SQL queries against a Cassandra database -- kind of like you would in SQL Server Management Studio. Currently I have to open the Spark console and type Scala commands, which is really tedious and error-prone. Something like:
scala> var query = csc.sql("select * from users");
scala> query.collect().foreach(println)
Especially with longer queries this can be a real pain. This seems like a terribly inefficient way to test whether your query is correct and what data you will get back. The other issue is that when your query is wrong you get back a mile-long error