datastax-enterprise

OpsCenter backup to S3 location fails

故事扮演 submitted on 2019-12-08 03:30:06
Question: Using OpsCenter 5.1.1 and DataStax Enterprise 4.5.1 on a 3-node cluster in AWS, I set up a scheduled backup to the local server and also to a bucket in S3. The On Server backup finished successfully on all 3 nodes. The S3 backup runs slowly and fails on all 3 nodes. Some keyspaces are backed up and files are created in the S3 bucket, but it appears that not all tables are backed up. Looking at /var/log/opscenter/opscenterd.log, I see an OOM error. Why should there be an out-of-memory error when writing to S3?

Cqlsh with client-to-node SSL encryption

拥有回忆 submitted on 2019-12-08 03:18:55
Question: I am trying to enable client-to-node SSL encryption on my DSE server. My cqlshrc file looks like below:
[connection]
hostname = 127.0.0.1
port = 9160
factory = cqlshlib.ssl.ssl_transport_factory
[ssl]
certfile = /path/to/dse_node0.cer
validate = true ;; Optional, true by default.
[certfiles] ;; Optional section, overrides the default certfile in the [ssl] section.
1.2.3.4 = /path/to/dse_node0.cer
When I try to log in to the cqlsh shell, I get the below error: Connection error: Could not

DSE - Cassandra: Commit Log Disk Impact on Performance

不羁的心 submitted on 2019-12-08 03:06:28
I'm running a DSE 4.6.5 cluster (Cassandra 2.0.14.352). Following DataStax's guidelines, on every machine I separated the data directory from the commit log / saved caches directories: data is on blazing-fast drives, while the commit log and saved caches are on the system drives (2 HDD in RAID 1). Monitoring disks with OpsCenter while performing intensive writes, I see no issue with the former; however, I see the queue size of the latter (commit log) averaging around 300 to 400, with spikes up to 700 requests. Of course the latency is also fairly high on these drives... Is this affecting the performance of my

Performance of token range based queries on partition keys?

南笙酒味 submitted on 2019-12-07 16:29:21
Question: I am selecting all records from the Cassandra nodes based on the token ranges of my partition key. Below is the code:
public static synchronized List<Object[]> getTokenRanges(final Session session) {
    if (cluster == null) {
        cluster = session.getCluster();
    }
    Metadata metadata = cluster.getMetadata();
    return unwrapTokenRanges(metadata.getTokenRanges());
}
private static List<Object[]> unwrapTokenRanges(Set<TokenRange> wrappedRanges) {
    final int tokensSize = 2;
    List<Object[]> tokenRanges = new ArrayList<
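For context, a minimal sketch (not taken from the question) of the kind of per-range scan this approach typically drives, assuming a hypothetical table ks.users with partition key id and example Murmur3 token boundaries:
-- one bounded query per unwrapped token range; the union of all ranges covers the full ring
SELECT id, name
FROM ks.users
WHERE token(id) > -9223372036854775808
  AND token(id) <= -3074457345618258603;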

Cassandra ALLOW FILTERING

倾然丶 夕夏残阳落幕 submitted on 2019-12-07 02:23:13
Question: I have a table as below:
CREATE TABLE test (
    day int,
    id varchar,
    start int,
    action varchar,
    PRIMARY KEY ((day), start, id)
);
I want to run this query:
SELECT * FROM test WHERE day = 1 AND start > 1475485412 AND start < 1485785654 AND action = 'accept' ALLOW FILTERING;
Is this ALLOW FILTERING efficient? I am expecting that Cassandra will filter in this order: 1. by the partitioning column (day); 2. by the range column (start) on the result of 1; 3. by the action column on the result of 2. So the allow filtering will not
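A common alternative that avoids ALLOW FILTERING for this access pattern is to include action in the partition key; a minimal sketch (the table name test_by_action is an assumption for illustration, not from the question):
CREATE TABLE test_by_action (
    day int,
    action varchar,
    start int,
    id varchar,
    PRIMARY KEY ((day, action), start, id)
);
-- served by a single partition plus a clustering-range slice, so no post-read filtering is needed
SELECT * FROM test_by_action
WHERE day = 1 AND action = 'accept'
  AND start > 1475485412 AND start < 1485785654;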

Data modeling in Cassandra with columns that can be text or numbers

百般思念 submitted on 2019-12-06 21:25:27
I have a table with 5 columns:
1. ID - a number, but it can be stored as text or number
2. name - text
3. date - a date value, but it can be stored as date or text
4. time - a number, but it can be stored as text or number
5. rating - a number, but it can be stored as text or number
I want to find which data types will make my table faster for writes. How can I find out? Is there any cassandra-stress YAML for this?
Brice: Regarding the answer that @BryceAtNetwork23 provided, it will be the same with Cassandra 2.1 or Cassandra 2.2 (but Cassandra 3.0 will probably be a different story, as the team is currently rewriting the storage
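To make the write-performance comparison concrete, a minimal CQL sketch of the two schema variants being weighed (table names and the exact native types chosen are assumptions for illustration); either definition can then be exercised from a cassandra-stress user profile:
-- Variant A: native types
CREATE TABLE perf_test_native (
    id bigint PRIMARY KEY,
    name text,
    date timestamp,
    time bigint,
    rating int
);
-- Variant B: everything stored as text
CREATE TABLE perf_test_text (
    id text PRIMARY KEY,
    name text,
    date text,
    time text,
    rating text
);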

Normal Query on Cassandra using DataStax Enterprise works, but not solr_query

£可爱£侵袭症+ submitted on 2019-12-06 16:42:25
I am having a strange issue while using the solr_query handler to run queries against Cassandra from my terminal. When I perform normal queries on my table I have no issues, but when I use solr_query I get the following error: Unable to complete request: one or more nodes were unavailable. Other people who have experienced this problem seem unable to run any queries on their data whatsoever, whether or not they use solr_query; my problem only occurs while using that handler. Can anyone suggest what the issue may be with my Solr node? ALSO -- I can do queries off
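For reference, a minimal sketch of the kind of solr_query statement involved (the keyspace, table, and field names are assumptions for illustration):
SELECT * FROM ks.my_table
WHERE solr_query = 'name:*'
LIMIT 10;
solr_query is the pseudo-column DSE Search uses to route a CQL read through Solr, which is why it can fail even when plain CQL reads against the same table succeed.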

Two-node DSE Spark cluster error when setting up the second node. Why?

醉酒当歌 submitted on 2019-12-06 15:33:15
I have a DSE Spark cluster with 2 nodes. One DSE Analytics node with Spark cannot start after I install it; without Spark it starts just fine. But on my other node Spark is enabled and it starts and works just fine. Why is that, and how can I solve it? Thanks. Here is my error log:
ERROR [main] 2016-02-27 20:35:43,353 CassandraDaemon.java:294 - Fatal exception during initialization
org.apache.cassandra.exceptions.ConfigurationException: Cannot start node if snitch's data center (Analytics) differs from previous data center (Cassandra). Please fix the snitch configuration, decommission and

Solr docValues usage

无人久伴 submitted on 2019-12-06 12:58:20
Question: I am planning to try Solr's docValues to hopefully improve facet and sort performance. I have some questions about this feature: If I enable docValues, will Solr create a forward index (for faceting) in addition to a separate inverted index (for searching)? Or will Solr create a forward index ONLY (thus gaining facet performance in exchange for a loss in search performance)? If I want to both facet and search on a single field, what is the best practice? Should I set

How to test a Spark SQL Query without Scala

萝らか妹 submitted on 2019-12-06 10:23:36
I am trying to figure out how to test Spark SQL queries against a Cassandra database -- kind of like you would in SQL Server Management Studio. Currently I have to open the Spark console and type Scala commands, which is really tedious and error-prone. Something like:
scala> var query = csc.sql("select * from users");
scala> query.collect().foreach(println)
Especially with longer queries this can be a real pain. This seems like a terribly inefficient way to test whether your query is correct and what data you will get back. The other issue is that when your query is wrong you get back a mile-long error