datastax-enterprise

DSE Solr nodes and vnodes

Submitted by 白昼怎懂夜的黑 on 2019-12-20 03:13:10
Question: The following documentation pages say that it is not recommended to use vnodes for Solr/Hadoop nodes:
http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/srch/srchIntro.html
http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/deploy/deployConfigRep.html#configReplication
What is the exact problem with using vnodes for these node types? I inherited a DSE setup in which the Search nodes all use vnodes, and I wonder if I should take down …
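The DSE documentation of that era treats Solr and analytics data centers as single-token (non-vnode) deployments. As a minimal sketch of what that means in cassandra.yaml for one such node (the token value below is illustrative; real tokens would be calculated for your ring):

# vnodes disabled: exactly one token per node, instead of num_tokens: 256
num_tokens: 1
initial_token: -9223372036854775808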

How do I see SOLR dynamic fields in CQL with Cassandra?

Submitted by 不羁的心 on 2019-12-20 03:08:25
Question: Solr dynamic fields appear as searchable in Solr and are available through the Thrift interface, but when using CQL the fields don't appear. Is there a specific search or query style that can be used to expose what the dynamic fields are and their values?
Answer 1: Dynamic fields should work through CQL3 as well, with a few caveats. You need to declare the type as a map (e.g. a dyn_ map) and create the CQL schema. Post your schema with the dynamic type declared. The dynamic part isn't inferred inside …
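A minimal sketch of the kind of CQL schema the answer describes, assuming a dynamic-field prefix of dyn_; the keyspace, table, and other column names are illustrative, not taken from the original post:

CREATE TABLE my_keyspace.documents (
    id text PRIMARY KEY,
    title text,
    -- map column backing the Solr dynamic fields that share the dyn_ prefix
    dyn_ map<text, text>
);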

Why not enable virtual nodes on a Hadoop node?

Submitted by 邮差的信 on 2019-12-20 02:54:08
Question: The page http://www.datastax.com/docs/datastax_enterprise3.2/solutions/about_hadoop says: "Before starting an analytics/Hadoop node on a production cluster or data center, it is important to disable the virtual node configuration." What will happen if I enable virtual nodes on an analytics/Hadoop node?
Answer 1: If you enable virtual nodes on a Hadoop node, it will lower the performance of small Hadoop jobs by raising the number of mappers to at least the number of virtual nodes. For example, if you use the default 256 …
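To make the truncated example concrete (the numbers here are an illustration, not the original answer's): with the default num_tokens: 256 per node, a 6-node analytics data center has 6 × 256 = 1536 virtual nodes, so even a tiny Hadoop job is split into at least 1536 map tasks, most of which do almost no work yet still pay the per-task startup cost.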

Pig & Cassandra & DataStax Splits Control

Submitted by 两盒软妹~` on 2019-12-19 09:08:14
Question: I have been using Pig with my Cassandra data to do all kinds of amazing feats of grouping that would be almost impossible to write imperatively. I am using DataStax's integration of Hadoop and Cassandra, and I have to say it is quite impressive. Hats off to those guys! I have a pretty small sandbox cluster (2 nodes) where I am putting this system through some tests. I have a CQL table with ~53M rows (about 350 bytes each), and I notice that the Mapper phase takes a very long time to grind through …
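One knob worth knowing about here, as a hedged sketch: CqlStorage accepts a split_size parameter in its location URI that sets how many rows go into each input split, which in turn drives how many map tasks the job gets. The keyspace and table names below are illustrative:

-- larger splits mean fewer, longer-running mappers over the same ~53M rows
rows = LOAD 'cql://my_keyspace/my_table?split_size=50000' USING CqlStorage();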

Starting cassandra as a service does not work for 2.0.5, sudo cassandra -f works

Submitted by ぃ、小莉子 on 2019-12-19 04:19:29
Question: When I try to start Cassandra on Ubuntu 12.04 (installed via DataStax's dsc20 package) as a service, as follows: $ sudo service cassandra start it says *could not access pidfile for Cassandra* and there are no other messages or anything in the logs. But when I run it as root (sudo cassandra -f) it works properly and Cassandra starts. While trying to debug, I found that when running as a non-root user I was getting these messages: ERROR 17:48:08,432 Exception encountered during startup …
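One common cause (an assumption here, since the post is truncated) is that a previous sudo cassandra -f run left the pid, data, and log directories owned by root, so the cassandra user that the init script switches to cannot write its pidfile. A hedged sketch of the check and fix:

# see who owns the directories the service user needs
ls -ld /var/run/cassandra /var/lib/cassandra /var/log/cassandra
# hand them back to the cassandra user if root owns them
sudo chown -R cassandra:cassandra /var/run/cassandra /var/lib/cassandra /var/log/cassandra
sudo service cassandra start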

how to return subgraph from gremlin that is in an easily consumable format for Java

Submitted by 此生再无相见时 on 2019-12-18 09:48:23
Question: I get very frustrated by very simple things when I try to do a single traversal that brings back a lot of data from DSE Graph 5.0 at once using Gremlin. In my simplified case I have: one entity with a specific uuid; the entity can have zero (hence the optional) or more types; and I need to return both the entity and its types. What I have so far works, but it is very ugly: List list = g.V().hasLabel("Entity").has("uuid","6708ec6d-4518-4159-9005-9e9d642f157e").as("entity").optional(outE("IsOfType").as("types")) …
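A hedged sketch of a shape that is usually easier to consume from Java: project() returns one Map per matched entity, holding the entity and its (possibly empty) list of types. The label and property names come from the question, but the step chain itself is an assumption, not the accepted answer (it also assumes the usual static import of __ from org.apache.tinkerpop.gremlin.process.traversal.dsl.graph):

Map<String, Object> result = g.V().hasLabel("Entity")
        .has("uuid", "6708ec6d-4518-4159-9005-9e9d642f157e")
        .project("entity", "types")                      // one map per matched vertex
        .by(__.valueMap(true))                           // the entity's own properties
        .by(__.out("IsOfType").valueMap(true).fold())    // empty list when there are no types
        .next();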

How to load Spark Cassandra Connector in the shell?

Submitted by 落花浮王杯 on 2019-12-17 17:30:20
Question: I am trying to use the Spark Cassandra Connector with Spark 1.1.0. I have successfully built the jar file from the master branch on GitHub and have gotten the included demos to work. However, when I try to load the jar into spark-shell I cannot import any of the classes from the com.datastax.spark.connector package. I have tried using the --jars option of spark-shell and adding the directory containing the jar to Java's CLASSPATH. Neither option works. In fact, when I use the - …
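For what it's worth, a hedged sketch of the usual way to load the connector into the shell: pass the assembly (fat) jar produced by sbt assembly rather than the plain package jar, since only the assembly bundles the connector's own dependencies. The path below is illustrative:

bin/spark-shell --jars /path/to/spark-cassandra-connector-assembly-1.1.0-SNAPSHOT.jar

scala> import com.datastax.spark.connector._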

how to download dse.jar

Submitted by 走远了吗. on 2019-12-14 03:42:57
Question: I am trying to use DataStax Enterprise 4.6 to write a Spark application in Java and run it in DSE's Spark analytics mode. The code that creates a Spark context using DseSparkConfHelper: SparkConf conf = DseSparkConfHelper.enrichSparkConf(new SparkConf()).setAppName("My application"); To use DseSparkConfHelper we need to import com.datastax.bdp.spark.DseSparkConfHelper, which is located in dse.jar. In my pom.xml I have included the dependency: <dependency> <groupId>com.datastax</groupId> <artifactId …
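dse.jar is not published to the public Maven repositories; it ships inside the DSE installation itself (the exact path depends on the install type). A hedged sketch of one common workaround, installing the local copy into your Maven repository so the pom can reference it; the file path, groupId, artifactId, and version below are illustrative choices, not official coordinates:

mvn install:install-file \
  -Dfile=/usr/share/dse/dse.jar \
  -DgroupId=com.datastax \
  -DartifactId=dse \
  -Dversion=4.6 \
  -Dpackaging=jar

After that, the pom can declare a normal dependency on com.datastax:dse:4.6.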

Cassandra SOLR Rolling Upgrade

Submitted by 家住魔仙堡 on 2019-12-14 03:13:54
Question: We have a cluster of 12 nodes, 6 DSE-Solr and 6 DSE-Cassandra. When upgrading from 3.0 to 3.1 we noticed that requests through the Solr interface were broken until all nodes had been upgraded. Is this limitation still present when upgrading from 3.1 to 3.2? Are there any gotchas to note when making the upgrade? The upgrade path docs say to enable the old gossip protocol until all nodes have been upgraded; is that per DC or for the entire cluster?
Answer 1: Russ, what errors are you getting …

Remote Spark Job fails: No assemblies found

Submitted by 落爺英雄遲暮 on 2019-12-13 17:22:51
Question: I am running a Spark job against a vanilla Spark (DataStax) cluster with this configuration:

val conf: SparkConf = new SparkConf()
  .setAppName("Fuzzy")
  .setMaster("spark://127.0.0.1:7077")
  .set("spark.cores.max", "2")
  .setJars(Seq("my-jar"))
val sc: SparkContext = SparkContext.getOrCreate(conf)

val NUM_SAMPLES: Int = 500
val join = sc.parallelize(1 to NUM_SAMPLES).filter { _ =>
  val x = math.random
  val y = math.random
  x*x + y*y < 1
}.count()
println(s"Pi is roughly ${4.0 * join / NUM_SAMPLES}")

This is a Spark …
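As a hedged aside on the snippet above: setJars expects paths or URLs to real jar files that the master and workers can fetch, so the "my-jar" placeholder would normally be the application's own assembly jar (plus any extra dependencies). The path in this sketch is an assumption, not taken from the post:

val conf: SparkConf = new SparkConf()
  .setAppName("Fuzzy")
  .setMaster("spark://127.0.0.1:7077")
  .set("spark.cores.max", "2")
  // real path to the fat jar produced by `sbt assembly`, shipped to every executor
  .setJars(Seq("/path/to/target/scala-2.10/fuzzy-assembly-0.1.0.jar"))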