datastax

How to keep two Cassandra tables within the same partition

生来就可爱ヽ(ⅴ<●) submitted on 2019-12-03 13:47:18
I tried reading the DataStax blogs and documentation but could not find anything specific on this. Is there a way to make two tables in Cassandra belong to the same partition? For example:

    CREATE TYPE addr (
        street_address1 text,
        city text,
        state text,
        country text,
        zip_code text
    );

    CREATE TABLE foo (
        account_id timeuuid,
        data text,
        site_id int,
        PRIMARY KEY (account_id)
    );

    CREATE TABLE bar (
        account_id timeuuid,
        address_id int,
        address frozen<addr>,
        PRIMARY KEY (account_id, address_id)
    );

Here I need to ensure that both of these tables/column families live on the same partition for the same account_id.
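For context on what is achievable here: partitions in Cassandra are per-table, so two tables can never share a literal partition. Because both tables use the same partition key (account_id), their rows do hash to the same token and are stored on the same replica nodes. If one physical partition is a hard requirement, the usual approach is to merge the tables; a minimal sketch, assuming a merged schema (the row_type discriminator and the sentinel address_id are illustrative, not from the post):

    CREATE TABLE account_data (
        account_id timeuuid,
        row_type text,        -- 'foo' or 'bar' (hypothetical discriminator)
        address_id int,       -- use a sentinel such as 0 for 'foo' rows,
                              -- since clustering columns cannot be null
        data text,
        site_id int,
        address frozen<addr>,
        PRIMARY KEY (account_id, row_type, address_id)
    );

All rows for one account_id then live in a single partition, and the 'foo' and 'bar' halves can still be read separately by filtering on row_type.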

What does rows_merged mean in compactionhistory?

自古美人都是妖i submitted on 2019-12-03 07:03:54
When I issue

    $ nodetool compactionhistory

I get

    . . . compacted_at    bytes_in  bytes_out  rows_merged . . .
          1404936947592   8096      7211       {1:3, 3:1}

What does {1:3, 3:1} mean? The only documentation I can find is this, which states it is the number of partitions merged; that does not explain why there are multiple values or what the colon means.

Answer: it basically means {sstables:rows}. For example, {1:3, 3:1} means 3 rows were each taken from a single sstable (1:3) and 1 row was merged from 3 sstables (3:1), all to make the one sstable produced by that compaction operation. I tried it out myself, so here's an example; I hope this helps:
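A sketch of that kind of demo, assuming a throwaway single-node setup (keyspace and table names are illustrative):

    -- in cqlsh: write one partition across two separate flushes
    CREATE KEYSPACE demo WITH replication =
        {'class': 'SimpleStrategy', 'replication_factor': 1};
    CREATE TABLE demo.t (id int, c int, v text, PRIMARY KEY (id, c));

    INSERT INTO demo.t (id, c, v) VALUES (1, 1, 'a');
    -- shell: nodetool flush demo t    (first sstable)
    INSERT INTO demo.t (id, c, v) VALUES (1, 2, 'b');
    -- shell: nodetool flush demo t    (second sstable)

    -- shell: nodetool compact demo t
    -- shell: nodetool compactionhistory

After the manual compaction, the new history entry should show rows_merged as {2:1}: one partition (id = 1) merged from two sstables.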

Cassandra CQL shell window disappears after installation on Windows

北城余情 submitted on 2019-12-03 05:49:06
The Cassandra CQL shell window disappears right after installation on Windows. It was installed using the MSI installer available from Planet Cassandra. Why does this happen? Please help me. Thanks in advance.

Answer: I had the same issue with DataStax 3.9. This is how I sorted it:
Step 1: Open the file DataStax-DDC\apache-cassandra\conf\cassandra.yaml
Step 2: Uncomment cdc_raw_directory and set a new value (for Windows): cdc_raw_directory: "C:/Program Files/DataStax-DDC/data/cdc_raw"
Step 3: Go to Windows Services and start the "DataStax DDC Server 3.9.0" service.

Another answer: I had the same problem with DataStax Community 3
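For reference, the cassandra.yaml change from Step 2 looks like this (the path is the default DataStax-DDC location named in the answer; adjust it to your install):

    # before: the setting ships commented out
    #cdc_raw_directory:

    # after: uncommented and pointed at a writable Windows path
    cdc_raw_directory: "C:/Program Files/DataStax-DDC/data/cdc_raw"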

Max. size of wide rows?

无人久伴 submitted on 2019-12-02 23:11:17
Theoretically, Cassandra allows up to 2 billion columns in a wide row. I have heard that in reality up to 50,000 cells / 50 MB per row is fine; 50,000-100,000 cells / 100 MB is OK but requires some tuning; and that one should never go above 100,000 cells / 100 MB per row, the reason being that this puts pressure on the heap. Is there some truth to this?

Answer: In Cassandra, the maximum number of cells (rows x columns) in a single partition is 2 billion. Additionally, a single column value may not be larger than 2 GB, but in practice "single digits of MB" is a more reasonable limit, since there is no streaming or random access of blob values.
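A common way to stay well under these limits is to split what would be one huge partition into buckets; a minimal sketch, assuming a time-series workload (the table and daily bucket scheme are illustrative):

    CREATE TABLE sensor_readings (
        sensor_id text,
        day date,                 -- one partition per sensor per day
        reading_time timestamp,
        value double,
        PRIMARY KEY ((sensor_id, day), reading_time)
    );

    -- readers address one bounded bucket at a time:
    SELECT reading_time, value FROM sensor_readings
    WHERE sensor_id = 's-1' AND day = '2019-12-02';

Each (sensor_id, day) partition stays small no matter how long the sensor keeps reporting, at the cost of the client needing to know which buckets to query.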

Error in .jfindClass(as.character(driverClass)[1]) : class not found

为君一笑 submitted on 2019-12-02 21:22:21
Question:

    > cassdrv <- JDBC("org.apache.cassandra.cql.jdbc.CassandraDriver",
    +     list.files("C://Users//VRavimurugan.GSIN//AppData//Roaming//RazorSQL//cassandra",
    +                pattern="jar$", full.names=T))
    Error in .jfindClass(as.character(driverClass)[1]) : class not found

Tried this, but no luck: RJDBC Cassandra -> Error in .jfindClass(as.character(driverClass)[1]) : class not found

Answer 1: Just to note, the answer you linked to says to change the driver name to "com.datastax.driver.jdbc.CassandraDriver" if you are
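A sketch of what that fix looks like in RJDBC, assuming the DataStax JDBC driver jar and its dependencies really are in that directory and expose the class name the linked answer suggests (the connection URL is a guess and depends on the driver build):

    library(RJDBC)

    # put every jar from the driver directory on the classpath
    jars <- list.files("C://Users//VRavimurugan.GSIN//AppData//Roaming//RazorSQL//cassandra",
                       pattern = "jar$", full.names = TRUE)

    cassdrv <- JDBC("com.datastax.driver.jdbc.CassandraDriver", classPath = jars)
    conn <- dbConnect(cassdrv, "jdbc:cassandra://localhost:9042/mykeyspace")

If the class is still not found, the jars on the classpath simply do not contain that driver class, which is exactly what .jfindClass is complaining about.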

How can I improve the reduceByKey part of my Spark app?

随声附和 submitted on 2019-12-02 14:11:24
I have 64 Spark cores. I have over 80 million rows of data, amounting to 4.2 GB, in my Cassandra cluster. Processing this data currently takes 82 seconds, and I want it reduced to 8 seconds. Any thoughts on this? Is this even possible? Thanks. This is the part of my Spark app I want to improve:

    axes = sqlContext.read.format("org.apache.spark.sql.cassandra")\
        .options(table="axes", keyspace=source, numPartitions="192").load()\
        .repartition(64*3)\
        .map(lambda x: (x.article, [Row(article=x.article, at=x.at, comments=x.comments,
                                        likes=x.likes, reads=x.reads, shares=x.shares)]))\
        .reduceByKey(lambda x, y: x + y, 52)
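One way to attack this (an assumption about the workload, not from the truncated post): the per-row map plus reduceByKey above serializes the whole dataset in and out of Python, so keeping the grouping inside the DataFrame engine usually helps. A sketch, assuming Spark 2.x APIs (collect_list over a struct needs Spark 2.0+) and the column names from the snippet:

    from pyspark.sql import functions as F

    axes = (spark.read.format("org.apache.spark.sql.cassandra")
            .options(table="axes", keyspace=source)
            .load())

    # group the per-article rows in the JVM; no Python lambdas involved
    grouped = axes.groupBy("article").agg(
        F.collect_list(
            F.struct("at", "comments", "likes", "reads", "shares")
        ).alias("events"))

Whether 8 seconds is reachable still depends on the Cassandra read path, which may dominate, but removing the Python serialization round-trip is typically the first big win for this shape of job.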

CodecNotFoundException: Codec not found for requested operation: [date <-> java.util.Date]

北战南征 submitted on 2019-12-02 12:53:23
I am using the DataStax versions below with Java 8:

    <dependency>
        <groupId>com.datastax.cassandra</groupId>
        <artifactId>cassandra-driver-core</artifactId>
        <version>3.7.2</version>
    </dependency>
    <dependency>
        <groupId>com.datastax.cassandra</groupId>
        <artifactId>cassandra-driver-mapping</artifactId>
        <version>3.7.2</version>
    </dependency>

My table has a date column, as below:

    cass_table (
        data_source_id int,
        company_id text,
        create_date date)

When I try to save data into the C* table as below:

    final IndustryCompany four = new IndustryCompany(1, 1236, ProdUtils.today());
    industryCompanyRepository.save(one
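The root cause is that in driver 3.x the CQL date type maps to com.datastax.driver.core.LocalDate, and no built-in codec converts it to or from java.util.Date. A minimal sketch of the usual fix, converting at the entity boundary (IndustryCompany and ProdUtils.today() are from the question; the helper below is illustrative):

    import java.util.Date;
    import com.datastax.driver.core.LocalDate;

    // Convert a java.util.Date to the driver's LocalDate so the
    // built-in codec for the CQL `date` type can serialize it.
    public static LocalDate toCqlDate(Date date) {
        return LocalDate.fromMillisSinceEpoch(date.getTime());
    }

and declare the mapped field as LocalDate rather than Date:

    // @Column(name = "create_date")
    // private LocalDate createDate;

An alternative is to register a custom codec for date <-> java.util.Date on the Cluster, but switching the field type is the smaller change.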

Cassandra batch query performance on tables having different partition keys

拟墨画扇 submitted on 2019-12-02 11:58:13
I have a test case in which I receive 150k requests per second from a client. The test case requires inserting an UNLOGGED batch into multiple tables with different partition keys:

    BEGIN UNLOGGED BATCH
    update kspace.count_table set counter=counter+1
        where source_id= 1 and name='source_name' and pname='Country'
        and ptype='text' and date='2017-03-20' and pvalue=textAsBlob('US')
    update kspace.count_table set counter=counter+1
        where source_id= 1 and name='source_name' and pname='City'
        and ptype='text' and date='2017-03-20' and pvalue=textAsBlob('Dallas')
    update kspace.count_table set counter=counter
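A standard caution for this pattern (general driver guidance, not from the truncated post): an unlogged batch whose statements hit different partitions forces one coordinator to fan all the writes out, so issuing them as separate asynchronous statements usually scales better. A sketch with the Java driver 3.x, assuming an already-connected Session and guessing the column types from the literals above:

    import com.datastax.driver.core.LocalDate;
    import com.datastax.driver.core.PreparedStatement;
    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;
    import java.util.LinkedHashMap;
    import java.util.Map;

    PreparedStatement ps = session.prepare(
        "UPDATE kspace.count_table SET counter = counter + 1 " +
        "WHERE source_id = ? AND name = ? AND pname = ? AND ptype = ? " +
        "AND date = ? AND pvalue = ?");

    Map<String, String> updates = new LinkedHashMap<>();
    updates.put("Country", "US");
    updates.put("City", "Dallas");

    for (Map.Entry<String, String> e : updates.entrySet()) {
        // Each async write is routed to its own partition's replicas
        // instead of funnelling everything through one batch coordinator.
        session.executeAsync(ps.bind(
            1, "source_name", e.getKey(), "text",
            LocalDate.fromYearMonthDay(2017, 3, 20),
            ByteBuffer.wrap(e.getValue().getBytes(StandardCharsets.UTF_8))));
    }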