Only date range scanning Cassandra CQL timestamp

Submitted by 我与影子孤独终老i on 2019-12-08 07:38:52

Question


I have a table like given below.

CREATE TABLE TEST (
 HOURLYTIME TIMESTAMP,
 FULLTIME TIMESTAMP,
 DATA TEXT,
 PRIMARY KEY (HOURLYTIME, FULLTIME)
);

I inserted the record (2014-12-12 00:00:00,2014-12-12 00:00:01,'Hello World')

I would like to search by a date-time range on the HOURLYTIME field, which holds hourly records. When I tried with token(), like

select * from TEST where token(HOURLYTIME)=token('2014-12-12')

to get all the records for that date, it returns the record for only one hour, i.e. for

 2014-12-12 00:00:00

If I add a date range

select * from TEST where token(HOURLYTIME)>=token('2014-12-12') AND token(HOURLYTIME)<=token('2014-12-14');

It gives the error: More than one restriction was found for the start bound.

How can I resolve this issue?

I am able to scan using FULLTIME, but then I need to provide ALLOW FILTERING, which scans all records and is inefficient.


Answer 1:


You are not allowed to restrict the primary key by a range without explicitly requesting it with ALLOW FILTERING. This prevents queries that require a full table scan, which, as you note, are slow and will not scale to true big-data sizes. The reason is that the partition key (the first component of the primary key) is hashed, so specifying a range of its values is basically the same as providing two loosely coupled random numbers. In your case, for example, dates are most likely not hashed monotonically. This means that asking for dates that hash to a value less than the hash of another date will return an essentially random set of data.
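To illustrate why hashed keys do not preserve order, here is a small Python sketch. It is only an illustration under stated assumptions: it uses MD5 from the standard library as a stand-in hash, and `illustrative_token` is a hypothetical helper. Cassandra's actual token computation depends on the configured partitioner and is not reproduced here.

```python
import hashlib
from datetime import datetime, timedelta, timezone


def illustrative_token(ts):
    """Hash a timestamp's 8-byte millisecond encoding (stand-in hash, not Cassandra's)."""
    millis = int(ts.timestamp() * 1000)
    digest = hashlib.md5(millis.to_bytes(8, "big", signed=True)).digest()
    # Keep the first 8 bytes as a signed 64-bit number, purely for readability.
    return int.from_bytes(digest[:8], "big", signed=True)


start = datetime(2014, 12, 12, tzinfo=timezone.utc)
hours = [start + timedelta(hours=h) for h in range(24)]  # ascending hourly keys
tokens = [illustrative_token(h) for h in hours]

# The keys are sorted by time, but their hashes are scattered.
for hour, token in list(zip(hours, tokens))[:5]:
    print(hour.isoformat(), token)
print("tokens in ascending order:", tokens == sorted(tokens))
```

Because consecutive hours land at unrelated positions, a range over hashed values does not correspond to a range of dates.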

The issue here is that your table setup does not allow the queries that you actually want to perform. You need to model your tables so that the information you want can be obtained from a single partition.




Answer 2:


To make range queries on a column, that column needs to be a clustering column.

In that case the query will be efficient, because clustering columns are stored sorted. To search the data, you also need to specify the partition key.

So, as an example, where I use device_id as the partition key:

CREATE TABLE IF NOT EXISTS mykeyspace.device_data (
 DEVICE_ID text,
 HOURLYTIME TIMESTAMP,
 FULLTIME TIMESTAMP,
 DATA TEXT,
 PRIMARY KEY (DEVICE_ID, HOURLYTIME, FULLTIME)
);

INSERT INTO mykeyspace.device_data (device_id, hourlytime, fulltime, data)
values('Spam machine', '2014-12-12 00:01:00','2014-12-12 00:00:01','Hello World1');

INSERT INTO mykeyspace.device_data (device_id, hourlytime, fulltime, data)
values('Spam machine', '2014-12-12 00:02:00','2014-12-12 00:00:02','Hello World2');

INSERT INTO mykeyspace.device_data (device_id, hourlytime, fulltime, data)
values('Spam machine', '2014-12-12 00:03:00','2014-12-12 00:00:03','Hello World3');

-- Efficient range query
SELECT * FROM mykeyspace.device_data
WHERE device_id = 'Spam machine'
    AND hourlytime > '2014-12-12 00:00:00'
    AND hourlytime < '2014-12-12 00:02:00';
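
Why a range over a clustering column is cheap once the partition is known can be pictured with a toy in-memory model. This Python sketch is an assumption-laden illustration, not Cassandra's storage engine: the `Partition` class and `table` dictionary are hypothetical names, and rows are reduced to their `data` value.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from datetime import datetime


class Partition:
    """Toy partition: rows kept sorted by clustering key (hourlytime)."""

    def __init__(self):
        self.keys = []  # clustering keys, always sorted
        self.rows = []  # payloads, in the same order as keys

    def insert(self, key, row):
        i = bisect_right(self.keys, key)
        self.keys.insert(i, key)
        self.rows.insert(i, row)

    def range(self, lo, hi):
        """Rows with lo < key < hi, found by binary search instead of a scan."""
        return self.rows[bisect_right(self.keys, lo):bisect_left(self.keys, hi)]


table = defaultdict(Partition)  # partition key (device_id) -> Partition

# Same rows as the INSERT statements above, added out of order on purpose.
table["Spam machine"].insert(datetime(2014, 12, 12, 0, 3), "Hello World3")
table["Spam machine"].insert(datetime(2014, 12, 12, 0, 1), "Hello World1")
table["Spam machine"].insert(datetime(2014, 12, 12, 0, 2), "Hello World2")

result = table["Spam machine"].range(
    datetime(2014, 12, 12, 0, 0), datetime(2014, 12, 12, 0, 2)
)
print(result)  # ['Hello World1']
```

The partition key selects one sorted run of rows, and the two bounds on the clustering key select a contiguous slice of it.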

Or another example, where I partition the data by day (which spreads the data nicely across the cluster) and perform range queries:

CREATE TABLE IF NOT EXISTS mykeyspace.day_data (
     DAYTIME timestamp,
     HOURLYTIME TIMESTAMP,
     FULLTIME TIMESTAMP,
     DATA TEXT,
     PRIMARY KEY (DAYTIME, HOURLYTIME, FULLTIME)
);

INSERT INTO mykeyspace.day_data (DAYTIME, hourlytime, fulltime, data)
values('2014-12-12', '2014-12-12 00:01:00','2014-12-12 00:00:01','Hello World1');

INSERT INTO mykeyspace.day_data (DAYTIME, hourlytime, fulltime, data)
values('2014-12-12', '2014-12-12 00:02:00','2014-12-12 00:00:02','Hello World2');

INSERT INTO mykeyspace.day_data (DAYTIME, hourlytime, fulltime, data)
values('2014-12-12', '2014-12-12 00:03:00','2014-12-12 00:00:03','Hello World3');

SELECT * FROM mykeyspace.day_data
WHERE daytime = '2014-12-12'
    AND hourlytime > '2014-12-12 00:00:00'
    AND hourlytime < '2014-12-12 00:02:00';
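
With day partitions, a range spanning several days (such as 2014-12-12 through 2014-12-14 in the question) touches several partitions, so one option is to issue one query per day. The following Python sketch only builds the per-day query parameters and does not connect to a cluster. The helper names `day_buckets` and `per_day_queries` are hypothetical, the `%s` placeholders assume a client driver that accepts that parameter style, and naive datetimes are assumed to be in the same time zone as the stored values.

```python
from datetime import date, datetime, time, timedelta
from typing import List, Tuple

QUERY = (
    "SELECT * FROM mykeyspace.day_data "
    "WHERE daytime = %s AND hourlytime >= %s AND hourlytime < %s"
)


def day_buckets(start: datetime, end: datetime) -> List[date]:
    """Every calendar day touched by the half-open range [start, end)."""
    if end <= start:
        return []
    last_day = (end - timedelta(microseconds=1)).date()
    days = []
    current = start.date()
    while current <= last_day:
        days.append(current)
        current += timedelta(days=1)
    return days


def per_day_queries(start: datetime, end: datetime) -> List[Tuple[str, tuple]]:
    """One (query, params) pair per day partition, with bounds clipped to that day."""
    queries = []
    for day in day_buckets(start, end):
        day_start = datetime.combine(day, time.min)
        day_end = day_start + timedelta(days=1)
        params = (day_start, max(start, day_start), min(end, day_end))
        queries.append((QUERY, params))
    return queries


for query, params in per_day_queries(datetime(2014, 12, 12), datetime(2014, 12, 15)):
    print(params)
```

Each generated query names exactly one partition, so it stays within the pattern shown in the SELECT above.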

There is a very useful article about time series data on PlanetCassandra.




Answer 3:


The date range query is working fine for me. I am using the following versions:

[cqlsh 4.1.0 | Cassandra 2.0.4 | CQL spec 3.1.1 | Thrift protocol 19.39.0]

There could be a problem with older versions. Please check.



Source: https://stackoverflow.com/questions/27942152/only-date-range-scanning-cassandra-cql-timestamp
