How to filter a Cassandra query by a field in a user-defined type

Submitted by 試著忘記壹切 on 2019-12-17 20:50:59

Question


How can I filter a Cassandra query by a field of a user-defined type? I want to create a people table in my Cassandra database, so I created this user-defined type:

    create type fullname ( firstname text, lastname text );

I also have this table:

    create table people ( id UUID primary key, name frozen <fullname> );

I need to find all people whose lastname is jolie. How can I query this table for that, and more generally, how do filtering and querying work in Cassandra? I know I could drop the fullname type and add firstname and lastname columns to the main table, but this is only a simplified sample of what I want to do; I must keep the fullname type.


Answer 1:


Short answer: you can use a secondary index to query by the whole fullname UDT value, but you cannot query by only a part of the UDT.

    // create type, table and index
    create type fullname ( firstname text, lastname text );
    create table people ( id UUID primary key, name frozen <fullname> );
    create index fname_index on your_keyspace.people (name);

    // insert some data into it
    insert into people (id, name) values (now(), {firstname: 'foo', lastname: 'bar'});
    insert into people (id, name) values (now(), {firstname: 'baz', lastname: 'qux'});

    // query it by fullname
    select * from people where name = { firstname: 'baz', lastname: 'qux' };

    // the following will NOT work:
    select * from people where name = { firstname: 'baz'};

The reason for this behaviour is the way C* secondary indexes are implemented. In general, an index is just another hidden table maintained by C*, in your case defined roughly as:

    create table fname_index (name frozen <fullname> primary key, id uuid);

In effect, the indexed column and the primary key are swapped in this table. So your case reduces to the more general question 'why can't I query by only a part of the PK?':

  • the whole PK value (firstname+lastname) is hashed, and the resulting number determines the partition that stores your row.
  • within that partition, your row is appended to a memtable (and later flushed to disk as an SSTable, a file sorted by key).
  • when you query by only a part of the PK (for example, by firstname only), C* cannot determine which partition to look in: it cannot compute the hash of the whole fullname because lastname is unknown. Your match could be in any partition, so finding it would require a full table scan. C* explicitly forbids these scans, so you have no choice :)
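The lookup problem described in the list above can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not how Cassandra is actually implemented: the helper names (`partition_for`, `find_by_fullname`, `find_by_firstname`), the MD5-based hash, and the fixed count of 8 partitions are all hypothetical and chosen only for illustration.

```python
import hashlib

NUM_PARTITIONS = 8  # hypothetical; real clusters use token ranges, not a fixed count


def partition_for(firstname, lastname):
    """Hash the WHOLE key (both fields) to pick a partition."""
    key = f"{firstname}\x00{lastname}".encode("utf-8")
    digest = hashlib.md5(key).digest()  # stand-in hash, chosen for determinism
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


# Toy storage: partition number -> list of (firstname, lastname, row_id)
partitions = {p: [] for p in range(NUM_PARTITIONS)}


def insert(firstname, lastname, row_id):
    partitions[partition_for(firstname, lastname)].append(
        (firstname, lastname, row_id)
    )


def find_by_fullname(firstname, lastname):
    """Both fields known: the hash points at exactly one partition."""
    p = partition_for(firstname, lastname)
    return [r for r in partitions[p] if r[0] == firstname and r[1] == lastname]


def find_by_firstname(firstname):
    """Only one field known: the hash cannot be computed,
    so every partition has to be scanned."""
    return [
        r
        for rows in partitions.values()
        for r in rows
        if r[0] == firstname
    ]


insert("foo", "bar", 1)
insert("baz", "qux", 2)
insert("baz", "quux", 3)
```

In this model `find_by_fullname("baz", "qux")` touches a single partition, while `find_by_firstname("baz")` has to visit all of them. That full scan is the operation the answer says C* refuses to perform.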

Suggested solutions:

  • split your UDT into its essential parts, firstname and lastname, as separate columns and create secondary indexes on them.
  • use Cassandra 3.0 with the materialized views feature (in effect, making Cassandra maintain a custom index for part of your UDT).
  • revisit your data model to be less strict (if nothing forces you to use UDTs where they are not helpful).


Source: https://stackoverflow.com/questions/33840105/how-to-filter-cassandra-query-by-a-field-in-user-defined-type
