Can an index be created on a UUID Column?

…衆ロ難τιáo~ submitted on 2019-12-11 03:25:46

Question


Is it possible to create an index on a UUID/TIMEUUID column in Cassandra? I'm testing out a model design which would have an index on a UUID column, but queries on that column always return 0 rows found.

I have a table like this:

create table some_data (site_id int, user_id int, run_id uuid, value int, primary key((site_id, user_id), run_id));

I create an index with this command:

create index idx on some_data (run_id) ;

No errors are thrown by CQL when I create this index.

I have a small bit of test data in the table:

 site_id | user_id | run_id                               | value
---------+---------+--------------------------------------+-----------------
       1 |       1 | 9e118af0-ac92-11e4-81ae-8d1bc921f26d |               3

However, when I run the query:

select * from some_data where run_id = 9e118af0-ac92-11e4-81ae-8d1bc921f26d

CQLSH just returns: (0 rows)

If I use an int for the run_id then the index behaves as expected.


Answer 1:


Yes, you can create a secondary index on a UUID. The real question is "should you?"

In any case, I followed your steps, and got it to work.

Connected to Test Cluster at 192.168.23.129:9042.
[cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
aploetz@cqlsh> use stackoverflow ;
aploetz@cqlsh:stackoverflow> create table some_data (site_id int, user_id int, run_id uuid, value int, primary key((site_id, user_id), run_id));
aploetz@cqlsh:stackoverflow> create index idx on some_data (run_id) ;
aploetz@cqlsh:stackoverflow> INSERT INTO some_data (site_id, user_id, run_id, value) VALUES (1,1,9e118af0-ac92-11e4-81ae-8d1bc921f26d,3);
aploetz@cqlsh:stackoverflow> select * from usr_rec3 where run_id = 9e118af0-ac92-11e4-81ae-8d1bc921f26d;
code=2200 [Invalid query] message="unconfigured columnfamily usr_rec3"
aploetz@cqlsh:stackoverflow> select * from some_data where run_id = 9e118af0-ac92-11e4-81ae-8d1bc921f26d;

 site_id | user_id | run_id                               | value
---------+---------+--------------------------------------+-------
       1 |       1 | 9e118af0-ac92-11e4-81ae-8d1bc921f26d |     3

(1 rows)

Notice, though, that this command failed when I ran it, because there is no table named usr_rec3 in my keyspace:

select * from usr_rec3 where run_id = 9e118af0-ac92-11e4-81ae-8d1bc921f26d

Are you sure that you didn't mean to select from some_data instead?

Also, creating secondary indexes on high-cardinality columns (like a UUID) is generally not a good idea. If you need to query by run_id, then you should revisit your data model and come up with an appropriate query table to serve that.
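
A query table for this case might look something like the following. This is only a sketch: the table name some_data_by_run and the choice of clustering columns are my assumptions, not something taken from the question, and the application would have to write every row to both tables.

-- Hypothetical query table, partitioned by run_id so a lookup reads a single partition.
create table some_data_by_run (run_id uuid, site_id int, user_id int, value int, primary key(run_id, site_id, user_id));

-- Write each row to both tables; a logged batch is one way to keep them in step.
begin batch
  insert into some_data (site_id, user_id, run_id, value) values (1,1,9e118af0-ac92-11e4-81ae-8d1bc921f26d,3);
  insert into some_data_by_run (run_id, site_id, user_id, value) values (9e118af0-ac92-11e4-81ae-8d1bc921f26d,1,1,3);
apply batch;

-- Query by run_id without a secondary index.
select * from some_data_by_run where run_id = 9e118af0-ac92-11e4-81ae-8d1bc921f26d;

The trade-off is duplicated storage and an extra write per row, in exchange for a read that is served from one partition.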

Clarification:

  • Using secondary indexes in general is not considered good practice. In the new book Cassandra High Availability, Robbie Strickland identifies their use as an anti-pattern, due to poor performance.
  • Just because a column is of the UUID data type doesn't necessarily make it high-cardinality. That's more of a data model question for you. But UUIDs exist to be unique, so a UUID column will usually hold a different value in nearly every row, and that sets off red flags.
  • Put these two points together, and there isn't anything about creating an index on a UUID that sounds appealing to me. If it were my cluster, and (more importantly) I had to support it later, I wouldn't do it.


Source: https://stackoverflow.com/questions/28327945/can-an-index-be-created-on-a-uuid-column
