Can an index be created on a UUID Column?

Question

Is it possible to create an index on a UUID/TIMEUUID column in Cassandra? I'm testing out a model design which would have an index on a UUID column, but queries on that column always return 0 rows found.

I have a table like this:

create table some_data (site_id int, user_id int, run_id uuid, value int, primary key((site_id, user_id), run_id));

I create an index with this command:

create index idx on some_data (run_id) ;

No errors are thrown by CQL when I create this index.

I have a small bit of test data in the table:

 site_id | user_id | run_id                               | value
---------+---------+--------------------------------------+-----------------
       1 |       1 | 9e118af0-ac92-11e4-81ae-8d1bc921f26d |               3

However, when I run the query:

select * from some_data where run_id = 9e118af0-ac92-11e4-81ae-8d1bc921f26d

CQLSH just returns: (0 rows)

If I use an int for the run_id then the index behaves as expected.

Answer 1:

Yes, you can create a secondary index on a UUID. The real question is "should you?"

In any case, I followed your steps, and got it to work.

Connected to Test Cluster at 192.168.23.129:9042.
[cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
aploetz@cqlsh> use stackoverflow ;
aploetz@cqlsh:stackoverflow> create table some_data (site_id int, user_id int, run_id uuid, value int, primary key((site_id, user_id), run_id));
aploetz@cqlsh:stackoverflow> create index idx on some_data (run_id) ;
aploetz@cqlsh:stackoverflow> INSERT INTO some_data (site_id, user_id, run_id, value) VALUES (1,1,9e118af0-ac92-11e4-81ae-8d1bc921f26d,3);
aploetz@cqlsh:stackoverflow> select * from usr_rec3 where run_id = 9e118af0-ac92-11e4-81ae-8d1bc921f26d;
code=2200 [Invalid query] message="unconfigured columnfamily usr_rec3"
aploetz@cqlsh:stackoverflow> select * from some_data where run_id = 9e118af0-ac92-11e4-81ae-8d1bc921f26d;

 site_id | user_id | run_id                               | value
---------+---------+--------------------------------------+-------
       1 |       1 | 9e118af0-ac92-11e4-81ae-8d1bc921f26d |     3

(1 rows)

Notice though, that when I ran this command, it failed:

select * from usr_rec3 where run_id = 9e118af0-ac92-11e4-81ae-8d1bc921f26d

Are you sure that you didn't mean to select from some_data instead?

Also, creating secondary indexes on high-cardinality columns (like a UUID) is generally not a good idea. If you need to query by run_id, then you should revisit your data model and come up with an appropriate query table to serve that.
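
As a rough illustration of what such a query table could look like, here is a minimal sketch. The table name some_data_by_run is hypothetical, and the sketch assumes that rows are looked up by run_id alone and that the application writes every row to both tables.

```cql
-- Hypothetical query table: run_id is the partition key, so a lookup by
-- run_id reads a single partition instead of consulting a secondary index.
create table some_data_by_run (run_id uuid, site_id int, user_id int, value int, primary key (run_id, site_id, user_id));

-- Assumption: the application keeps both tables in sync by writing each row twice.
INSERT INTO some_data (site_id, user_id, run_id, value) VALUES (1,1,9e118af0-ac92-11e4-81ae-8d1bc921f26d,3);
INSERT INTO some_data_by_run (run_id, site_id, user_id, value) VALUES (9e118af0-ac92-11e4-81ae-8d1bc921f26d,1,1,3);

-- Query by run_id with no index involved.
select * from some_data_by_run where run_id = 9e118af0-ac92-11e4-81ae-8d1bc921f26d;
```

The trade-off is duplicated data and an extra write per row, in exchange for a read that is served from one partition.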

Clarification:

1. Using secondary indexes in general is not considered good practice. In the new book Cassandra High Availability, Robbie Strickland identifies their use as an anti-pattern, due to poor performance.
2. A column being of the UUID data type doesn't necessarily make it high-cardinality; that's more of a data model question for you. But UUIDs exist to be unique, and that sets off red flags.
Put these two points together, and there isn't anything about creating an index on a UUID that sounds appealing to me. If it were my cluster, and (more importantly) I had to support it later, I wouldn't do it.

Source: https://stackoverflow.com/questions/28327945/can-an-index-be-created-on-a-uuid-column

Tags

cassandra

cql

cassandra-2.0