cassandra - how to perform table query?

Submitted by ぐ巨炮叔叔 on 2019-12-11 10:19:02

Question


I am trying to perform a query using 2 tables:

CREATE TABLE users(
  id_ UUID PRIMARY KEY,
  username text,
  email text
);

CREATE TABLE users_by_email(
  id UUID,
  email text PRIMARY KEY
);

In this case, how do I perform a query by email?


Answer 1:


I am assuming that you also want the username returned by the query. You cannot JOIN tables in Cassandra, so to get it you will have to add that column to your users_by_email table:

CREATE TABLE users_by_email(
  id UUID,
  email text PRIMARY KEY,
  username text
);

Then, simply query that table by email address.

> SELECT id, email, username FROM users_by_email WHERE email='mreynolds@serenity.com';

 id                                   | email                  | username
--------------------------------------+------------------------+----------
 d8e57eb4-c837-4bd7-9fd7-855497861faf | mreynolds@serenity.com |      Mal

(1 rows)



Answer 2:


I am assuming in the case above you are specifically trying to retrieve the username by the email.

Short Answer:

With the table structure you have defined, Cassandra cannot return the username for an email in a single query. You would need to query users_by_email to get the id and then query users to get the username. A better option is to add the username column to the users_by_email table.
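As a rough sketch of that two-step lookup against the tables from the question (the email and UUID values are sample data borrowed from the first answer, not values from your cluster):

```cql
-- Step 1: email is the partition key of users_by_email, so this lookup is allowed.
SELECT id FROM users_by_email WHERE email = 'mreynolds@serenity.com';

-- Step 2: feed the id returned above into users, whose partition key is id_.
-- UUID literals are written without quotes in CQL.
SELECT username FROM users WHERE id_ = d8e57eb4-c837-4bd7-9fd7-855497861faf;
```

The application has to carry the id from the first result into the second query, which costs two round trips per lookup.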

Long Answer:

Because of the way Cassandra stores data on disk, the columns you can restrict in a WHERE clause must, by default, be part of the primary key. The primary key is made up of two kinds of columns. First is the partition key, which determines which node in the cluster holds a row and how rows are physically grouped on disk. Second are the clustering columns, which order the rows within a partition and aid efficient retrieval. One other critical point: a WHERE clause normally has to restrict every partition key column on each call, so that Cassandra can go directly to the partition holding the data. For more detail on how the WHERE clause works, take a look at this link:

http://www.datastax.com/dev/blog/a-deep-look-to-the-cql-where-clause
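To make the restriction concrete, here is a small sketch against the users table from the question. The literal values are sample data taken from the first answer, and the exact error text depends on the Cassandra version, so none is shown:

```cql
-- Accepted: id_ is the partition key of users.
SELECT username, email FROM users
WHERE id_ = d8e57eb4-c837-4bd7-9fd7-855497861faf;

-- Rejected by default: email is not part of the primary key of users.
-- It would only run with a secondary index on email or with ALLOW FILTERING,
-- neither of which the answer above recommends.
SELECT username FROM users
WHERE email = 'mreynolds@serenity.com';
```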

Now that you know the limitations of the WHERE clause, the question is how to get around them. The first thing to know is that Cassandra is not an RDBMS and you cannot perform JOINs against tables. This means forgetting the rules we have learned over the years about how to properly normalize data in a database and thinking about the problem differently. In general, Cassandra is designed for a table-per-query pattern: for each data access pattern (that is, each query) you are going to run, there is an associated table that contains the data for that query and has the right keys to filter it. I cannot go into all the nitty-gritty details of data modeling here, but I suggest you take the free DataStax Academy data modeling course available here:

https://academy.datastax.com/courses/ds220-data-modeling

As I understand your particular need, I think you can modify your users_by_email table to look like this:

CREATE TABLE users_by_email(
  email text,
  username text,
  id_ UUID,
  PRIMARY KEY (email, username)
 );

This table setup will allow you to select the username by email using a query like:

SELECT username FROM users_by_email WHERE email=XXXXX;
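Because the same user now lives in two tables, the application has to write to both. One possible way to do that, sketched here with sample values from the first answer and assuming the users table from the question still exists alongside users_by_email, is a logged batch:

```cql
-- A logged batch: if any statement is applied, Cassandra will eventually
-- apply all of them. It does not provide isolation across the two tables.
BEGIN BATCH
  INSERT INTO users (id_, username, email)
  VALUES (d8e57eb4-c837-4bd7-9fd7-855497861faf, 'Mal', 'mreynolds@serenity.com');

  INSERT INTO users_by_email (email, username, id_)
  VALUES ('mreynolds@serenity.com', 'Mal', d8e57eb4-c837-4bd7-9fd7-855497861faf);
APPLY BATCH;
```

A batch that spans partitions like this one carries extra coordination cost, so it is a trade-off for consistency between the two tables rather than a performance optimization.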


Source: https://stackoverflow.com/questions/35551677/cassandra-how-to-perform-table-query
