问题
What's the right way to model many-to-many relationships in Cassandra (using 3.10 at the moment)?
From what answers I was able to find, denormalization into two relationship tables is suggested (as in here, for example: Modeling many-to-many relations in Cassandra 2 with CQL3). But there are problems with that on deletes, and those answers are so sparse they do not mention any details on that.
Suppose we have the following tables:
CREATE TABLE foo (
key UUID PRIMARY KEY,
content TEXT
)
CREATE TABLE bar (
key UUID PRIMARY KEY,
content TEXT
)
CREATE TABLE foo_bar (
foo UUID,
bar UUID,
PRIMARY KEY (foo, bar)
)
CREATE TABLE bar_foo (
bar UUID,
foo UUID,
PRIMARY KEY (bar, foo)
)
This seems to be the suggested answer. However, what happens when we try deleting a bar
record? Updating the bar_foo
table is easy:
DELETE FROM bar_foo WHERE bar = <bar_key>
However, an attempt to update the foo_bar
table fails:
DELETE FROM foo_bar WHERE bar = <bar_key>
with the following error:
InvalidRequest: Error from server: code=2200 [Invalid query] message="Some partition key parts are missing: foo"
This is because the primary key for the foo_bar
table is (foo, bar)
, and we specify only the second part of the primary key in the WHERE clause of the DELETE statement. Cassandra, apparently, requires a prefix of the primary key, and bar
without foo
is not a prefix.
Now, changing the primary key to (bar, foo)
won't help. After all, what would you do, then, if a foo
record gets deleted? And, in any case, the entire purpose of the foo_bar
table is to be able to find all bar
records corresponding to a given foo
record, and a SELECT statement also requires a prefix of the primary key in the WHERE clause (which must, by necessity, be foo
).
Can't do SELECT and then DELETE, either, since a SELECT by bar
won't work, it not being a prefix of the primary key.
So what to do with many-to-many relationships, when deletes are in the picture? Is this even possible to accomplish properly?
回答1:
CREATE TABLE foo (
key UUID PRIMARY KEY,
content TEXT
)
CREATE TABLE bar (
key UUID PRIMARY KEY,
content TEXT
)
CREATE TABLE foo_bar (
foo UUID,
bar UUID,
PRIMARY KEY (foo, bar)
)
CREATE MATERIALIZED VIEW bar_foo AS
SELECT bar, foo FROM foo_bar
WHERE foo IS NOT NULL AND bar IS NOT NULL
PRIMARY KEY (bar, foo)
回答2:
Use sets. https://docs.datastax.com/en/cql/3.3/cql/cql_using/useSet.html
CREATE TABLE foo (
key UUID PRIMARY KEY,
content TEXT
)
CREATE TABLE bar (
key UUID PRIMARY KEY,
content TEXT
)
CREATE TABLE foo_jn_bar (
foo UUID PRIMARY KEY,
bar set<UUID>
)
CREATE TABLE bar_jn_jn (
bar UUID PRIMARY KEY,
foo set<UUID>
)
If you follow the relational habit, you will have huge duplicity in disk data.
来源:https://stackoverflow.com/questions/44550900/many-to-many-in-cassandra-3