Question
I have the following setup:
I have two column families (CFs), items and keywords.
Each item has zero, one, or more keywords, stored in columns.
Each keyword has one or more items, stored in columns.
It looks like this:
items {
dl { name => DELL6400, keyword:1 => computer, keyword:2 => DELL, keyword:3 => topseller }
hp { name => HP12345, keyword:1 => computer, keyword:2 => HP }
no { name => Nokia8210, keyword:1 => phone, keyword:2 => NOKIA }
}
// here I store keys of the items only,
// in reality I have denormalized most of items columns
keywords{
computer { webpage => www.domain.com/computer , item:dl => dl , item:hp => hp }
DELL { webpage => www.domain.com/dell , item:dl => dl }
topseller { webpage => www.domain.com/top , item:dl => dl }
HP { webpage => www.domain.com/hp , item:hp => hp }
NOKIA { webpage => www.domain.com/nokia , item:no => no }
phone { webpage => www.domain.com/phone , item:no => no }
}
When I add a new item, I add the "webpage" column to keywords if necessary.
When I remove an item, I remove the "item:xx" column as well.
The question is how to avoid "empty" keywords, such as the ones left behind if I remove the Nokia item "no":
keywords{
...
NOKIA { webpage => www.domain.com/nokia }
phone { webpage => www.domain.com/phone }
}
I could count the columns in a slice of item:*, but because of eventual consistency this is probably the wrong approach.
Answer 1:
You can add a CounterColumn (http://wiki.apache.org/cassandra/Counters) to the keywords CF. Increment it when adding an item to the keyword, and decrement it on removal:
keywords{
computer { webpage => www.domain.com/computer , count => 2 , item:dl => dl , item:hp => hp }
....
}
When reading a row with count == 0, just treat it as deleted. You shouldn't actually delete the 'webpage' column when you read a row with count == 0, since there might be a concurrent add operation.
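To make the counter idea concrete, here is a minimal in-memory sketch in Python. It is only a toy model of the keywords CF, not Cassandra client code: the names `keywords`, `add_item`, `remove_item` and `is_live` are made up for illustration, and the "is this item already there?" checks are simple dict lookups that a real cluster would not give you atomically.

```python
from collections import defaultdict

# Toy stand-in for the keywords CF: row key -> regular columns plus a counter.
# Assumption: this is a plain dict model, not a Cassandra client.
keywords = defaultdict(lambda: {"columns": {}, "count": 0})


def add_item(keyword, item_key, webpage):
    """Add item:<key> to the keyword row and bump the counter."""
    row = keywords[keyword]
    # The 'webpage' column is written only if it is not there yet.
    row["columns"].setdefault("webpage", webpage)
    column = "item:" + item_key
    if column not in row["columns"]:
        row["columns"][column] = item_key
        row["count"] += 1  # counter increment


def remove_item(keyword, item_key):
    """Remove item:<key> and decrement the counter; 'webpage' is kept."""
    row = keywords.get(keyword)
    if row is None:
        return
    if row["columns"].pop("item:" + item_key, None) is not None:
        row["count"] -= 1  # counter decrement


def is_live(keyword):
    """A row whose count is 0 is treated as deleted by readers."""
    row = keywords.get(keyword)
    return row is not None and row["count"] > 0


add_item("computer", "dl", "www.domain.com/computer")
add_item("computer", "hp", "www.domain.com/computer")
add_item("NOKIA", "no", "www.domain.com/nokia")
remove_item("NOKIA", "no")
```

After these calls the NOKIA row still holds its 'webpage' column, but `is_live("NOKIA")` is false, so a reader would skip it, while a later `add_item` on the same keyword simply makes it live again.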
Answer 2:
This is interesting, but I thought about another way: denormalizing the "webpage" column, e.g.:
[code]
keywords{
computer { webpage:dl => www.domain.com/computer , item:dl => dl ,
webpage:hp => www.domain.com/computer , item:hp => hp }
DELL { webpage:dl => www.domain.com/dell , item:dl => dl }
topseller { webpage:dl => www.domain.com/top , item:dl => dl }
HP { webpage:hp => www.domain.com/hp , item:hp => hp }
NOKIA { webpage:no => www.domain.com/nokia , item:no => no }
phone { webpage:no => www.domain.com/phone , item:no => no }
}
[/code]
In that case, when I delete item:xx, I delete webpage:xx as well, and the row is removed automatically (it becomes a ghost) once no columns are left in it. However, I am still not sure this is such a bright idea.
Source: https://stackoverflow.com/questions/10838896/apache-cassandra-schema-design
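As a rough illustration of this per-item layout, the Python sketch below models the keywords CF as a plain dict and drops a row as soon as its last column is gone. That mimics the "row with no columns is not returned" behaviour only loosely. The helper names `kw_rows`, `attach_item` and `detach_item` are hypothetical, and nothing here talks to a real cluster.

```python
# Toy stand-in for the keywords CF: row key -> {column name: value}.
# Assumption: a plain dict model, not a Cassandra client.
kw_rows = {}


def attach_item(keyword, item_key, webpage):
    """Write webpage:<key> and item:<key> together for one item."""
    row = kw_rows.setdefault(keyword, {})
    row["webpage:" + item_key] = webpage
    row["item:" + item_key] = item_key


def detach_item(keyword, item_key):
    """Delete both per-item columns; an empty row is dropped from the model."""
    row = kw_rows.get(keyword)
    if row is None:
        return
    row.pop("webpage:" + item_key, None)
    row.pop("item:" + item_key, None)
    if not row:
        # Mimics a row with no live columns no longer showing up on reads.
        del kw_rows[keyword]


attach_item("computer", "dl", "www.domain.com/computer")
attach_item("computer", "hp", "www.domain.com/computer")
attach_item("NOKIA", "no", "www.domain.com/nokia")
detach_item("NOKIA", "no")
detach_item("computer", "dl")
```

The trade-off is visible in the model: no counter is needed, but the webpage value is stored once per item, so changing a keyword's webpage means rewriting every webpage:xx column in that row.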