Apache Cassandra schema design

Submitted by 烈酒焚心 on 2020-01-24 19:41:21

Question


I have the following setup:
I have a CF items and a CF keywords.
Each item has zero, one, or more keywords, stored in columns.
Each keyword has one or more items, stored in columns.
It looks like this:


    items {
        dl { name => DELL6400,  keyword:1 => computer, keyword:2 => DELL, keyword:3 => topseller  }
        hp { name => HP12345,   keyword:1 => computer, keyword:2 => HP    }
        no { name => Nokia8210, keyword:1 => phone,    keyword:2 => NOKIA }
    }

    // here I store keys of the items only,
    // in reality I have denormalized most of items columns
    keywords{
        computer  { webpage => www.domain.com/computer , item:dl => dl , item:hp => hp }
        DELL      { webpage => www.domain.com/dell ,     item:dl => dl }
        topseller { webpage => www.domain.com/top ,      item:dl => dl }
        HP        { webpage => www.domain.com/hp ,       item:hp => hp }
        NOKIA     { webpage => www.domain.com/nokia ,    item:no => no }
        phone     { webpage => www.domain.com/phone ,    item:no => no }
    }

When I add a new item, I add the "webpage" column to keywords if necessary.
When I remove an item, I remove the "item:xx" column as well.

The question is how to avoid "empty" keywords, such as these after I remove the Nokia item "no":


    keywords{
        ...
        NOKIA     { webpage => www.domain.com/nokia }
        phone     { webpage => www.domain.com/phone }
    }

I could count the item:* slice, but because of eventual consistency this is probably the wrong approach.


Answer 1:


You can add a CounterColumn (http://wiki.apache.org/cassandra/Counters) to the keywords CF. Increment it when adding an item to the keyword, and decrement it on removal:

    keywords{
        computer  { webpage => www.domain.com/computer , count => 2 , item:dl => dl , item:hp => hp }
        ....
    }

When reading a row with count == 0, just treat it as deleted. You shouldn't actually delete the 'webpage' column if you read the row with count == 0, since there might be a concurrent add operation.
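
The counter bookkeeping described in this answer can be sketched with a plain in-memory model. This is not Cassandra client code: the `keywords` dict and the functions `add_item`, `remove_item`, and `live_keywords` are hypothetical names, a Python integer stands in for the counter column, and the sketch assumes each item is added to and removed from a keyword at most once.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the keywords CF:
# row key -> {column name -> value}. "count" plays the role of the counter column.
keywords = defaultdict(dict)


def add_item(keyword, item_key, webpage):
    """Add an item column to a keyword row and increment its counter."""
    row = keywords[keyword]
    row.setdefault("webpage", webpage)  # add "webpage" only if it is missing
    row[f"item:{item_key}"] = item_key
    row["count"] = row.get("count", 0) + 1


def remove_item(keyword, item_key):
    """Remove an item column and decrement the counter.

    The "webpage" column is deliberately left in place, because a
    concurrent add may be about to make the keyword live again.
    """
    row = keywords[keyword]
    row.pop(f"item:{item_key}", None)
    row["count"] = row.get("count", 0) - 1


def live_keywords():
    """Return keywords whose counter is above zero; the rest are treated as deleted."""
    return sorted(k for k, row in keywords.items() if row.get("count", 0) > 0)
```

With the items from the question loaded and the item "no" then removed, `NOKIA` and `phone` drop out of `live_keywords()` while their rows and "webpage" columns stay in storage.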




Answer 2:


This is interesting, but I thought about another way: denormalizing the "webpage" column, e.g.:

    keywords{
        computer  { webpage:dl => www.domain.com/computer , item:dl => dl ,
                    webpage:hp => www.domain.com/computer , item:hp => hp }
        DELL      { webpage:dl => www.domain.com/dell ,     item:dl => dl }
        topseller { webpage:dl => www.domain.com/top ,      item:dl => dl }
        HP        { webpage:hp => www.domain.com/hp ,       item:hp => hp }
        NOKIA     { webpage:no => www.domain.com/nokia ,    item:no => no }
        phone     { webpage:no => www.domain.com/phone ,    item:no => no }
    }

In this case, when I delete item:xx, I delete webpage:xx as well, and the row is auto-removed (becomes a ghost) if no columns are left in it. However, I am still not sure whether this is such a bright idea.
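
A minimal in-memory sketch of this per-item denormalization, again not Cassandra client code: the `keywords` dict and the functions `add_item` and `remove_item` are hypothetical names, and deleting an empty dict entry only models the behaviour this answer relies on, namely that a row with no remaining columns no longer shows up as a keyword.

```python
# Hypothetical in-memory stand-in for the keywords CF:
# row key -> {column name -> value}.
keywords = {}


def add_item(keyword, item_key, webpage):
    """Store the item column together with its own copy of the webpage column."""
    row = keywords.setdefault(keyword, {})
    row[f"item:{item_key}"] = item_key
    row[f"webpage:{item_key}"] = webpage


def remove_item(keyword, item_key):
    """Delete both columns belonging to the item; drop the row once it is empty."""
    row = keywords.get(keyword)
    if row is None:
        return
    row.pop(f"item:{item_key}", None)
    row.pop(f"webpage:{item_key}", None)
    if not row:
        # Models a row with no columns left disappearing from reads.
        del keywords[keyword]
```

The trade-off is that the webpage value is stored once per item in every keyword row, so a change to a keyword's webpage has to be written to every `webpage:xx` column in that row.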



Source: https://stackoverflow.com/questions/10838896/apache-cassandra-schema-design
