Column-family concept and data model

前端 未结 3 1208
迷失自我
迷失自我 2020-12-30 23:27

I\'m investigating the different types of NoSQL database types and I\'m trying to wrap my head around the data model of column-family stores, such as Bigtable, HBase and Cas

3条回答
  •  误落风尘
    2020-12-30 23:35

    The Cassandra database follows your first model, I think. A ColumnFamily is a collection of rows, which can contain any columns, in a sparse fashion (so each row can have different collection of column names, if desired). The number of columns allowed in a row is almost unlimited (2 billion in Cassandra v0.7).

    A key point is that row keys must be unique within a column family, by definition - but can be re-used in other column families. So you can store unrelated data about the same key in different ColumnFamilies.

    In Cassandra this matters because the data in a particular column family is stored in the same files on disk - so it is more efficient to place data items that are likely to be retrieved together, in the same ColumnFamily. This is partly a practical speed concern, but also a matter of organising your data into a clear schema. This touches upon your second definition - one might consider all the data about a particular key to be a "row", but partitioned by Column Family. However, in Cassandra it is not really a single row, because the data in one ColumnFamily can be changed independently of the data in other ColumnFamilies for the same row key.

提交回复
热议问题