Why HDFS is write once and read multiple times?

难免孤独 · 2020-12-11 17:25

I am new to Hadoop.

While reading about Apache HDFS, I learned that it is a write-once file system, although some distributions (e.g. Cloudera) provide an append feature. Why was HDFS designed this way?

4 Answers
  •  暗喜 (OP) · 2020-12-11 17:50

    Though this design decision does impose restrictions, HDFS was built with efficient streaming data access in mind. Quoting from Hadoop: The Definitive Guide:

    HDFS is built around the idea that the most efficient data processing pattern is a write-once, read-many-times pattern. A dataset is typically generated or copied from source, and then various analyses are performed on that dataset over time. Each analysis will involve a large proportion, if not all, of the dataset, so the time to read the whole dataset is more important than the latency in reading the first record.
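    A minimal sketch of how this design shows up at the API level, assuming a reachable HDFS cluster and a hypothetical path /tmp/events.log (append() may be unsupported depending on the HDFS version and configuration):

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FSDataInputStream;
        import org.apache.hadoop.fs.FSDataOutputStream;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.io.IOUtils;

        public class WriteOnceReadMany {
            public static void main(String[] args) throws Exception {
                Configuration conf = new Configuration();
                FileSystem fs = FileSystem.get(conf);
                Path file = new Path("/tmp/events.log"); // hypothetical path

                // Write the file once, in a single streaming pass.
                try (FSDataOutputStream out = fs.create(file, true /* overwrite */)) {
                    out.writeBytes("record 1\n");
                    out.writeBytes("record 2\n");
                }

                // Read it back as a stream; this is the cheap, common path that
                // the design optimizes for, and it can be repeated many times.
                try (FSDataInputStream in = fs.open(file)) {
                    IOUtils.copyBytes(in, System.out, 4096, false);
                }

                // There is no API for random in-place updates. The closest
                // operation is append(), which only adds data at the end of the
                // file and may throw an exception where it is not supported.
                try (FSDataOutputStream out = fs.append(file)) {
                    out.writeBytes("record 3\n");
                }
            }
        }

    Note that the FileSystem interface simply has no call for modifying bytes in the middle of an existing file, which is exactly the write-once restriction the book describes.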
