I am a new learner of Hadoop.
While reading about Apache HDFS I learned that HDFS is write once file system. Some other distributions ( Cloudera) provides append fea
Though this design decision does impose restrictions, HDFS was built keeping in mind efficient streaming data access. Quoting from Hadoop - The Definitive Guide:
HDFS is built around the idea that the most efficient data processing pattern is a write-once, read-many-times pattern. A dataset is typically generated or copied from source, and then various analyses are performed on that dataset over time. Each analysis will involve a large proportion, if not all, of the dataset, so the time to read the whole dataset is more important than the latency in reading the first record.