I have started to look into Hadoop. If my understanding is right, I could process a very big file and it would get split over different nodes; however, if the file is compressed t
You can use bz2 (bzip2) as your compression codec; unlike gzip, this format can be split, so different nodes can work on different parts of the same compressed file.
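If it helps to see why bz2 can be split, here is a rough Python sketch. It is not Hadoop code, and the helper name `count_block_markers` and the sample data are made up for illustration. A bzip2 stream is a sequence of separately compressed blocks, each starting with the 48-bit marker `0x314159265359`, so a reader can scan forward to a marker and start decompressing from there. The markers are not byte-aligned, so the sketch checks all eight bit offsets.

```python
import bz2

# 48-bit magic number that starts every compressed block in a bzip2 stream.
BLOCK_MAGIC = bytes.fromhex("314159265359")


def count_block_markers(compressed: bytes) -> int:
    """Count block-start markers at any bit offset (hypothetical helper)."""
    as_int = int.from_bytes(compressed, "big")
    count = 0
    for shift in range(8):
        # Shifting left by `shift` bits makes markers that sit at that
        # bit offset byte-aligned, so a plain byte search can find them.
        shifted = (as_int << shift).to_bytes(len(compressed) + 1, "big")
        count += shifted.count(BLOCK_MAGIC)
    return count


# Sample input (an assumption for the demo): roughly 500 KB of text lines.
data = "".join(f"record {i}\n" for i in range(40000)).encode()

# compresslevel=1 uses 100 KB blocks, so this input spans several blocks.
compressed = bz2.compress(data, compresslevel=1)

markers = count_block_markers(compressed)
print(f"{len(data)} input bytes, {len(compressed)} compressed bytes, "
      f"{markers} block markers found")
```

Each marker is a point where a split could begin. A gzip file has no such resynchronization points, which is why a single gzip file ends up being processed by one mapper.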