I know that HDFS is write-once, read-many.
Suppose I want to update a file in HDFS; is there any way to do it?
Thank you in advance!
If you only want to add lines, you can append the contents of a local file to the existing HDFS file:
hdfs dfs -appendToFile localfile /user/hadoop/hadoopfile
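A minimal end-to-end sketch of the append workflow, assuming a running HDFS cluster; the file names below are the illustrative ones from the command above:

```shell
# Create a local file whose content we want to add.
echo "new log line" > localfile

# Append its content to the end of the existing HDFS file.
hdfs dfs -appendToFile localfile /user/hadoop/hadoopfile

# Verify that the line was appended.
hdfs dfs -cat /user/hadoop/hadoopfile
```

If the source argument is `-`, `appendToFile` reads from stdin instead of a local file.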
To modify any portion of a file that has already been written, you have three options:
Get the file from HDFS and modify its content locally
hdfs dfs -copyToLocal /hdfs/source/path /localfs/destination/path
or
hdfs dfs -cat /hdfs/source/path | modify...
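The full round trip for option 1 might look like the sketch below; the paths are the illustrative ones from above, and the `sed` edit stands in for whatever modification you actually need:

```shell
# Download the file from HDFS to the local filesystem.
hdfs dfs -copyToLocal /hdfs/source/path /localfs/destination/path

# Modify it locally (example edit: replace "old" with "new").
sed -i 's/old/new/' /localfs/destination/path

# Upload it back, overwriting the original (-f forces the overwrite).
hdfs dfs -put -f /localfs/destination/path /hdfs/source/path
```

Note that this rewrites the whole file in HDFS, so it is only practical for files small enough to pull down and push back.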
Use a processing engine such as MapReduce or Apache Spark to rewrite the data: the result appears as a new directory of files, and you then remove the old files. This is usually the best approach.
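Once the job has finished, the swap from the old files to the new ones is plain HDFS shell work. A sketch, with hypothetical paths (`/data/out` holds the old files; the job wrote its result to `/data/out_new`):

```shell
# Keep the old data aside until the new output is verified.
hdfs dfs -mv /data/out /data/out_old

# Swap the job's output into place.
hdfs dfs -mv /data/out_new /data/out

# Remove the old files once you are satisfied.
hdfs dfs -rm -r /data/out_old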
Mount HDFS via the NFS Gateway or Fuse; both support append operations.
NFS Gateway
Hadoop Fuse: mountableHDFS allows HDFS to be mounted (on most flavors of Unix) as a standard file system using the mount command. Once mounted, the user can operate on an instance of HDFS using standard Unix utilities such as ls, cd, cp, mkdir, find, grep.
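For the NFS Gateway route, a sketch of mounting and appending, assuming the gateway is already running on the NameNode host (here taken to be localhost) and you have root privileges; the mount options follow the HDFS NFS Gateway documentation:

```shell
# Create a mount point and mount the HDFS root over NFSv3.
mkdir -p /mnt/hdfs
mount -t nfs -o vers=3,proto=tcp,nolock,sync localhost:/ /mnt/hdfs

# Append through the mount with ordinary shell redirection.
echo "appended line" >> /mnt/hdfs/user/hadoop/hadoopfile
```

Remember that even through the mount, HDFS only supports appends, not random in-place writes.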