I need to write data into Hadoop (HDFS) from external sources like a Windows box. Right now I have been copying the data onto the namenode and using HDFS's put command to load it into the cluster.
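For context, the programmatic equivalent of what I am doing with put, which I would ideally run from the Windows box instead of the namenode, would look something like this. A minimal sketch, assuming the Hadoop client jars are on the classpath; the namenode URI and both paths are placeholders:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsPut {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // hdfs://namenode-host:8020 is a placeholder for the cluster's namenode URI
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode-host:8020"), conf);
        // Equivalent of `hadoop fs -put`, but runs on any machine (e.g. a
        // Windows box) that can reach the namenode and the datanodes
        fs.copyFromLocalFile(new Path("C:/data/local.txt"), new Path("/data/remote.txt"));
        fs.close();
    }
}
```

As far as I understand, this is essentially what the put shell command does internally; the only requirement is that the client machine can reach the namenode and the datanodes.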
It seems there is now a dedicated page for this at http://wiki.apache.org/hadoop/MountableHDFS:
These projects (enumerated below) allow HDFS to be mounted (on most flavors of Unix) as a standard file system using the mount command. Once mounted, the user can operate on an instance of hdfs using standard Unix utilities such as 'ls', 'cd', 'cp', 'mkdir', 'find', 'grep', or use standard Posix libraries like open, write, read, close from C, C++, Python, Ruby, Perl, Java, bash, etc.
Later it describes these projects:
- contrib/fuse-dfs is built on fuse, some C glue, libhdfs and the hadoop-dev.jar
- fuse-j-hdfs is built on fuse, fuse for java, and the hadoop-dev.jar
- hdfs-fuse - a Google Code project, very similar to contrib/fuse-dfs
- webdav - HDFS exposed as a WebDAV resource
- MapR - contains a closed-source HDFS-compatible file system that supports read/write NFS access
- HDFS NFS Proxy - exports HDFS as NFS without using FUSE. Supports Kerberos and re-orders writes so they are written to HDFS sequentially.
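Once any of these mounts is in place, the wiki's point about standard POSIX I/O means no Hadoop-specific API is needed at all. A minimal sketch, assuming HDFS is FUSE-mounted at /mnt/hdfs (a placeholder mount point):

```java
import java.io.FileWriter;
import java.io.IOException;

public class MountedWrite {
    public static void main(String[] args) throws IOException {
        // /mnt/hdfs is a placeholder; with fuse-dfs (or any of the mounts
        // above) backing it, ordinary file I/O is routed to the cluster
        try (FileWriter out = new FileWriter("/mnt/hdfs/user/me/hello.txt")) {
            out.write("written through the mount, no Hadoop API needed\n");
        }
    }
}
```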
I haven't tried any of these projects yet, but I will update the answer as soon as I do, since I have the same need as the OP.