Does anyone know how many bytes each file occupies in the HDFS NameNode? I want to estimate how many files a single NameNode with 32 GB of memory can store.
Cloudera recommends 1 GB of NameNode heap space per million blocks. 1 GB for every million files is less conservative but should work too.
Also, you don't need to multiply by the replication factor; the accepted answer is wrong about that.
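As a rough sketch of that rule of thumb applied to the 32 GB case from the question: the function names here are made up for illustration, and it assumes the whole 32 GB is available as NameNode heap, which in practice it will not be, since the OS and other processes need memory too.

```python
# Rule of thumb quoted above: about 1 GB of NameNode heap per million blocks.
BLOCKS_PER_GB_HEAP = 1_000_000


def max_blocks_for_heap(heap_gb):
    """Approximate number of blocks a NameNode heap of heap_gb can track."""
    return int(heap_gb * BLOCKS_PER_GB_HEAP)


def heap_gb_for_blocks(num_blocks):
    """Approximate heap (in GB) needed to track num_blocks blocks."""
    return num_blocks / BLOCKS_PER_GB_HEAP


# Replication factor is deliberately not a parameter: it does not
# change the number of blocks the NameNode tracks.
print(max_blocks_for_heap(32))
```

So a 32 GB heap is on the order of 32 million blocks. How many files that is depends on how many blocks each file spans.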
Using the default block size of 128 MB, a file of 192 MB is split into two block files, one 128 MB file and one 64 MB file. On the NameNode, namespace objects are measured by the number of files and blocks. The same 192 MB file is represented by three namespace objects (1 file inode + 2 blocks) and consumes approximately 450 bytes of memory.
One data file of 128 MB is represented by two namespace objects on the NameNode (1 file inode + 1 block) and consumes approximately 300 bytes of memory. By contrast, 128 files of 1 MB each are represented by 256 namespace objects (128 file inodes + 128 blocks) and consume approximately 38,400 bytes.
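The figures above all work out to roughly 150 bytes per namespace object (450 / 3, 300 / 2, 38,400 / 256). A minimal sketch of that arithmetic, with the 150-byte constant derived from those numbers rather than taken from any HDFS API, and with hypothetical function names:

```python
import math

# Assumption: ~150 bytes per namespace object, derived from the
# examples above (450 bytes / 3 objects, 300 bytes / 2 objects).
BYTES_PER_OBJECT = 150
DEFAULT_BLOCK_MB = 128


def namespace_objects(file_size_mb, block_size_mb=DEFAULT_BLOCK_MB):
    """One inode for the file plus one object per block it is split into."""
    blocks = math.ceil(file_size_mb / block_size_mb)
    return 1 + blocks


def namenode_bytes(file_sizes_mb, block_size_mb=DEFAULT_BLOCK_MB):
    """Approximate NameNode memory for a list of file sizes given in MB."""
    objects = sum(namespace_objects(size, block_size_mb) for size in file_sizes_mb)
    return objects * BYTES_PER_OBJECT


print(namenode_bytes([192]))      # one 192 MB file
print(namenode_bytes([128]))      # one 128 MB file
print(namenode_bytes([1] * 128))  # 128 files of 1 MB each
```

The same 128 MB of data costs far more NameNode memory as many small files than as one large file, which is why the file count and sizes matter more than the total data volume.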
Replication affects disk space but not memory consumption. Replication changes the amount of storage required for each block but not the number of blocks. If one block file on a DataNode, represented by one block on the NameNode, is replicated three times, the number of block files is tripled but not the number of blocks that represent them.
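To illustrate the point about replication, here is a small sketch (the function name is hypothetical, and the default block size of 128 MB is assumed) that separates what the NameNode counts from what the DataNodes store:

```python
import math


def block_counts(file_size_mb, replication=3, block_size_mb=128):
    """Return (namenode_blocks, datanode_block_files, disk_mb) for one file."""
    # Blocks tracked in NameNode memory: independent of replication.
    namenode_blocks = math.ceil(file_size_mb / block_size_mb)
    # Block files physically stored across DataNodes: scale with replication.
    datanode_block_files = namenode_blocks * replication
    # Raw disk space consumed across the cluster.
    disk_mb = file_size_mb * replication
    return namenode_blocks, datanode_block_files, disk_mb


print(block_counts(192, replication=1))
print(block_counts(192, replication=3))
```

Only the second and third values change with the replication factor; the first, which is what drives NameNode memory, stays the same.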
Examples:
More examples are available in the original article from Cloudera.