Fast way to find the number of files in one directory on Linux

孤街醉人 提交于 2019-12-07 16:51:41

问题


I am looking for a fast way to find the number of files in a directory on Linux.

Any solution that takes linear time in the number of files in the directory is NOT acceptable (e.g. "ls | wc -l" and similar things) because it would take a prohibitively long amount of time (there are tens or maybe hundreds of millions of files in the directory).

I'm sure the number of files in the directory must be stored as a simple number somewhere in the filesystem structure (inode perhaps?), as part of the data structure used to store the directory entries - how can I get to this number?

Edit: The filesystem is ext3. If there is no portable way of doing this, I am willing to do something specific to ext3.


回答1:


Why should the data structure contain the number? A tree doesn't need to know its size in O(1), unless it's a requirement (and providing that, could require more locking and possibly a performance bottleneck)

By tree I don't mean including subdir contents, but files with -maxdepth 1 -- supposing they are not really stored as a list..

edit: ext2 stored them as a linked list.

modern ext3 implements hashed B-Trees

Having said that, /bin/ls does a lot more than counting, and actually scans all the inodes. Write your own C program or script using opendir() and readdir().

from here:

#include <stdio.h>
#include <sys/types.h>
#include <dirent.h>
int main()
{
        int count;
        struct DIR *d;
        if( (d = opendir(".")) != NULL)
        {
                for(count = 0;  readdir(d) != NULL; count++);
                closedir(d);
        }
        printf("\n %d", count);
        return 0;
}



回答2:


You can use inotify to track and record file create and unlink events in the monitored directory. It would distribute the total time required to maintain file count and allow you to retrieve the current file count instantaneously.




回答3:


The inode for the directory does not store the number of files in it, since usually the file count is not needed separately from the list of names in the directory. The directory inode's link count does indirectly give the number of sub-directories (st_nlink is number of sub-dirs plus two).

I think you have no choice except read through the whole list of files in the directory. find might or might not be faster than ls.

This is an example of why large directories are a problem, even when the directory is implemented using a B-tree.




回答4:


There's no portable way to do this. The low-level file primitives, i.e. readdir, work as if it's a linear list. Clearly, that's an abstraction, and some filesystems might store a count. However, accessing it is inherently filesystem-specific.




回答5:


If you are willing to jump through hoops you may have each directory in a different filesystem, use quotas, and get the info with the "repquota" command.



来源:https://stackoverflow.com/questions/3283582/fast-way-to-find-the-number-of-files-in-one-directory-on-linux

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!