I get a folder with 1 million files in it.
I would like to begin process immediately, when listing files in this folder, in Python or other script langage.
T
For people coming in off Google, PEP 471 added a proper solution to the Python 3.5 standard library and it got backported to Python 2.6+ and 3.2+ as the scandir module on PIP.
Source: https://stackoverflow.com/a/34922054/435253
Python 3.5+:
os.walk has been updated to use this infrastructure for better performance.os.scandir returns an iterator over DirEntry objects.Python 2.6/2.7 and 3.2/3.3/3.4:
scandir.walk is a more performant version of os.walkscandir.scandir returns an iterator over DirEntry objects.The scandir() iterators wrap opendir/readdir on POSIX platforms and FindFirstFileW/FindNextFileW on Windows.
The point of returning DirEntry objects is to allow metadata to be cached to minimize the number of system calls made. (eg. DirEntry.stat(follow_symlinks=False) never makes a system call on Windows because the FindFirstFileW and FindNextFileW functions throw in stat information for free)
Source: https://docs.python.org/3/library/os.html#os.scandir