How exactly “Everything Search” can give me immediately searchable list of 2bln files on my 4TB HDD in less than 10 seconds?

问题

There's windows program "Everything Search" http://www.voidtools.com/ that reads file names of the NTFS volume faster than I assume is possible by recursive descent (it reads filenames of almost 2bln files on 4TB HDD in less than 10 seconds).

I know that it probably reads NTFS folder structure directly of the volume in bulk, and makes sense of it without calling OS filesystem functions.

How exactly can it be done? What system functions should I call to get that information about NTFS volume that fast and how can I parse it into file and directory names? Are there any libraries in any language that help with that?

If you are not sure what I am asking, there are more details in my previous question (I was asked to rephrase it): Can I read whole NTFS directory tree into RAM at once?

回答1:

The NTFS volume has a low-visiblity structure it relies on called the master file table. There are APIs for querying this table directly, but they require some privileges to invoke, because you have to get a handle to the volume. The main function to query the master file table is DeviceIOControl and the control code is FSCTL_ENUM_USN_DATA

The control code appears to be a USN-related code - which is a touch misleading in this particular case - but it will give the basic flavor of the call and related structures. You get back an enumeration of records that look like usn records, but they're thin wrappers around master file table entries.

The records each have FileName, IDs and parent IDs. The FileNames are the "local" name of the file or folder, and to get the full name, you would expect to traverse the table structure.

It is lightning fast - way faster than recursing through the file system. You'll get back (and will have to filter out) things that aren't exposed in any of the normal file APIs - things you definitely don't want to expose to users, for example.

来源：https://stackoverflow.com/questions/32552353/how-exactly-everything-search-can-give-me-immediately-searchable-list-of-2bln

标签

c++

windows

api

ntfs