Question
I am looking for the most efficient way to store file (and folder) paths in a DB (close to a million records) so that I can search them by name.
Saving the full path of every item is obviously wasteful, because it repeats the same prefixes over and over, for example:
C:\Windows
C:\Windows\System
C:\Windows\System\Notepad.exe
For this, I built a simple hierarchical DB, with just three fields:
ID  Name         Parent
0   C:           null
1   Windows      0
2   System       1
3   Notepad.exe  2
I recover an item's full path with a recursive WITH statement.
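For reference, here is a minimal sketch of the setup described above, using Python's built-in sqlite3 module. The table name `items`, the lowercase column names, and the helper names `full_path` and `search` are assumptions made for illustration; only the three fields and the recursive WITH approach come from my actual schema.

```python
import sqlite3

# Assumed table and column names; the real schema only has the
# three fields ID, Name and Parent.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "parent INTEGER REFERENCES items(id))"
)
conn.executemany(
    "INSERT INTO items (id, name, parent) VALUES (?, ?, ?)",
    [(0, "C:", None), (1, "Windows", 0), (2, "System", 1), (3, "Notepad.exe", 2)],
)

# Walk from the item up to the root, tracking the distance from the item,
# then emit the names root-first.
PATH_QUERY = """
WITH RECURSIVE chain(id, name, parent, depth) AS (
    SELECT id, name, parent, 0 FROM items WHERE id = ?
    UNION ALL
    SELECT i.id, i.name, i.parent, c.depth + 1
    FROM items AS i JOIN chain AS c ON i.id = c.parent
)
SELECT name FROM chain ORDER BY depth DESC
"""


def full_path(conn, item_id):
    """Rebuild the backslash-separated path of one item."""
    names = [row[0] for row in conn.execute(PATH_QUERY, (item_id,))]
    return "\\".join(names)


def search(conn, fragment):
    """Return full paths of items whose name contains `fragment`.

    Note: % and _ inside `fragment` act as LIKE wildcards in this sketch.
    """
    rows = conn.execute(
        "SELECT id FROM items WHERE name LIKE ? ORDER BY id",
        (f"%{fragment}%",),
    ).fetchall()
    return [full_path(conn, row[0]) for row in rows]


print(search(conn, "note"))
```

Each name is stored exactly once, so the size of the table is driven mostly by the total length of the `name` values plus the per-row overhead of the two integer columns.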
The performance is entirely satisfactory, but I am not happy with the DB size. With SQLite, the size is about what you would expect if you estimate it from the average string length in the "Name" column. However, I was shocked when I compared it with the DB of the well-known file search utility "Everything": for the same number of records (as reported by Everything), their DB is 3-4 times smaller!
Since the number of records could grow to 10 million in the future, I suspect there must be a more efficient way to store this data. Any ideas?
P.S. This is not about which DB vendor to use, or about whether a 500 MB DB for 10 million records would still be acceptable. It is a conceptual question: is there a more efficient way?
Source: https://stackoverflow.com/questions/35839642/storing-large-amount-of-file-paths-in-db