How to improve searching with os.walk and fnmatch

Submitted by 一曲冷凌霜 on 2019-12-01 20:09:20

I'm not one of those regex maniacs who always resort to the re hammer to solve every problem, but this actually ran a wee bit over twice as fast in my tests as your fnmatch version:

import os
import re

matches = []

img_re = re.compile(r'.+\.(jpg|png|jpeg|tif|tiff)$', re.IGNORECASE)

for root, dirnames, filenames in os.walk(r"C:\windows"):
    matches.extend(os.path.join(root, name) for name in filenames if img_re.match(name))
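To see what that pattern accepts before pointing it at a whole drive, it can be tried on a handful of names. This is only a small sketch; the file names in `samples` are made up for illustration and do not come from the question.

```python
import re

# Same pattern as above: at least one character, a dot, then an image extension.
img_re = re.compile(r'.+\.(jpg|png|jpeg|tif|tiff)$', re.IGNORECASE)

# Hypothetical file names, invented for this check.
samples = ["photo.JPG", "scan.tiff", "notes.txt", ".png", "archive.jpg.bak"]

# re.IGNORECASE lets "photo.JPG" through; ".png" fails because ".+" needs
# at least one character before the dot; "archive.jpg.bak" fails on "$".
image_names = [name for name in samples if img_re.match(name)]
print(image_names)
```

Since the pattern is compiled once outside the loop, each file name costs a single match call.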

The Python looks pretty much ok to me.

You could experiment with

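import fnmatch
import os
extensions = ['*.jpg', '*.jpeg', '*.png', '*.tif', '*.tiff']  # assumed: glob patterns as in the question
matches = []
for root, dirnames, filenames in os.walk("C:\\"):
    for extension in extensions:
        matches.extend(os.path.join(root, filename) for filename
                       in fnmatch.filter(filenames, extension))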

If that does not make a difference (I suspect it will not), your hard disk has probably become the bottleneck (remember, disk == slow, and you're walking and listing the files of every directory on the drive).

If the hard disk is the bottleneck, the results from multiple dir /s ... commands should not be dramatically faster than the Python solution.
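One rough way to check the disk theory from inside Python is to time a bare os.walk pass against a pass that also filters the names. The sketch below assumes nothing about the original script: `time_walk`, `is_image` and `EXTNS` are names introduced here for illustration, and the starting directory is a placeholder.

```python
import os
import time

EXTNS = ('.jpg', '.jpeg', '.png', '.tif', '.tiff')


def is_image(name):
    # Case-insensitive extension test (illustrative helper).
    return name.lower().endswith(EXTNS)


def time_walk(top, match=None):
    """Walk `top`; return (elapsed seconds, files seen, matching paths)."""
    start = time.perf_counter()
    total = 0
    matches = []
    for root, dirnames, filenames in os.walk(top):
        total += len(filenames)
        if match is not None:
            matches.extend(os.path.join(root, fn) for fn in filenames if match(fn))
    return time.perf_counter() - start, total, matches


if __name__ == "__main__":
    top = os.getcwd()  # placeholder; substitute the tree you actually scan
    time_walk(top)     # warm-up pass so both timed passes see a similar OS cache
    bare, total, _ = time_walk(top)
    filtered, _, found = time_walk(top, is_image)
    print("files seen:", total, "images:", len(found))
    print("walk only: %.3fs, walk + filter: %.3fs" % (bare, filtered))
```

If the two timings come out close, the traversal itself dominates and swapping the matching method will not buy much. Note that after the warm-up pass the directory data may be served from the OS cache, so a first run on a cold disk can look very different.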

import os
extns = ('.jpg', '.jpeg', '.png', '.tif', '.tiff')
matches = []
for root, dirnames, fns in os.walk("C:\\"):
    matches.extend(
        os.path.join(root, fn) for fn in fns if fn.lower().endswith(extns)
    )
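To compare the matching strategies without the disk in the picture, they can be timed on an in-memory list of names. This is a sketch under stated assumptions: the names are synthetic, the glob patterns for fnmatch are derived from `extns` rather than taken from the question, and the function names are made up here. Actual timings will vary by machine, so none are quoted.

```python
import fnmatch
import re
import timeit

extns = ('.jpg', '.jpeg', '.png', '.tif', '.tiff')
patterns = ['*' + e for e in extns]  # assumed glob equivalents of extns
img_re = re.compile(r'.+\.(jpg|png|jpeg|tif|tiff)$', re.IGNORECASE)

# Synthetic names; image extensions are kept lowercase because fnmatch
# is case-sensitive on POSIX but not on Windows.
names = ["file%d.%s" % (i, ext)
         for i in range(1000)
         for ext in ("jpg", "TXT", "png", "dll", "tiff")]


def by_fnmatch(names):
    # One fnmatch.filter pass over the list per pattern.
    return [n for p in patterns for n in fnmatch.filter(names, p)]


def by_regex(names):
    return [n for n in names if img_re.match(n)]


def by_endswith(names):
    # str.endswith accepts a tuple of suffixes.
    return [n for n in names if n.lower().endswith(extns)]


if __name__ == "__main__":
    for func in (by_fnmatch, by_regex, by_endswith):
        seconds = timeit.timeit(lambda: func(names), number=20)
        print("%-12s %.4fs" % (func.__name__, seconds))
```

All three functions select the same names on this data, so any difference in the printed times comes from the matching method alone.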