Question
I have a script which looks for a file using a regular expression. The code was the following:
find $dir | grep "$regex"
The script runs a bit too slowly, so I want to optimise the search. I tried this instead:
find $dir -regex ".*${regex}.*"
I was expecting slightly faster results as no extra process is created to parse the regular expression.
However, the result was the opposite: to my astonishment, "find | grep" is faster than "find -regex" (although it takes more system time, as one would expect).
I've timed this behaviour:
Find | grep result
real 0m12.467s
user 0m2.568s
sys 0m7.260s
Find -regex result
real 0m16.778s
user 0m6.772s
sys 0m6.380s
Do you have any idea why the find -regex solution is slower?
Answer 1:
Most likely because grep and its regex engine have been heavily optimized over many years, since matching is grep's only purpose ("do one thing and do it well"). I don't know which regex engine find uses, but judging by your timings it is not as refined as grep's, probably because -regex is a less frequently used secondary feature.
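Before comparing timings, it may be worth checking that both variants select the same files, since find -regex has to match the whole path and GNU find defaults to a different regex syntax than grep. The following is a minimal sketch, assuming GNU find (for -regextype); demo_dir, the sample files and the pattern are made up for illustration and are not from the question.

```shell
# Sketch only: assumes GNU find (-regextype) and a POSIX shell.
# demo_dir, the sample files and the pattern are invented for illustration.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
touch "$demo_dir/a.log" "$demo_dir/sub/b.log" "$demo_dir/sub/c.txt"
regex='[ab]\.log'

# Variant 1: find lists every path, grep filters (basic regex syntax).
via_grep=$(find "$demo_dir" | grep -- "$regex" | sort)

# Variant 2: find filters by itself. -regex must match the whole path, hence
# the surrounding ".*"; -regextype posix-basic selects grep's default syntax.
via_find=$(find "$demo_dir" -regextype posix-basic -regex ".*${regex}.*" | sort)

if [ "$via_grep" = "$via_find" ]; then
    echo "both variants select the same paths"
else
    echo "the variants differ"
fi

rm -rf "$demo_dir"
```

Once the two lists agree, prefixing each pipeline with the shell's time keyword on the real directory gives numbers comparable to the ones in the question.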
Also, if you do anything with this file list, you should use a whitespace-safe approach. GNU grep can read and write null-delimited records with -z (--null-data), so find [...] -print0 | grep -z [...] stays safe; if GNU grep is not available, use find [...] -regex [...] -print0 even though it's slower.
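As a sketch of the whitespace-safe handling mentioned above: with GNU tools the faster find | grep form can be kept by passing NUL-terminated names through the whole pipeline. This assumes GNU find, GNU grep (-z) and an xargs that supports -0; demo_dir and the file names are hypothetical.

```shell
# Sketch only: assumes GNU find/grep and an xargs with -0 support.
# demo_dir and the file names below are invented for this illustration.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.log" "$demo_dir/with space.log" "$demo_dir/other.txt"
regex='\.log$'

# find ends each path with a NUL byte; grep -z reads and writes NUL-terminated
# records; xargs -0 splits on NUL, so "with space.log" stays one argument.
found=$(find "$demo_dir" -type f -print0 |
    grep -z -- "$regex" |
    xargs -0 -n 1 basename |
    sort)

printf '%s\n' "$found"
rm -rf "$demo_dir"
```

Here basename merely stands in for whatever command consumes the file list; any command placed after xargs -0 receives each path as a single argument.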
Source: https://stackoverflow.com/questions/10431331/find-regex-is-slower-than-find-grep