Benchmarks: does Python have a faster way of walking a network folder?

爱一瞬间的悲伤 2020-12-24 12:24

I need to walk through a folder with approximately ten thousand files. My old VBScript is very slow at handling this. I've started using Ruby and Python since then,

2 Answers
  •  无人及你
    2020-12-24 12:50

    I set up a directory structure locally with the following:

    for i in $(seq 1 4500); do
        if [[ $i -lt 100 ]]; then
            dir="$(for j in $(seq 1 $i); do echo -n $i/;done)"
            mkdir -p "$dir"
            touch ${dir}$i
        else
            touch $i
        fi
    done
    

    This creates 99 files with paths that are 1-99 levels deep and 4401 files in the root of the directory structure.
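For anyone who wants to rebuild the same layout without a shell, here is a rough Python sketch of the loop above. The names `build_fixture` and `count_files` are made up for this illustration and are not part of the original benchmark; the counts it prints are a quick way to check the 99 / 4401 split.

```python
import os
import tempfile

def build_fixture(root, total=4500, nested_below=100):
    """Create files 1..total under root; file i < nested_below sits i levels deep."""
    for i in range(1, total + 1):
        name = str(i)
        if i < nested_below:
            # e.g. i=3 -> root/3/3/3/3 (three directories, then the file)
            directory = os.path.join(root, *([name] * i))
            os.makedirs(directory, exist_ok=True)
        else:
            directory = root
        # Create an empty file, like `touch`
        open(os.path.join(directory, name), "a").close()

def count_files(root):
    """Return (files directly in root, files in subdirectories)."""
    in_root = nested = 0
    for current, _dirs, files in os.walk(root):
        if current == root:
            in_root += len(files)
        else:
            nested += len(files)
    return in_root, nested

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_fixture(tmp)
        print(count_files(tmp))  # expected: (4401, 99)
```

It builds the tree in a temporary directory so nothing is left behind; pass a real path to `build_fixture` to keep the files for benchmarking.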

    I used the following Ruby script:

    #!/usr/bin/env ruby
    require 'benchmark'
    
    def recursive(path, bench)
      bench.report(path) do
        Dir["#{path}/**/**"]
      end
    end
    
    path = 'files'
    Benchmark.bm {|bench| recursive(path, bench)}
    

    I got the following result:

               user     system      total        real
        files/  0.030000   0.090000   0.120000 (  0.108562)
    

    I used the following Python script, which uses os.walk:

    #!/usr/bin/env python
    
    import os
    import timeit
    
    def path_recurse(path):
        # Yield every directory and file path found below `path`.
        for (root, dirs, files) in os.walk(path):
            for folder in dirs:
                yield '{}/{}'.format(root, folder)
            for filename in files:
                yield '{}/{}'.format(root, filename)
    
    if __name__ == '__main__':
        path = 'files'
        print(timeit.timeit('[i for i in path_recurse("'+path+'")]', setup="from __main__ import path_recurse", number=1))
    

    I got the following result:

        0.250478029251
    

    So it looks like Ruby still performs better here: about 0.11 s of wall-clock time versus about 0.25 s for Python, in a single local run each. It'd be interesting to see how these scripts perform on your fileset on the network share.

    It would probably also be interesting to see the Python script run on Python 3, on Jython, and maybe even on PyPy.
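As a starting point for the Python 3 run, here is a hedged sketch that times os.walk against a hand-written traversal using os.scandir. It was not part of the measurements above and comes with no timings. The function names are invented for this example, and the `with os.scandir(...)` form assumes Python 3.6 or later. On Python 3.5 and later, os.walk is itself built on os.scandir, so any difference between the two may be small.

```python
import os
import timeit

def walk_paths(top):
    """Yield every directory and file path under top using os.walk."""
    for root, dirs, files in os.walk(top):
        for name in dirs:
            yield os.path.join(root, name)
        for name in files:
            yield os.path.join(root, name)

def scandir_paths(top):
    """Yield every directory and file path under top using os.scandir directly."""
    stack = [top]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                yield entry.path
                # Do not descend into symlinked directories (matches os.walk's default)
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)

def best_time(func, top, repeat=5):
    """Return the fastest of `repeat` single full traversals, in seconds."""
    return min(timeit.repeat(lambda: list(func(top)), number=1, repeat=repeat))

if __name__ == "__main__":
    path = "files"  # same directory name as in the scripts above
    if os.path.isdir(path):
        for func in (walk_paths, scandir_paths):
            print(func.__name__, best_time(func, path))
```

Taking the minimum of several runs reduces noise from caching and other processes. On a network share the first (cold) run may be the more relevant number, so it is worth recording both.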
