Quicker to os.walk or glob?

情深已故 2020-12-23 10:06

I'm messing around with file lookups in Python on a large hard disk. I've been looking at os.walk and glob. I usually use os.walk as I find it much neater and seems to b

4 answers
  •  庸人自扰
    2020-12-23 10:32

    I ran a quick benchmark on a small cache of web pages stored in 1000 directories. The task was to count the total number of files in those directories. The output is:

    os.listdir: 0.7268s, 1326786 files found
    os.walk: 3.6592s, 1326787 files found
    glob.glob: 2.0133s, 1326786 files found
    

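    As you can see, os.listdir is the quickest of the three, and glob.glob is still quicker than os.walk for this task. (os.walk reports one extra file, presumably because it also counts files sitting in the top-level directory itself, such as the benchmark script.)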

    The source:

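    import glob
    import os
    import time
    
    # Run from the directory that holds the subdirectories named 0 .. 999.
    
    # os.listdir: list each numbered subdirectory directly.
    n, t = 0, time.time()
    for i in range(1000):
        n += len(os.listdir("./%d" % i))
    t = time.time() - t
    print("os.listdir: %.4fs, %d files found" % (t, n))
    
    # os.walk: recurse from the current directory (also sees files in "./" itself).
    n, t = 0, time.time()
    for root, dirs, files in os.walk("./"):
        for name in files:
            n += 1
    t = time.time() - t
    print("os.walk: %.4fs, %d files found" % (t, n))
    
    # glob.glob: expand a wildcard pattern inside each numbered subdirectory.
    n, t = 0, time.time()
    for i in range(1000):
        n += len(glob.glob("./%d/*" % i))
    t = time.time() - t
    print("glob.glob: %.4fs, %d files found" % (t, n))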
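For anyone who wants to repeat the comparison without a 1000-directory cache at hand, here is a rough, self-contained Python 3 sketch of the same idea. It builds a small throwaway tree and times each approach on it. The helper names (make_tree, count_listdir, count_walk, count_glob, count_scandir, benchmark) and the tree size are my own assumptions for illustration, not part of the measurements above. os.scandir is added only as a fourth data point, since os.walk is built on it from Python 3.5 onward; absolute timings will depend heavily on the filesystem and OS cache, so treat the numbers as indicative at best.

```python
import glob
import os
import tempfile
import time


def make_tree(root, n_dirs=20, n_files=50):
    """Create n_dirs numbered subdirectories, each holding n_files empty files."""
    for i in range(n_dirs):
        sub = os.path.join(root, str(i))
        os.mkdir(sub)
        for j in range(n_files):
            open(os.path.join(sub, "page%d.html" % j), "w").close()


def count_listdir(root):
    """Count entries one level below root with os.listdir."""
    total = 0
    for name in os.listdir(root):
        sub = os.path.join(root, name)
        if os.path.isdir(sub):
            total += len(os.listdir(sub))
    return total


def count_walk(root):
    """Count files anywhere under root with os.walk (includes root itself)."""
    return sum(len(files) for _, _, files in os.walk(root))


def count_glob(root):
    """Count entries one level below root with glob (skips dot-files)."""
    return len(glob.glob(os.path.join(root, "*", "*")))


def count_scandir(root):
    """Count regular files one level below root with os.scandir."""
    total = 0
    with os.scandir(root) as top:
        for entry in top:
            if entry.is_dir():
                with os.scandir(entry.path) as children:
                    total += sum(1 for child in children if child.is_file())
    return total


def benchmark(root):
    """Return {name: (seconds, file_count)} for each counting approach."""
    results = {}
    for name, func in [
        ("os.listdir", count_listdir),
        ("os.walk", count_walk),
        ("glob.glob", count_glob),
        ("os.scandir", count_scandir),
    ]:
        start = time.perf_counter()
        found = func(root)
        results[name] = (time.perf_counter() - start, found)
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        make_tree(tmp)
        for name, (seconds, found) in benchmark(tmp).items():
            print("%s: %.4fs, %d files found" % (name, seconds, found))
```

Note that the four functions are not strictly equivalent: os.walk descends the whole tree and separates files from directories, glob ignores names starting with a dot, and the listdir version counts every entry in a subdirectory regardless of type. On a flat tree of plain files like this one they should all agree on the count.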
    
