I\'m trying to write a simple Python script that will copy a index.tpl to index.html in all of the subdirectories (with a few exceptions).
I\'m getting bogged down
Check "Getting a list of all subdirectories in the current directory".
Here's a Python 3 version:
import os
dir_list = next(os.walk('.'))[1]
print(dir_list)
def get_folders_in_directories_recursively(directory, index=0):
folder_list = list()
parent_directory = directory
for path, subdirs, _ in os.walk(directory):
if not index:
for sdirs in subdirs:
folder_path = "{}/{}".format(path, sdirs)
folder_list.append(folder_path)
elif path[len(parent_directory):].count('/') + 1 == index:
for sdirs in subdirs:
folder_path = "{}/{}".format(path, sdirs)
folder_list.append(folder_path)
return folder_list
The following function can be called as:
get_folders_in_directories_recursively(directory, index=1) -> gives the list of folders in first level
get_folders_in_directories_recursively(directory) -> gives all the sub folders
import os, os.path
To get (full-path) immediate sub-directories in a directory:
def SubDirPath (d):
return filter(os.path.isdir, [os.path.join(d,f) for f in os.listdir(d)])
To get the latest (newest) sub-directory:
def LatestDirectory (d):
return max(SubDirPath(d), key=os.path.getmtime)
This method nicely does it all in one go.
from glob import glob
subd = [s.rstrip("/") for s in glob(parent_dir+"*/")]
I did some speed testing on various functions to return the full path to all current subdirectories.
tl;dr:
Always use scandir
:
list_subfolders_with_paths = [f.path for f in os.scandir(path) if f.is_dir()]
Bonus: With scandir
you can also simply only get folder names by using f.name
instead of f.path
.
This (as well as all other functions below) will not use natural sorting. This means results will be sorted like this: 1, 10, 2. To get natural sorting (1, 2, 10), please have a look at https://stackoverflow.com/a/48030307/2441026
Results:
scandir
is: 3x faster than walk
, 32x faster than listdir
(with filter), 35x faster than Pathlib
and 36x faster than listdir
and 37x (!) faster than glob
.
Scandir: 0.977
Walk: 3.011
Listdir (filter): 31.288
Pathlib: 34.075
Listdir: 35.501
Glob: 36.277
Tested with W7x64, Python 3.8.1. Folder with 440 subfolders.
In case you wonder if listdir
could be speed up by not doing os.path.join() twice, yes, but the difference is basically nonexistent.
Code:
import os
import pathlib
import timeit
import glob
path = r"<example_path>"
def a():
list_subfolders_with_paths = [f.path for f in os.scandir(path) if f.is_dir()]
# print(len(list_subfolders_with_paths))
def b():
list_subfolders_with_paths = [os.path.join(path, f) for f in os.listdir(path) if os.path.isdir(os.path.join(path, f))]
# print(len(list_subfolders_with_paths))
def c():
list_subfolders_with_paths = []
for root, dirs, files in os.walk(path):
for dir in dirs:
list_subfolders_with_paths.append( os.path.join(root, dir) )
break
# print(len(list_subfolders_with_paths))
def d():
list_subfolders_with_paths = glob.glob(path + '/*/')
# print(len(list_subfolders_with_paths))
def e():
list_subfolders_with_paths = list(filter(os.path.isdir, [os.path.join(path, f) for f in os.listdir(path)]))
# print(len(list(list_subfolders_with_paths)))
def f():
p = pathlib.Path(path)
list_subfolders_with_paths = [x for x in p.iterdir() if x.is_dir()]
# print(len(list_subfolders_with_paths))
print(f"Scandir: {timeit.timeit(a, number=1000):.3f}")
print(f"Listdir: {timeit.timeit(b, number=1000):.3f}")
print(f"Walk: {timeit.timeit(c, number=1000):.3f}")
print(f"Glob: {timeit.timeit(d, number=1000):.3f}")
print(f"Listdir (filter): {timeit.timeit(e, number=1000):.3f}")
print(f"Pathlib: {timeit.timeit(f, number=1000):.3f}")
I just wrote some code to move vmware virtual machines around, and ended up using os.path
and shutil
to accomplish file copying between sub-directories.
def copy_client_files (file_src, file_dst):
for file in os.listdir(file_src):
print "Copying file: %s" % file
shutil.copy(os.path.join(file_src, file), os.path.join(file_dst, file))
It's not terribly elegant, but it does work.