How to wrap or embed generators?

Posted by 狂风中的少年 on 2020-12-26 07:42:08

Question


I'm trying to provide a unified interface for retrieving all files from a single directory or a list of directories.

def get_files(dir_or_dirs):
    def helper(indir):
        file_list = glob.glob("*.txt")
        for file in file_list:
            yield file

    if type(dir_or_dirs) is list:
        # a list of source dirs
        for dir in dir_or_dirs:
            yield helper(dir)
    else:
        # a single source dir
        yield helper(dir_or_dirs)

def print_all_files(file_iter):
    for file in file_iter:
        print(file)        # error here!

Questions:

  1. The error says 'file' is still a generator whether the input is a single dir or a list of dirs. Why is it still a generator?
  2. Is it possible to wrap or embed generators in functions? If so, how to make this work?

Answer 1:


You are yielding helper() each time:

yield helper(dir)

but helper() itself is a generator.
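
To see the problem in isolation, here is a minimal sketch. The names inner and outer are hypothetical stand-ins, not from the question: inner plays the role of helper(), and outer repeats the mistake of yielding the call result.

```python
import types

def inner():
    # Stand-in for helper(): a generator function producing two values.
    yield "a.txt"
    yield "b.txt"

def outer():
    # Mirrors the question's bug: this yields the generator object itself,
    # not the values that the generator would produce.
    yield inner()

items = list(outer())
# items holds exactly one element, and that element is a generator object.
is_generator = isinstance(items[0], types.GeneratorType)

# The values only appear once the inner generator is iterated as well.
flattened = [name for gen in outer() for name in gen]
```

Calling a generator function never runs its body; it only builds a generator object, which is why the consumer receives that object instead of file names.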

In Python 3.3 and newer, use yield from instead:

yield from helper(dir)

This delegates control to another generator. From the Yield expressions documentation:

When yield from <expr> is used, it treats the supplied expression as a subiterator. All values produced by that subiterator are passed directly to the caller of the current generator’s methods.

In older Python versions, including Python 2.x, use another loop:

for file in helper(dir):
    yield file
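
As a rough check that the two spellings behave the same for plain iteration, the sketch below compares them. The function names are hypothetical and chosen only for illustration.

```python
def numbers():
    # A small subgenerator to delegate to.
    yield 1
    yield 2

def with_yield_from():
    # Python 3.3+: delegate to the subgenerator.
    yield from numbers()

def with_loop():
    # Older spelling: re-yield each value by hand.
    for n in numbers():
        yield n

same = list(with_yield_from()) == list(with_loop())
```

The equivalence holds when the consumer only iterates. yield from also forwards send() and throw() to the subgenerator and passes its return value back, which the explicit loop does not.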

For more information on what yield from does, see PEP 380 -- Syntax for Delegating to a Subgenerator.

You don't really need the helper function, though: it does little more than loop over the glob.glob() results, which you can do directly.

You also need to correct your function to actually use indir; currently you are ignoring that argument, so you only get text files from the current working directory.

Next, use glob.iglob() instead of glob.glob(): it evaluates lazily over os.scandir() rather than loading all results into memory at once. I'd turn a non-list dir_or_dirs value into a list, then use a single loop:

import glob
import os.path

def get_files(dirs):
    if not isinstance(dirs, list):
        # make it a list with one element
        dirs = [dirs]

    for dir in dirs:
        pattern = os.path.join(dir, '*.txt')
        yield from glob.iglob(pattern)

Rather than a single argument that is either a string or a list, I'd accept a variable number of arguments, using the *args parameter syntax:

def get_files(*dirs):
    for dir in dirs:
        pattern = os.path.join(dir, '*.txt')
        yield from glob.iglob(pattern)

This can be called with 0 or more directories:

for file in get_files('/path/to/foo', '/path/to/bar'):
    ...  # process each file here
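
For a self-contained run, the sketch below exercises the *dirs version against two temporary directories. The file names and the tempfile setup are assumptions made for the demonstration; get_files is repeated so the block runs on its own.

```python
import glob
import os.path
import tempfile

def get_files(*dirs):
    # Same function as above, repeated so this example is self-contained.
    for dir in dirs:
        pattern = os.path.join(dir, '*.txt')
        yield from glob.iglob(pattern)

with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    # Create two matching files and one that the pattern should skip.
    for folder, name in [(first, "a.txt"), (first, "skip.log"), (second, "b.txt")]:
        with open(os.path.join(folder, name), "w") as handle:
            handle.write("demo")

    # Pass the directories as separate arguments.
    found = sorted(os.path.basename(p) for p in get_files(first, second))

    # An existing list can be unpacked with * at the call site.
    dir_list = [first, second]
    from_list = sorted(os.path.basename(p) for p in get_files(*dir_list))

    # Zero directories yields nothing.
    nothing = list(get_files())
```

Callers that already hold a list keep working by unpacking it with get_files(*dir_list), so the function no longer needs to inspect the argument type.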


Source: https://stackoverflow.com/questions/44686626/how-to-wrap-or-embed-generators
