Reading files in a particular order in Python

假如想象 submitted on 2019-11-27 19:42:07

Files on the filesystem are not sorted, and glob.glob() returns them in arbitrary order. You can sort the resulting filenames yourself using the sorted() function:

import glob

for infile in sorted(glob.glob('*.txt')):
    print("Current File Being Processed is: " + infile)

Note that the os.path.join call in your code is a no-op; with only one argument it doesn't do anything but return that argument unaltered.

Note that your files will sort in lexicographic (character-by-character) order, which puts 10 before 9. You can use a custom key function to improve the sorting:

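import glob
import re

numbers = re.compile(r'(\d+)')

def numericalSort(value):
    # split into alternating text and digit runs, e.g. ['file', '10', '.txt']
    parts = numbers.split(value)
    # the odd positions hold the digit runs; convert them to ints
    parts[1::2] = map(int, parts[1::2])
    return parts

for infile in sorted(glob.glob('*.txt'), key=numericalSort):
    print("Current File Being Processed is: " + infile)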

The numericalSort function splits out every run of digits in a filename, turns each run into an actual number, and returns the resulting list as the sort key:

>>> files = ['file9.txt', 'file10.txt', 'file11.txt', '32foo9.txt', '32foo10.txt']
>>> sorted(files)
['32foo10.txt', '32foo9.txt', 'file10.txt', 'file11.txt', 'file9.txt']
>>> sorted(files, key=numericalSort)
['32foo9.txt', '32foo10.txt', 'file9.txt', 'file10.txt', 'file11.txt']
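
To see why this works, it can help to look at the key itself. The following is a minimal sketch of the same idea; the name natural_key is only illustrative and is not part of the answer above. Because text always lands in the even positions and integers in the odd positions, two keys are compared text-with-text and number-with-number.

```python
import re

DIGIT_RUNS = re.compile(r"(\d+)")

def natural_key(name):
    """Return alternating text and int parts, e.g. ['file', 10, '.txt']."""
    parts = DIGIT_RUNS.split(name)
    # With a capturing group, re.split keeps the digit runs at odd indexes.
    parts[1::2] = [int(p) for p in parts[1::2]]
    return parts

print(natural_key("file10.txt"))   # ['file', 10, '.txt']
print(natural_key("32foo9.txt"))   # ['', 32, 'foo', 9, '.txt']
```

A name that starts with digits gets an empty string as its first part, so the positions stay aligned across all filenames.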

You can wrap your glob.glob( ... ) expression inside a sorted( ... ) call and sort the resulting list of files. Example:

for infile in sorted(glob.glob('*.txt')):

You can give sorted a comparison function (the cmp= argument in Python 2; Python 3 removed it and needs functools.cmp_to_key instead) or, better, use the key= ... argument to give it a custom key that is used for sorting.

Example:

There are the following files:

x/blub01.txt
x/blub02.txt
x/blub10.txt
x/blub03.txt
y/blub05.txt

The following code will produce the following output:

for filename in sorted(glob.glob('[xy]/*.txt')):
    print(filename)
# x/blub01.txt
# x/blub02.txt
# x/blub03.txt
# x/blub10.txt
# y/blub05.txt

Now with a key function:

import os

def key_func(x):
    return os.path.split(x)[-1]  # sort by the file name only, ignoring the directory
for filename in sorted(glob.glob('[xy]/*.txt'), key=key_func):
    print(filename)
# x/blub01.txt
# x/blub02.txt
# x/blub03.txt
# y/blub05.txt
# x/blub10.txt
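
For comparison, here is a rough sketch of the comparison-function route on Python 3, where it has to go through functools.cmp_to_key. The function name compare_basenames and the sample paths are made up for illustration. It gives the same order as the key-based version, with more code.

```python
import os
from functools import cmp_to_key

def compare_basenames(a, b):
    """Old-style comparison: negative, zero or positive, like cmp() in Python 2."""
    name_a, name_b = os.path.basename(a), os.path.basename(b)
    return (name_a > name_b) - (name_a < name_b)

paths = ["x/blub10.txt", "y/blub05.txt", "x/blub01.txt"]

by_cmp = sorted(paths, key=cmp_to_key(compare_basenames))
by_key = sorted(paths, key=os.path.basename)

print(by_cmp)  # ['x/blub01.txt', 'y/blub05.txt', 'x/blub10.txt']
```

The key= form is usually preferable because the key is computed once per item, while a comparison function runs for every pair that gets compared.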

EDIT: Possibly this key function can sort your files:

pat = re.compile(r"(\d+)\D*$")
...
def key_func(x):
    mat = pat.search(os.path.split(x)[-1])  # match the last group of digits
    if mat is None:
        return x
    return "{:>10}".format(mat.group(1))  # right-align to a width of 10 characters

This can certainly be improved, but I think you get the point. Paths without numbers are left alone; paths with numbers are converted to a string that is 10 characters wide and contains the number, right-aligned.
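
One possible improvement, sketched under my own assumptions rather than taken from the answer: return a tuple with the number as an int, so there is no 10-character limit, and decide explicitly where unnumbered paths go (here: after the numbered ones). The name last_number_key and the sample paths are hypothetical.

```python
import os
import re

LAST_NUMBER = re.compile(r"(\d+)\D*$")

def last_number_key(path):
    """Sort by the last run of digits in the file name; unnumbered names go last."""
    match = LAST_NUMBER.search(os.path.basename(path))
    if match is None:
        return (1, 0, path)                    # no digits: after all numbered files
    return (0, int(match.group(1)), path)      # numbered: by numeric value, then path

paths = ["x/blub10.txt", "y/blub5.txt", "x/notes.txt", "x/blub03.txt"]
ordered = sorted(paths, key=last_number_key)
print(ordered)
# ['x/blub03.txt', 'y/blub5.txt', 'x/blub10.txt', 'x/notes.txt']
```

Every key is a tuple of the same shape (int, int, str), so Python 3 never has to compare a string with a number.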

glob.glob(os.path.join( '*.txt'))

returns a list of strings, so you can easily sort the list using Python's sorted() function.

sorted(glob.glob(os.path.join( '*.txt')))

You need to change the sort from 'ASCIIBetical' to numeric by isolating the number in the filename. You can do that like so:

import re

nondigits = re.compile(r"\D")  # compiled once, outside the key function

def keyFunc(afilename):
    # strip every non-digit character and sort on the number that remains
    return int(nondigits.sub("", afilename))

filenames = ["file10.txt", "file11.txt", "file9.txt"]

for x in sorted(filenames, key=keyFunc):
    print(x)

Here you can set filenames to the result of glob.glob("*.txt").

Additionally, the keyFunc function assumes that the filename contains a number and that digits appear nowhere else in the path. You can make that function as complex as you need to isolate the number you want to sort on.
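
As a small illustration of those assumptions: a name with no digits at all would end up as int(""), which raises ValueError, and digits in a directory name would be merged into the number. The sketch below guards against both; basename_number_key and the sample paths are hypothetical, and sorting unnumbered files first is just one possible choice.

```python
import os
import re

NON_DIGITS = re.compile(r"\D")

def basename_number_key(path):
    """Digits of the file name only; names without digits sort first (-1)."""
    digits = NON_DIGITS.sub("", os.path.basename(path))
    return int(digits) if digits else -1

paths = ["run2/file10.txt", "run2/readme.txt", "run2/file9.txt"]
ordered = sorted(paths, key=basename_number_key)
print(ordered)
# ['run2/readme.txt', 'run2/file9.txt', 'run2/file10.txt']
```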

for fname in ['file9.txt', 'file10.txt', 'file11.txt']:
    with open(fname) as f:  # default open mode is for reading
        for line in f:
            pass  # do something with line
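
Putting the pieces together, here is a self-contained sketch that globs, sorts numerically and then reads the files in that order. The helper names natural_key and read_in_order are made up for this example, and the temporary directory exists only so the snippet can run anywhere.

```python
import glob
import os
import re
import tempfile

def natural_key(path):
    """Key on the file name with digit runs converted to ints."""
    parts = re.split(r"(\d+)", os.path.basename(path))
    parts[1::2] = [int(p) for p in parts[1::2]]
    return parts

def read_in_order(pattern):
    """Return all lines of the matching files, visiting files in numeric order."""
    lines = []
    for fname in sorted(glob.glob(pattern), key=natural_key):
        with open(fname) as f:
            lines.extend(line.rstrip("\n") for line in f)
    return lines

with tempfile.TemporaryDirectory() as tmp:
    # Create the files out of order on purpose.
    for n in (10, 9, 11):
        with open(os.path.join(tmp, "file%d.txt" % n), "w") as f:
            f.write("contents of file %d\n" % n)
    result = read_in_order(os.path.join(tmp, "*.txt"))

print(result)
# ['contents of file 9', 'contents of file 10', 'contents of file 11']
```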