How to use glob to read only a limited set of files?
I have json files named numbers from 50 to 20000 (e.g. 50.json,51.json,52.json...19999.json,20000.json) within the
You are using the glob syntax incorrectly; a [..] sequence matches a single character, not a numeric range. The following glob uses one bracket expression per digit and matches the five-digit names from 15000 through 18999:
'1[5-8][0-9][0-9][0-9].*'
Under the covers, glob uses fnmatch which translates the pattern to a regular expression. Your pattern translates to:
>>> import fnmatch
>>> fnmatch.translate('[15000-18000].*')
'[15000-18000]\\..*\\Z(?ms)'
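This matches exactly one character before the ., namely a 0, 1, 5 or 8, and nothing else. (The exact regular expression printed by fnmatch.translate varies between Python versions, but the character class is the same.)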
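To see the difference concretely, here is a minimal sketch that runs both patterns through fnmatch.filter. The list of names is made up for illustration, so no files need to exist:

```python
import fnmatch

# Hypothetical file names, chosen only to illustrate the matching rules.
names = ["1.json", "5.json", "8.json", "0.json", "7.json", "15000.json", "18000.json"]

# The bracket expression is a single-character set containing 0, 1, 5 and 8.
bad_matches = fnmatch.filter(names, "[15000-18000].*")

# One bracket expression per digit matches five-digit names 15000 through 18999.
good_matches = fnmatch.filter(names, "1[5-8][0-9][0-9][0-9].*")

print(bad_matches)
print(good_matches)
```

The first pattern only picks up the single-character names; the second picks up the five-digit ones.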
glob patterns are quite limited, and matching numeric ranges with them is not easy; you would have to create a separate glob for each sub-range, for example glob('1[8-9][0-9][0-9][0-9].*') + glob('2[0-9][0-9][0-9][0-9].*'), and so on.
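As a rough sketch of what "separate globs for ranges" looks like, the following covers 15000 through 18000 exactly with two patterns. It uses fnmatch.filter on a generated list of names rather than a real directory, and the range 15000 to 18000 is only an assumption taken from the pattern shown earlier:

```python
import fnmatch

# Hypothetical names standing in for a directory listing (50.json .. 20000.json).
names = [f"{n}.json" for n in range(50, 20001)]

# 15000-17999 fits one pattern; the single name 18000 needs its own.
patterns = ["1[5-7][0-9][0-9][0-9].json", "18000.json"]

matched = [
    name
    for pattern in patterns
    for name in fnmatch.filter(names, pattern)
]

print(len(matched), matched[0], matched[-1])
```

With real files, each pattern would be joined to the directory and passed to glob.glob instead. Less round boundaries (say 15250 to 17830) need noticeably more patterns, which is why filtering by number is usually simpler.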
Do your own filtering instead:
import os

directory = "/Users/Chris/Dropbox"
for filename in os.listdir(directory):
    basename, ext = os.path.splitext(filename)
    if ext != '.json':
        continue  # skip anything that is not a .json file
    try:
        number = int(basename)
    except ValueError:
        continue  # not numeric
    if 18000 <= number <= 19000:
        # process file
        filename = os.path.join(directory, filename)
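If the same filtering is needed in more than one place, it could be wrapped in a small helper. This is only a sketch: the function name numbered_json_files and its parameters are made up here, and returning the paths in numeric order is an added convenience, since os.listdir makes no promise about ordering:

```python
import os


def numbered_json_files(directory, low, high):
    """Return paths of <number>.json files with low <= number <= high, in numeric order."""
    found = []
    for filename in os.listdir(directory):
        basename, ext = os.path.splitext(filename)
        if ext != ".json":
            continue
        try:
            number = int(basename)
        except ValueError:
            continue  # not numeric
        if low <= number <= high:
            found.append((number, os.path.join(directory, filename)))
    # Sorting the (number, path) pairs orders the result by number, not by string.
    return [path for number, path in sorted(found)]
```

It would be called as, for example, numbered_json_files("/Users/Chris/Dropbox", 18000, 19000).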