Abbreviate the import of multiple files with loadtxt (Python)

Submitted by 一笑奈何 on 2019-12-05 09:45:37

Question


I want to shorten the way I import multiple files with loadtxt. Currently I do the following:

rc1    =loadtxt("20120701_Gp_xr_5m.txt", skiprows=19)
rc2    =loadtxt("20120702_Gp_xr_5m.txt", skiprows=19)
rc3    =loadtxt("20120703_Gp_xr_5m.txt", skiprows=19)
rc4    =loadtxt("20120704_Gp_xr_5m.txt", skiprows=19)
rc5    =loadtxt("20120705_Gp_xr_5m.txt", skiprows=19)
rc6    =loadtxt("20120706_Gp_xr_5m.txt", skiprows=19)
rc7    =loadtxt("20120707_Gp_xr_5m.txt", skiprows=19)
rc8    =loadtxt("20120708_Gp_xr_5m.txt", skiprows=19)
rc9    =loadtxt("20120709_Gp_xr_5m.txt", skiprows=19)
rc10   =loadtxt("20120710_Gp_xr_5m.txt", skiprows=19)

Then I concatenate them using:

GOES   =concatenate((rc1,rc2,rc3,rc4,rc5,rc6,rc7,rc8,rc9,
                     rc10),axis=0)

But my question is: can I reduce all of this? Maybe with a for loop or something like that, since the file names are a sequence of dates (strings).

I was thinking of doing something like this:

day= #### I don't know how to define a string going from 01 to 31, for example

data="201207"+day+"_Gp_xr_5m.txt"

Then do this, but I think it is not correct:

GOES=loadtxt(data, skiprows=19)

Answer 1:


Yes, you can easily get your sub-arrays with a for-loop, or with an equivalent list comprehension. Use the glob module to get the desired file names:

import numpy as np  # you probably don't need this line
from glob import glob

fnames = glob('path/to/dir')
arrays = [np.loadtxt(f, skiprows=19) for f in fnames]
final_array = np.concatenate(arrays)

If memory use becomes a problem, you can also iterate over all files line by line by chaining them and feeding that generator to np.loadtxt.
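
A rough sketch of that chaining idea follows. The helper names iter_data_lines and load_chained are made up for this example, and it assumes every file starts with the same 19 header lines as in the question. np.loadtxt accepts a generator of text lines, so the headers are dropped per file before the lines are handed over:

```python
import itertools

import numpy as np


def iter_data_lines(fnames, skiprows=19):
    """Yield the data lines of each file in turn, skipping each file's header."""
    for fname in fnames:
        with open(fname) as fh:
            # islice drops the first `skiprows` lines of this file only
            yield from itertools.islice(fh, skiprows, None)


def load_chained(fnames, skiprows=19):
    """Parse all files in one np.loadtxt call (hypothetical helper)."""
    # No skiprows here: the generator has already removed the headers.
    return np.loadtxt(iter_data_lines(fnames, skiprows))
```

With a list of file names fnames, final_array = load_chained(fnames) should give the same result as the concatenate version. This avoids holding one array per file before concatenating, although loadtxt still has to build the complete result in memory.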


edit after OP's comment

My example with glob wasn't very clear.

You can use "wildcards" * to match files, e.g. glob('*') to get a list of all files in the current directory. A part of the code above could therefore be written better as:

fnames = glob('path/to/dir/201207*_Gp_xr_5m.txt')

Or if your program already runs from the right directory:

fnames = glob('201207*_Gp_xr_5m.txt')

I forgot this earlier, but you should also sort the list of filenames, because the list of filenames from glob is not guaranteed to be sorted.

fnames.sort()
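
Putting the glob, sort and concatenate steps together might look like the sketch below. The function name load_matching is hypothetical, and passing ndmin=2 is an added assumption so that a file with a single data row still concatenates cleanly:

```python
from glob import glob

import numpy as np


def load_matching(pattern, skiprows=19):
    """Load every file matching `pattern`, in sorted name order, into one array."""
    fnames = sorted(glob(pattern))  # glob gives no ordering guarantee
    if not fnames:
        raise FileNotFoundError("no files match {!r}".format(pattern))
    # ndmin=2 keeps each per-file array two-dimensional
    arrays = [np.loadtxt(f, skiprows=skiprows, ndmin=2) for f in fnames]
    return np.concatenate(arrays, axis=0)
```

For the files in the question this would be called as GOES = load_matching('201207*_Gp_xr_5m.txt'). Because the day is zero-padded in the file names, sorting by name also sorts by date.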

A slightly different approach, closer to what you had in mind, is the following. When the variable day contains the day number, you can put it in the filename like so:

daystr = str(day).zfill(2)
fname = '201207' + daystr + '_Gp_xr_5m.txt'

Or using a clever format specifier:

fname = '201207{:02}_Gp_xr_5m.txt'.format(day)

Or the "old" way:

fname = '201207%02i_Gp_xr_5m.txt' % day

Then simply use this in a for-loop:

arrays = []
for day in range(1, 32):
    daystr = str(day).zfill(2)
    fname = '201207' + daystr + '_Gp_xr_5m.txt'
    a = np.loadtxt(fname, skiprows=19)
    arrays.append(a)

final_array = np.concatenate(arrays)
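
Note that the loop above expects a file for every day from 01 to 31, and np.loadtxt raises an error at the first one that is missing. One possible variant that simply skips missing days is sketched here. The function load_days and its parameters are hypothetical names chosen for this example:

```python
import os

import numpy as np


def load_days(year_month, days, suffix="_Gp_xr_5m.txt", skiprows=19, directory="."):
    """Load the files for the given days, skipping days with no file."""
    arrays = []
    for day in days:
        # e.g. year_month='201207', day=3 -> '20120703_Gp_xr_5m.txt'
        fname = os.path.join(directory, "{}{:02d}{}".format(year_month, day, suffix))
        if not os.path.exists(fname):
            continue  # no data for this day
        arrays.append(np.loadtxt(fname, skiprows=skiprows, ndmin=2))
    if not arrays:
        raise FileNotFoundError("no files found for " + year_month)
    return np.concatenate(arrays, axis=0)
```

A call such as GOES = load_days('201207', range(1, 32)) would then load whichever days of July 2012 are present in the current directory.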


Source: https://stackoverflow.com/questions/22431921/abbreviate-the-import-of-multiple-files-with-loadtxt-python
