Put csv files in separate pandas dataframes depending on filename [duplicate]

旧城冷巷雨未停 submitted on 2019-12-24 23:26:10

Question


I have a list of file-name prefixes. I want to scan a directory, read every file whose name starts with one of the prefixes, and store the contents in a DataFrame.

E.g.:

list1 = ['abc', 'bcd', 'def']

Directory:

abc1.txt   
abc2.txt
abc3.txt

bcd1.txt
bcd2.txt
bcd3.txt

Files starting with 'abc' should end up in one pandas DataFrame, files starting with 'bcd' in another, and so on.

My code:

dfs = []
for exp in expnames:
    for files in filenames:
        if files.startswith(exp):
            dfs.append(pd.read_csv(file_path+files, sep=',', header=None))
big_frame = pd.concat(dfs, ignore_index=True)

Answer 1:


I'm assuming you have a directory where there could be several other files besides the ones you want to read.

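import os
import pandas as pd

# dirname (directory to scan) and list1 (prefixes) come from the question
dfs = {}

for f in os.listdir(dirname):
    for k in list1:
        if f.startswith(k):
            df = pd.read_csv(os.path.join(dirname, f), sep=',', header=None)
            if k in dfs:
                # pd.concat returns a new frame, so assign the result back
                dfs[k] = pd.concat([dfs[k], df], ignore_index=True)
            else:
                dfs[k] = df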
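
A variant of the same idea, as a minimal sketch: collect the frames for each prefix in a list and call pd.concat once per prefix at the end, which avoids re-copying the growing frame for every file. The helper name load_by_prefix and its parameters are hypothetical (not part of pandas), and the sketch assumes comma-separated files without a header row, as in the question.

```python
import os

import pandas as pd


def load_by_prefix(dirname, prefixes):
    """Return {prefix: DataFrame} for files in dirname that start with a prefix.

    Hypothetical helper for illustration; not part of pandas.
    """
    frames = {}
    for name in sorted(os.listdir(dirname)):
        for prefix in prefixes:
            if name.startswith(prefix):
                path = os.path.join(dirname, name)
                df = pd.read_csv(path, sep=',', header=None)
                frames.setdefault(prefix, []).append(df)
    # Concatenate once per prefix; prefixes with no matching file are left out
    return {p: pd.concat(dfs, ignore_index=True) for p, dfs in frames.items()}
```

Note that a file whose name matches more than one prefix (say, prefixes 'ab' and 'abc') ends up in both DataFrames, and files that match no prefix are ignored.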



Answer 2:


This creates a dictionary of DataFrames in which each DataFrame combines all files whose names start with the same three-letter "expression" (abc, def, etc.). The dictionary keys are those same three letters:

import pandas as pd

# Some dummy data
filenames = ['abcdefghijkl.txt', 'abcdef.txt', 'defghijk.txt']

# List of three-letter prefixes to group by
exps = ['abc', 'def', 'ghi', 'jkl']

dataframes = {}
for filename in filenames:
    _df = pd.read_csv(filename)

    # Raises ValueError if the first three letters are not in exps
    key = exps[exps.index(filename[:3])]

    try:
        dataframes[key] = pd.concat([dataframes[key], _df], ignore_index=True)
    except KeyError:
        # First file seen for this key
        dataframes[key] = _df



print(dataframes['abc'])

    a   b   c
0   7   8   9
1  10  11  12
2   1   2   3
3   4   5   6

print(dataframes['def'])
    a   b   c
0   7   8   9
1  10  11  12

The contents of the files above are:

abcdefghijkl.txt

a,b,c
7,8,9
10,11,12

abcdef.txt

a,b,c
1,2,3
4,5,6

defghijk.txt

a,b,c
7,8,9
10,11,12
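
To try the grouping without creating files on disk, the sample contents can be fed to pd.read_csv through io.StringIO. This is only a sketch under a few assumptions: the files dictionary stands in for real files, 'xyz.txt' is a hypothetical extra entry added to show what happens when a name matches none of the keys, and such names are skipped here rather than raising ValueError.

```python
import io
from collections import defaultdict

import pandas as pd

# In-memory stand-ins for the files on disk
files = {
    'abcdefghijkl.txt': 'a,b,c\n7,8,9\n10,11,12\n',
    'abcdef.txt': 'a,b,c\n1,2,3\n4,5,6\n',
    'defghijk.txt': 'a,b,c\n7,8,9\n10,11,12\n',
    'xyz.txt': 'a,b,c\n0,0,0\n',  # hypothetical file with no matching key
}
exps = ['abc', 'def', 'ghi', 'jkl']

grouped = defaultdict(list)
for filename, text in files.items():
    key = filename[:3]
    if key not in exps:
        continue  # skip unmatched names instead of raising ValueError
    grouped[key].append(pd.read_csv(io.StringIO(text)))

# One concat per key, in the order the files were listed
dataframes = {k: pd.concat(v, ignore_index=True) for k, v in grouped.items()}
```

Keys such as 'ghi' and 'jkl' that match no file simply do not appear in dataframes.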


Source: https://stackoverflow.com/questions/53118137/put-csv-files-in-separate-pandas-dataframes-depending-on-filename
