How to Read multiple files in Python for Pandas separate dataframes

Submitted by 自作多情 on 2021-02-07 10:19:22

Question


I am trying to read 7 files into 7 different data frames, but I cannot figure out how to do that. The file names can be completely arbitrary; that is, I know the files, but they do not follow a pattern like data1.csv, data2.csv.

I tried using something like this:

import sys
import os
import numpy as np
import pandas as pd
from datetime import datetime, timedelta
f1='Norway.csv'
f2='Canada.csv'
f3='Chile.csv'

Norway = pd.read_csv(Norway.csv)
Canada = pd.read_csv(Canada.csv)
Chile = pd.read_csv(Chile.csv )

I need to read multiple files into different dataframes. It works fine when I do it with one file, like this:

file='Norway.csv'
Norway = pd.read_csv(file)

And I am getting this error:

NameError: name 'Norway' is not defined

Answer 1:


You can read all the .csv files into one single dataframe.

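import glob
import pandas as pd

all_files = glob.glob('*.csv')  # assumption: the CSV files are in the current directory
dfs = []
for file_ in all_files:
    df = pd.read_csv(file_, index_col=None, header=0)
    dfs.append(df)

# concatenate all dfs into one
big_df = pd.concat(dfs, ignore_index=True)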

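Then split the large dataframe into multiple dataframes (in your case 7). Note that np.array_split divides the rows into roughly equal chunks, so the pieces do not necessarily line up with the original files. For example: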

import numpy as np
num_chunks = 3  
df1,df2,df3 = np.array_split(big_df,num_chunks)
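
If the pieces need to match the original files exactly, one option is to label each row with its source file while concatenating and then group by that label. The following is only a minimal sketch: the temporary folder, the sample CSV contents, and the names `tmp_dir`, `paths`, and `per_file` are made up for illustration and are not part of the question.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample files so the sketch runs on its own.
tmp_dir = Path(tempfile.mkdtemp())
(tmp_dir / "Norway.csv").write_text("year,value\n2019,1\n2020,2\n")
(tmp_dir / "Canada.csv").write_text("year,value\n2019,3\n")

# Read every CSV and label its rows with the file name (without extension).
paths = sorted(tmp_dir.glob("*.csv"))
big_df = pd.concat(
    [pd.read_csv(p) for p in paths],
    keys=[p.stem for p in paths],
    names=["source", "row"],
)

# Split the combined dataframe back into one dataframe per source file.
per_file = {
    name: group.droplevel("source").reset_index(drop=True)
    for name, group in big_df.groupby(level="source")
}
```

Here `per_file["Norway"]` holds only the rows that came from Norway.csv, regardless of how many rows each file has.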

Hope this helps.




Answer 2:


After googling for a while, I decided to combine answers from several other questions into a solution for this one. It will not work for every possible case, so you may have to tweak it to fit yours.

Check out the solution to this question:

# import libraries
import pandas as pd
import numpy as np
import glob
import os
# Declare a function for extracting a string between two characters
def find_between( s, first, last ):
    try:
        start = s.index( first ) + len( first )
        end = s.index( last, start )
        return s[start:end]
    except ValueError:
        return ""
path = '/path/to/folder/containing/your/data/sets' # use your path
all_files = glob.glob(path + "/*.csv")
list_of_dfs = [pd.read_csv(filename, encoding = "ISO-8859-1") for filename in all_files]
list_of_filenames = [find_between(filename, 'sets/', '.csv') for filename in all_files] # sets is the last word in your path
# Create a dictionary with table names as the keys and data frames as the values
dfnames_and_dfvalues = dict(zip(list_of_filenames, list_of_dfs))
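
The same dictionary can probably be built without the string-slicing helper by letting `pathlib` extract the file name, which also avoids hard-coding the last folder name and the path separator. This is a sketch under assumptions: `data_dir`, the sample files, and `frames` are hypothetical stand-ins for your own folder and data.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical folder with sample files so the sketch runs on its own;
# replace data_dir with the folder that holds your CSV files.
data_dir = Path(tempfile.mkdtemp())
(data_dir / "Norway.csv").write_text("year,value\n2019,1\n")
(data_dir / "Chile.csv").write_text("year,value\n2019,5\n2020,6\n")

# Map each file name (without the .csv extension) to its dataframe.
frames = {path.stem: pd.read_csv(path) for path in data_dir.glob("*.csv")}

# Look up an individual dataframe by the name of its file.
norway = frames["Norway"]
```

Keeping the dataframes in a dictionary means the file names can be arbitrary, and you do not need one hand-written variable per file.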


Source: https://stackoverflow.com/questions/54766821/how-to-read-multiple-files-in-python-for-pandas-separate-dataframes
