Question

inp_file=os.getcwd() 
files_comp = pd.read_csv(inp_file,"B00234*.csv", na_values = missing_values, nrows=10)

for f in files_comp:

    df_calculated = pd.read_csv(f, na_values = missing_values, nrows=10)
    col_length=len(df.columns)-1

Hi folks, how can I read 4 CSV files in a for loop? I am getting an error while reading the CSV files in the format above. Kindly help me.

Answer 1:

You basically need two steps:

Get a list of all target files: call os.listdir(path), then keep only the filenames that start with your pattern and end with .csv. For more sophisticated matching you could use a regular expression (the re module) or glob.glob.

filesnames = os.listdir(path)
filesnames = [f for f in filesnames if (f.startswith("B00234") and f.lower().endswith(".csv"))]
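The regular-expression variant mentioned above could look like the following minimal sketch. The helper name filter_target_files and the pattern are illustrative assumptions, not part of the original answer; the pattern is written to mirror the startswith/endswith check (case-sensitive prefix, case-insensitive extension).

```python
import re

# Hypothetical pattern: the name starts with "B00234" and ends in ".csv",
# with the extension matched in any letter case.
TARGET_PATTERN = re.compile(r"B00234.*\.[Cc][Ss][Vv]")


def filter_target_files(names):
    """Return only the names that match TARGET_PATTERN in full."""
    return [name for name in names if TARGET_PATTERN.fullmatch(name)]


example_names = ["B00234_0.csv", "B00234_0.txt", "B00678_5.csv", "B00234_1.CSV"]
print(filter_target_files(example_names))
# ['B00234_0.csv', 'B00234_1.CSV']
```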

Read in files using a for loop:

dfs = list()
for filename in filesnames:
    # os.listdir returns bare names, so join each one with the folder path
    df = pd.read_csv(os.path.join(path, filename))
    dfs.append(df)
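As an alternative to filtering os.listdir by hand, the glob.glob route mentioned earlier can be sketched as below. The function name read_matching_csvs and its parameters are hypothetical; extra keyword arguments such as nrows or na_values from the question are simply forwarded to pd.read_csv.

```python
import glob
import os

import pandas as pd


def read_matching_csvs(folder, pattern="B00234*.csv", **read_csv_kwargs):
    """Read every file in `folder` whose name matches `pattern`.

    Hypothetical helper: returns a list of DataFrames, one per file.
    """
    # glob.glob returns paths that already include `folder`, so they can
    # be passed straight to pd.read_csv. Sorting makes the order stable.
    paths = sorted(glob.glob(os.path.join(folder, pattern)))
    return [pd.read_csv(p, **read_csv_kwargs) for p in paths]
```

For example, read_matching_csvs(path, nrows=10) would read at most ten rows from each matching file. Note that glob matching is case-sensitive on most non-Windows systems, so a file ending in .CSV would need its own pattern.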

Complete Example

We will first make some dummy data and write it to a set of .csv and .txt files. Some of these files have names that begin with "B00234" and others do not. We then read only the .csv files that begin with "B00234" into a list of dataframes, dfs.

import os
import shutil
import numpy as np
import pandas as pd
from IPython.display import display

# Define Temporary Output Folder
path = './temp_output'

# Clean Temporary Output Folder
import shutil
reset = True
if os.path.exists(path) and reset:
    shutil.rmtree(path, ignore_errors=True)

# Create Content
df0 = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
                   columns=['a', 'b', 'c'])

display(df0)

# Make Path
import os
if not os.path.exists(path):
    os.makedirs(path)
else:
    print('Path Exists: {}'.format(path))

# Make Filenames
filenames = list()
for i in range(10):
    if i<5:
        # Create Files starting with "B00234"
        filenames.append("B00234_{}.csv".format(i))
        filenames.append("B00234_{}.txt".format(i))
    else:
        # Create Files starting with "B00678"
        filenames.append("B00678_{}.csv".format(i))
        filenames.append("B00678_{}.txt".format(i))

# Create files
# Make files with extensions: .csv and .txt
#            and file names starting 
#            with and without: "B00234"
for filename in filenames:
    fpath = path + '/' + filename
    if filename.lower().endswith(".csv"):
        df0.to_csv(fpath, index=False)
    else:
        with open(fpath, 'w') as f:
            f.write(df0.to_string())

# Get list of target files
files = os.listdir(path)
files = [f for f in files if (f.startswith("B00234") and f.lower().endswith(".csv"))]
print('\nList of target files: \n\t{}\n'.format(files))

# Read each csv file into a dataframe
dfs = list() # a list of dataframes
for csvfile in files:
    fpath = path + '/' + csvfile
    print("Reading file: {}".format(csvfile))
    df = pd.read_csv(fpath)
    dfs.append(df)

The list dfs should have five elements, each a dataframe read from one of the target files.

Output:

    a   b   c
0   1   2   3
1   4   5   6
2   7   8   9

List of target files: 
    ['B00234_3.csv', 'B00234_4.csv', 'B00234_0.csv', 'B00234_2.csv', 'B00234_1.csv']

Reading file: B00234_3.csv
Reading file: B00234_4.csv
Reading file: B00234_0.csv
Reading file: B00234_2.csv
Reading file: B00234_1.csv
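
The files above are not read in numeric order because os.listdir returns entries in arbitrary order. If a deterministic order matters, one option is to sort the names before the read loop. A minimal sketch, using the list copied from the output above (the name files_sorted is an assumption for illustration):

```python
# File names copied from the output above, in the order os.listdir returned them.
files = ['B00234_3.csv', 'B00234_4.csv', 'B00234_0.csv',
         'B00234_2.csv', 'B00234_1.csv']

# sorted() orders the names lexicographically, which matches numeric
# order here only because every index is a single digit.
files_sorted = sorted(files)
print(files_sorted)
# ['B00234_0.csv', 'B00234_1.csv', 'B00234_2.csv', 'B00234_3.csv', 'B00234_4.csv']
```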

Source: https://stackoverflow.com/questions/58071982/read-csv-in-a-for-loop-using-pandas

Tags

python

pandas

data-analysis
