Modifying multiple .csv files from same directory in python

血红的双手。 Posted on 2021-02-10 19:07:17

Question


I need to modify multiple .csv files in my directory. Is it possible to do it with a simple script? My .csv columns are in this order:

 X_center,Y_center,X_Area,Y_Area,Classification

I would like to change them to this order:

 Classification,X_center,Y_center,X_Area,Y_Area

So far I managed to write:

import os
import csv

for file in os.listdir("."):
    if file.endswith(".csv"):
        with open('*.csv', 'r') as infile, open('reordered.csv', 'a') as outfile:
            fieldnames = ['Classification','X_center','Y_center','X_Area','Y_Area']
            writer = csv.DictWriter(outfile, fieldnames=fieldnames)
            writer.writeheader()
            for row in csv.DictReader(infile):
                writer.writerow(row)
        csv_file.close()

But it changes every row to Classification,X_center,Y_center,X_Area,Y_Area (replaces values in every row). Is it possible to open a file, re-order the columns and save the file under the same name? I checked similar solutions that were given on other threads but no luck. Thanks for the help!


Answer 1:


First off, I think your problem lies in opening '*.csv' in the loop instead of opening file: open() does not expand wildcards, so it looks for a file literally named *.csv. Beyond that, I would recommend never overwriting your original input files. It is much safer to write copies to a new directory. Here's a modified version of your script which does that.

import argparse
import csv
import os
import sys

ap = argparse.ArgumentParser()
ap.add_argument("-i", "--input", required=True)
ap.add_argument("-o", "--output", required=True)
args = vars(ap.parse_args())


# The output directory must already exist; refuse to continue otherwise.
if os.path.isdir(args["output"]):
    print("Writing to {}".format(args["output"]))
else:
    print("Cannot write to directory {}".format(args["output"]))
    sys.exit(1)

fieldnames = ['Classification','X_center','Y_center','X_Area','Y_Area']

for file in os.listdir(args["input"]):
    if file.endswith(".csv"):
        print("{} ...".format(file))
        in_path = os.path.join(args["input"], file)
        out_path = os.path.join(args["output"], file)
        # newline='' lets the csv module handle line endings itself.
        with open(in_path, 'r', newline='') as infile, open(out_path, 'w', newline='') as outfile:
            writer = csv.DictWriter(outfile, fieldnames=fieldnames)
            writer.writeheader()
            # DictWriter emits each row's values in fieldnames order.
            for row in csv.DictReader(infile):
                writer.writerow(row)
        # No explicit close() is needed: the with block closes both files.

To use it, create a new directory for your outputs and then run like so:

python this.py -i input_dir -o output_dir

Note: From your question you seemed to want each file to be modified in place so this does basically that (outputs a file of the same name, just in a different directory) but leaves your inputs unharmed. If you actually wanted all the files reordered into a single file as your code open('reordered.csv', 'a') implies, you could easily do that by moving the output initialization code so it is executed before entering the loop.
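If a single combined file is what you are after, the change described in the note might look roughly like the sketch below. The function name merge_reordered is an illustrative assumption and not part of the original script. The output file is opened once, before the loop, so the header is written only once.

```python
import csv
import glob
import os

FIELDNAMES = ['Classification', 'X_center', 'Y_center', 'X_Area', 'Y_Area']


def merge_reordered(input_dir, output_path, fieldnames=FIELDNAMES):
    """Write the reordered rows of every .csv in input_dir to one output file.

    Hypothetical helper for illustration; returns the number of data rows written.
    """
    out_abs = os.path.abspath(output_path)
    # glob expands the wildcard; open() itself never does.
    paths = sorted(glob.glob(os.path.join(input_dir, '*.csv')))
    rows_written = 0
    # Open the output once, before the loop, so the header appears only once.
    with open(output_path, 'w', newline='') as outfile:
        writer = csv.DictWriter(outfile, fieldnames=fieldnames)
        writer.writeheader()
        for path in paths:
            if os.path.abspath(path) == out_abs:
                continue  # do not read the merged file back into itself
            with open(path, 'r', newline='') as infile:
                for row in csv.DictReader(infile):
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Calling merge_reordered("input_dir", "reordered.csv") would then replace the per-file output of the script above; the directory and file names here are placeholders.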




Answer 2:


Using pandas & pathlib.

from pathlib import Path  # available in Python 3.4+
import pandas as pd

csv_dir = r'c:\path\to\csvs'  # raw string for Windows paths
csv_files = list(Path(csv_dir).glob('*.csv'))  # finds all CSVs in your folder


cols = ['Classification','X_center','Y_center','X_Area','Y_Area']

for csv_path in csv_files:  # iterate over the list
    df = pd.read_csv(csv_path)  # read the CSV
    # csv_path.name is the bare file name, so this writes into the current
    # working directory; pass csv_path instead to overwrite the original.
    df[cols].to_csv(csv_path.name, index=False)
    print(f'{csv_path.name} saved.')

Naturally, if there is a CSV without those columns, this code will fail with a KeyError; you can add a try/except if that's the case.
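As an alternative to try/except, one possible guard is to inspect each file's header before reordering it. The sketch below uses only the standard csv module rather than pandas, and the helper name missing_columns is a hypothetical one chosen for illustration.

```python
import csv

REQUIRED = ['Classification', 'X_center', 'Y_center', 'X_Area', 'Y_Area']


def missing_columns(csv_path, required=REQUIRED):
    """Return the required column names that are absent from the file's header.

    Hypothetical helper for illustration; an empty list means every column is present.
    """
    with open(csv_path, 'r', newline='') as infile:
        header = next(csv.reader(infile), [])  # [] when the file is empty
    return [name for name in required if name not in header]
```

Inside the loop, a file could be skipped whenever missing_columns returns a non-empty list. Wrapping the column selection in try/except KeyError, as suggested above, achieves the same thing after the file has already been read.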



Source: https://stackoverflow.com/questions/59292999/modifying-multiple-csv-files-from-same-directory-in-python
