Appending identical CSVs together while removing headers

Submitted by 血红的双手 on 2020-01-04 19:47:09

Question


I want to append 6 CSV files that have identical layouts and headers.

I've been able to accomplish this by loading each of the 6 CSVs into its own separate DataTable and removing the first row of each DataTable. Finally, I've appended them together using the ImportRow method.

DataTable table1 = csvToDataTable(@"C:\Program Files\Normalization\Scan1.csv");
DataTable table2 = csvToDataTable(@"C:\Program Files\Normalization\Scan2.csv");
DataTable table3 = csvToDataTable(@"C:\Program Files\Normalization\Scan3.csv");
DataTable table4 = csvToDataTable(@"C:\Program Files\Normalization\Scan4.csv");
DataTable table5 = csvToDataTable(@"C:\Program Files\Normalization\Scan5.csv");
DataTable table6 = csvToDataTable(@"C:\Program Files\Normalization\Scan6.csv");

foreach (DataRow dr in table2.Rows)
{
    table1.ImportRow(dr);
}
foreach (DataRow dr in table3.Rows)
{
    table1.ImportRow(dr);
}
foreach (DataRow dr in table4.Rows)
{
    table1.ImportRow(dr);
}
foreach (DataRow dr in table5.Rows)
{
    table1.ImportRow(dr);
}
foreach (DataRow dr in table6.Rows)
{
    table1.ImportRow(dr);
}

CreateCSVFile(table1, @"C:\Program Files\Normalization\RackMap.csv");

I feel this is clunky and not very scalable, but I had trouble dealing with the headers when I tried to append at the CSV level. Any suggestions?

TIA


Answer 1:


Use a DirectoryInfo to get all files matching the mask *.csv.

Create a for loop to iterate over the results.

Drop the first row when importing each file.
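
The three steps above might look roughly like this when staying with DataTables. This is only a sketch: CsvMerger and MergeAll are hypothetical names, and loadCsv stands in for the question's csvToDataTable helper, which is not shown. It assumes, as the question describes, that the helper returns the header as the first row of each table. The files are sorted by name because GetFiles does not guarantee an order.

```csharp
using System;
using System.Data;
using System.IO;

public static class CsvMerger
{
    // loadCsv is a stand-in for the question's csvToDataTable helper
    // (assumption: it returns the header line as row 0 of each table).
    public static DataTable MergeAll(string directory, Func<string, DataTable> loadCsv)
    {
        FileInfo[] csvFiles = new DirectoryInfo(directory).GetFiles("*.csv");
        // Sort by name so the merge order is predictable
        Array.Sort(csvFiles, (a, b) => string.CompareOrdinal(a.Name, b.Name));

        DataTable merged = null;
        for (int i = 0; i < csvFiles.Length; i++)
        {
            DataTable current = loadCsv(csvFiles[i].FullName);

            // Drop the first row (the header) of every file
            if (current.Rows.Count > 0)
            {
                current.Rows.RemoveAt(0);
            }

            if (merged == null)
            {
                merged = current; // the first file supplies the columns
                continue;
            }

            foreach (DataRow dr in current.Rows)
            {
                merged.ImportRow(dr); // same call the question already uses
            }
        }
        return merged ?? new DataTable();
    }
}
```

With the question's helper this would be called as something like CsvMerger.MergeAll(@"C:\Program Files\Normalization", csvToDataTable), and the result passed to CreateCSVFile. If the output file lives in the same folder and ends in .csv, it would need to be excluded from the loop.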

EDIT:

If you just want to combine the files, rather than import into a data table, you could treat them as text files. Concatenate them, dropping the header line each time. Here is an example:

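// Requires: using System; using System.IO; using System.Text;
string myPath = @"K:\csv";
string outputPath = Path.Combine(myPath, "output.csv");

DirectoryInfo csvDirectory = new DirectoryInfo(myPath);
FileInfo[] csvFiles = csvDirectory.GetFiles("*.csv");
StringBuilder sb = new StringBuilder();
foreach (FileInfo csvFile in csvFiles)
{
    // Skip an output file left over from a previous run
    if (csvFile.FullName.Equals(outputPath, StringComparison.OrdinalIgnoreCase))
        continue;

    using (StreamReader sr = new StreamReader(csvFile.OpenRead()))
    {
        sr.ReadLine(); // Discard header line
        while (!sr.EndOfStream)
            sb.AppendLine(sr.ReadLine());
    }
}
// Overwrite rather than append, so re-running does not duplicate rows
File.WriteAllText(outputPath, sb.ToString());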



Answer 2:


As JYelton suggested, you'll definitely want to dynamically find all the *.csv files in your folder, and iterate over them (rather than hardcoding 6 filenames). From that point you might consider an approach like this:

  1. Create a writable filestream for your "destination" file.
  2. For each .CSV file, open a readable filestream on it.
  3. Discard each file's header row by reading up to and including the first CRLF, and throwing that data away.
  4. Read all the remaining data into your writable stream.
  5. Repeat steps 2-4 for each CSV file.
  6. Close your writable stream to save the completed file.

This approach will accommodate an arbitrary number of CSV files, and is probably more efficient than working with DataTables.

Note: for the sake of brevity and clarity, I've left out some edge-case handling you'll need to do, such as how to handle an empty CSV file, one that contains a header row and nothing else, or one that has no trailing CRLF after its final row. Aren't implementation details and edge-case handling fun? ;)
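
The numbered steps above could be sketched along these lines. This is one possible reading rather than a definitive implementation: CsvConcatenator and Concatenate are hypothetical names, and the code assumes each header occupies exactly one physical line. Reading line by line with StreamReader.ReadLine covers the three edge cases mentioned in the note: an empty file yields null on the first read, a header-only file yields no data lines, and a final row without a trailing line break is still returned and then written with one.

```csharp
using System;
using System.IO;

public static class CsvConcatenator
{
    // Copies every data line from each *.csv file in sourceDirectory into
    // destinationPath, skipping the first line (the header) of each file.
    // Returns the number of data lines written.
    public static int Concatenate(string sourceDirectory, string destinationPath)
    {
        string destinationFull = Path.GetFullPath(destinationPath);
        string[] files = Directory.GetFiles(sourceDirectory, "*.csv");
        // GetFiles does not guarantee an order, so sort for predictable output
        Array.Sort(files, StringComparer.OrdinalIgnoreCase);

        int linesWritten = 0;
        // Steps 1 and 6: the using block opens and finally closes the destination
        using (StreamWriter writer = new StreamWriter(destinationFull, false))
        {
            foreach (string file in files)
            {
                // Never read the destination file as one of the inputs
                if (string.Equals(Path.GetFullPath(file), destinationFull,
                                  StringComparison.OrdinalIgnoreCase))
                {
                    continue;
                }

                // Step 2: open a reader on the source file
                using (StreamReader reader = new StreamReader(file))
                {
                    // Step 3: discard the header; null means the file is empty
                    if (reader.ReadLine() == null)
                    {
                        continue;
                    }

                    // Step 4: copy the remaining lines to the destination
                    string line;
                    while ((line = reader.ReadLine()) != null)
                    {
                        writer.WriteLine(line);
                        linesWritten++;
                    }
                }
            }
        }
        return linesWritten;
    }
}
```

Because nothing is buffered beyond the current line, memory use stays flat regardless of file size. Note that the output contains no header row at all; if one is wanted, it would have to be written before the loop.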




Answer 3:


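If you want to avoid repeating identical rows, you can keep a set of keys for the rows already imported and, inside the loop, check whether the set already contains the current row's key. Note that DataRow does not override GetHashCode, so dr.GetHashCode() differs between two row objects even when their values are identical; build the key from the row's values instead.

    // Requires: using System.Collections.Generic; using System.Data;
    HashSet<string> seenRows = new HashSet<string>();
    foreach (DataRow dr in table2.Rows)
    {
        // Key built from the row's values; "\u001F" (unit separator) is
        // assumed not to occur in the data
        string key = string.Join("\u001F", dr.ItemArray);
        if (seenRows.Add(key))
        {
            // First time these values have been seen
            table1.ImportRow(dr);
        }
        // Otherwise we already have this row, so skip it
    }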

This may not be the ideal approach from a performance point of view, but I hope it solves your problem.



Source: https://stackoverflow.com/questions/6653624/appending-identical-csvs-together-while-removing-headers
