Is there a “proper” way to read CSV files [duplicate]

只谈情不闲聊 submitted on 2019-11-27 11:50:38

CsvReader is a pretty good one... it isn't from Microsoft, but it works very well, and is a lot faster than some of the alternatives (legacy OleDb, etc.).

One of the reasons that many people write their own is that CSV isn't quite so simple. For example:

  1. Does the first row contain field names, or not?
  2. Do you support dates? If so, are they quoted, surrounded by # marks, in a certain day-month-year order?
  3. Does it support linefeeds that occur inside quoted text values? Or does that split the record?
  4. How do you escape a quote inside of a quoted string? Do you double the quote, or use a backslash or other escape character?
  5. What character encoding(s) are supported?
  6. How does it handle escaped control characters? &#XX; or \uXXXX or some other method?

These are some of the reasons people write their own parsers, because they're stuck reading files created with all these different settings. Or they write their own serializers, because the target system has a bunch of these idiosyncrasies.

If you don't care about these issues, just use the most convenient library. But understand they are there.
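To make points 3 and 4 concrete, here is a minimal sketch of a hand-rolled parser. It assumes one particular set of answers to the questions above: comma delimiter, fields optionally wrapped in double quotes, and a literal quote written as a doubled quote. The class and method names are made up for illustration, and a file written with different conventions (backslash escapes, for example) would need different code.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, for illustration only; not taken from any library.
public static class CsvSketch
{
    // Splits CSV text into records, assuming a comma delimiter, optional
    // double-quote wrapping, and "" as the escape for a literal quote.
    public static List<List<string>> Parse(string text)
    {
        var records = new List<List<string>>();
        var record = new List<string>();
        var field = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < text.Length && text[i + 1] == '"')
                {
                    field.Append('"'); // doubled quote -> one literal quote
                    i++;
                }
                else if (c == '"')
                {
                    inQuotes = false; // closing quote
                }
                else
                {
                    field.Append(c); // commas and line breaks are kept as data
                }
            }
            else if (c == '"')
            {
                inQuotes = true; // opening quote
            }
            else if (c == ',')
            {
                record.Add(field.ToString());
                field.Clear();
            }
            else if (c == '\n')
            {
                // An unquoted line feed ends the record.
                record.Add(field.ToString());
                field.Clear();
                records.Add(record);
                record = new List<string>();
            }
            else if (c != '\r')
            {
                field.Append(c); // unquoted carriage returns are dropped
            }
        }

        // Flush the last record when the text does not end with a line break.
        if (field.Length > 0 || record.Count > 0)
        {
            record.Add(field.ToString());
            records.Add(record);
        }
        return records;
    }
}
```

Even this leaves out headers, dates, encodings and malformed input, which is a fair picture of how quickly "just split on commas" grows.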

aSkywalker

The VB namespace has a great TextFieldParser class. I know, C# people don't like to reference a library from that 'basic' language, but it is quite good.

It is located at Microsoft.VisualBasic.FileIO.TextFieldParser

I used to mess with OLEDB, creating column definition files etc - but find the TextFieldParser a very simple and handy tool for parsing any delimited files.
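For reference, a minimal sketch of using it from C#. The wrapper class and method names are made up for illustration; on .NET Framework this assumes the project has a reference to the Microsoft.VisualBasic assembly.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical wrapper, for illustration only.
public static class TextFieldParserSketch
{
    // Reads every row of comma-delimited text into a list of string arrays.
    public static List<string[]> ReadAll(TextReader reader)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true; // "a,b" stays one field
            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

Calling it with something like `TextFieldParserSketch.ReadAll(File.OpenText("data.csv"))` (the file name is a placeholder) gives one string array per row; the header row, if there is one, is simply the first array.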

Try CsvHelper (a library I maintain). It's also available via NuGet.

CsvHelper allows you to read your CSV file directly into your custom class.

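// "data.csv" is a placeholder path. ToList() needs using System.Linq.
var streamReader = new StreamReader( "data.csv" ); // Create a reader to your CSV file.
// Recent CsvHelper versions require a CultureInfo; older ones took only the reader.
var csvReader = new CsvReader( streamReader, CultureInfo.InvariantCulture );
List<MyCustomType> myData = csvReader.GetRecords<MyCustomType>().ToList();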

CsvReader will automatically figure out how to match the property names based on the header row (this is configurable). It uses compiled expression trees instead of reflection, so it's very fast.

It is also very extensible and configurable.
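A slightly fuller sketch of the header matching, assuming a recent CsvHelper version (where the CsvReader constructor takes a CultureInfo) installed from NuGet. The Person type, its properties and the wrapper class are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;

// Hypothetical record type; the property names match a header row
// such as "Name,Age".
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

// Hypothetical wrapper, for illustration only.
public static class CsvHelperSketch
{
    public static List<Person> Load(TextReader reader)
    {
        using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
        {
            // GetRecords is lazy, so materialize the rows before the
            // reader is disposed.
            return csv.GetRecords<Person>().ToList();
        }
    }
}
```

Given text whose first line is `Name,Age`, each following line becomes one Person, with the Age column converted to an int.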

After some more investigation, there is also this: http://www.filehelpers.com/

It seems to be a full framework for reading files, and not just CSV files.

(note: just read stuff on the website, have not used it yet)

Kent Boogaart

KBCsv is another option, particularly if you require efficiency and the ability to work with massive CSV files.

Disclosure: I wrote KBCsv, hence the "KB" ;)

MusiGenesis

I'm pretty sure you can read a CSV file into a DataTable with one line of code. Once it's in a DataTable, you can sort, filter, iterate etc.

This question has some examples for reading CSVs into DataTables.
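This is not the one-liner mentioned above, just a sketch of one possible way to get there, reusing TextFieldParser from the earlier answer. It assumes the first row holds the column names and treats every column as a string; the class and method names are made up for illustration.

```csharp
using System.Data;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical helper, for illustration only.
public static class DataTableSketch
{
    // Loads comma-delimited text into a DataTable, using the first row
    // as column names. A row with more fields than the header will throw.
    public static DataTable Load(TextReader reader)
    {
        var table = new DataTable();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            if (parser.EndOfData) return table; // empty input, empty table

            foreach (string header in parser.ReadFields())
            {
                table.Columns.Add(header); // every column is typed as string
            }
            while (!parser.EndOfData)
            {
                table.Rows.Add(parser.ReadFields());
            }
        }
        return table;
    }
}
```

Once loaded, filtering and sorting is a single call, for example `table.Select("City = 'Oslo'", "Name ASC")` on a table that has City and Name columns.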
