Autodetect Presence of CSV Headers in a File

心在旅途 2020-12-25 13:23

Short question: How do I automatically detect whether a CSV file has headers in the first row?

Details: I've written a small CSV parsing engine th

6 answers
  •  猫巷女王i
    2020-12-25 13:30

    In the purely abstract sense, I don't think there is a foolproof algorithmic answer to your question, since it boils down to: "How do I distinguish dataA from dataB if I know nothing about either of them?" There will always be the potential for dataA to be indistinguishable from dataB. That said, I would start simple and only add complexity as needed. For example, examine the first five rows: if, for a given column (or columns), the datatype in rows 2-5 is the same throughout but differs from the datatype in row 1, there's a good chance that a header row is present (larger sample sizes reduce the possibility of error). This would (sorta) solve #1/#3; perhaps throw an exception if the rows are all populated but the data is indistinguishable, so the calling program can decide what to do next. For #2, simply don't count a row as a row unless and until it yields non-null data. That would work in all but an empty file (in which case you'd hit EOF). It would never be foolproof, but it might be "close enough".
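
    The heuristic above could be sketched roughly like this in Python. This is only an illustration, not a definitive implementation: the names has_header, cell_type and AmbiguousHeaderError are made up for the example, the type check only distinguishes "number" from "str", and returning False when a numeric column also starts with a number is an extra assumption that the answer does not spell out.

```python
import csv
import io


class AmbiguousHeaderError(Exception):
    """Hypothetical error: the sample gives no evidence either way."""


def cell_type(value):
    """Classify a cell as 'empty', 'number', or 'str' (a deliberately crude check)."""
    text = value.strip()
    if not text:
        return "empty"
    try:
        float(text)
        return "number"
    except ValueError:
        return "str"


def has_header(text, sample_rows=5):
    """Guess whether the first populated row of CSV text is a header row."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        # Don't count a row until it actually holds non-empty data.
        if any(cell.strip() for cell in row):
            rows.append(row)
        if len(rows) == sample_rows:
            break
    if len(rows) < 2:
        # Empty or near-empty input: nothing to compare against.
        raise AmbiguousHeaderError("not enough populated rows to compare")

    first, rest = rows[0], rows[1:]
    looks_like_data = False
    for col, cell in enumerate(first):
        head_type = cell_type(cell)
        if head_type == "empty":
            continue
        body_types = {cell_type(r[col]) for r in rest if col < len(r)}
        body_types.discard("empty")
        if len(body_types) != 1:
            continue  # column is empty or inconsistent below row 1: no evidence
        if head_type not in body_types:
            return True  # rows 2..n agree with each other but differ from row 1
        if head_type == "number":
            # Assumption: a numeric first cell over a numeric column is data.
            looks_like_data = True
    if looks_like_data:
        return False
    # Every usable column looks the same in all rows: let the caller decide.
    raise AmbiguousHeaderError("first row is indistinguishable from the data")
```

    Calling has_header("name,age\nalice,30\nbob,25\n") reports a header because the age column is numeric everywhere except the first row, while an all-text file raises the exception so the caller can choose a default. If you happen to be in Python anyway, the standard library's csv.Sniffer().has_header(sample) applies a heuristic in the same spirit and is worth trying before rolling your own.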
