Are there any good sites or services for validating the consistency of a CSV file?
Something like the W3C validator, but for CSV?
Thanks!
The Open Data Institute is developing a CSV validation service that will allow users to check the structure of their data as well as validate it against a simple schema.
The service is still very much in alpha but can be found here:
http://csvlint.io/
The code for the application and the underlying library are both open source:
https://github.com/theodi/csvlint
https://github.com/theodi/csvlint.rb
The README in the library provides a summary of the errors and warnings that can be generated. The following types of error can be reported:
:wrong_content_type -- content type is not text/csv
:ragged_rows -- row has a different number of columns (than the first row in the file)
:blank_rows -- completely empty row, e.g. blank line or a line where all column values are empty
:invalid_encoding -- encoding error when parsing row, e.g. because of invalid characters
:not_found -- HTTP 404 error when retrieving the data
:quoting -- problem with quoting, e.g. missing or stray quote, unclosed quoted field
:whitespace -- a quoted column has leading or trailing whitespace
The following types of warning can be reported:
:no_encoding -- the Content-Type header returned in the HTTP request does not have a charset parameter
:encoding -- the character set is not UTF-8
:no_content_type -- file is being served without a Content-Type header
:excel -- no Content-Type header and the file extension is .xls
:check_options -- CSV file appears to contain only a single column
:inconsistent_values -- inconsistent values in the same column. Reported if <90% of values seem to have same data type (either numeric or alphanumeric including punctuation)
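To make two of the structural checks above concrete, here is a minimal Python sketch using only the standard library. It is not csvlint's implementation and does not call the csvlint library; the function name check_csv and the sample data are made up for illustration. It only mimics the meaning of the :ragged_rows and :blank_rows errors, and it assumes the first non-blank row sets the expected column count.

```python
import csv
import io


def check_csv(text):
    """Return a list of (error_type, row_number) tuples for a CSV string.

    Illustrative only: mimics the meaning of the :ragged_rows and
    :blank_rows errors described above, not csvlint's real code.
    """
    problems = []
    expected_columns = None
    reader = csv.reader(io.StringIO(text, newline=""))
    for row_number, row in enumerate(reader, start=1):
        # A blank line parses to [], and a line like ",," parses to
        # all-empty values; both count as a blank row here.
        if all(value.strip() == "" for value in row):
            problems.append(("blank_rows", row_number))
            continue
        if expected_columns is None:
            # Assumption: the first non-blank row defines the column count.
            expected_columns = len(row)
        elif len(row) != expected_columns:
            problems.append(("ragged_rows", row_number))
    return problems


# Hypothetical sample: row 3 is short, rows 4 and 5 are blank.
sample = "id,name,city\n1,Ann,Leeds\n2,Bob\n\n,,\n3,Cy,York\n"
print(check_csv(sample))
# [('ragged_rows', 3), ('blank_rows', 4), ('blank_rows', 5)]
```

Checks such as :wrong_content_type or :no_encoding depend on the HTTP response headers rather than the file contents, so they are left out of this sketch.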