I'm trying to parse a CSV file, but I keep getting the error message Unquoted fields do not allow \r or \n (line 2).
I found a similar topic here on SO,
I realize this is an old post but I recently ran into a similar issue with a badly formatted CSV file that failed to parse with the standard Ruby CSV library.
I tried the SmarterCSV gem which parsed the file in no time. It's an external library so it might not be the best solution for everyone but it beats parsing the file myself.
opts = { col_sep: ';', file_encoding: 'iso-8859-1', skip_lines: 5 }
SmarterCSV.process(file, opts).each do |row|
  p row[:someheader]
end
If you have to deal with files coming from Excel that have newlines inside cells, there is also a solution.
The big disadvantage of this approach is that the strings must not contain semicolons or, alternatively, must not contain double quotes.
I chose to go with no semicolons:
if file.respond_to?(:read)
  csv_contents = file.read
elsif file.respond_to?(:path)
  csv_contents = File.read(file.path)
else
  logger.error "Bad file: #{file.class.name}: #{file.inspect}"
  return false
end
csv_contents = csv_contents.force_encoding("iso-8859-1").encode("utf-8") # In my case the files are Latin-1...
# Here is the important part (remove all newlines between quotes).
# sub! returns nil once there is nothing left to replace, which ends the loop.
loop do
  break unless csv_contents.sub!(/("[^;]*)[\n\r]([^;]*")/) { "#{$1}, #{$2}" }
end
CSV.parse(csv_contents, headers: false, row_sep: :auto, col_sep: ";") do |row|
  # do whatever
end
For me this solution works fine, but if you deal with large files you could run into problems with it, since every pass rescans the string from the beginning.
If you would rather disallow double quotes in strings instead, just replace the semicolons in the regex with quotes.
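To see what the substitution does, here is a minimal sketch on a made-up two-row string (the sample data and the variable names are my own, not from a real export). It uses the same regex as above and then hands the cleaned string to the standard CSV library:

```ruby
require "csv"

# Hypothetical sample: the quoted field in the first row contains a newline.
csv_contents = "1;\"first line\nsecond line\";x\r\n2;\"plain\";y\r\n".dup

# Join quoted fields that were split by a newline, one occurrence per pass.
# sub! returns nil when nothing matched, which ends the loop.
loop do
  break unless csv_contents.sub!(/("[^;]*)[\n\r]([^;]*")/) { "#{$1}, #{$2}" }
end

rows = CSV.parse(csv_contents, col_sep: ";", row_sep: :auto)
p rows
```

One thing to watch out for: because the regex only excludes semicolons, a row that ends with a quoted field followed directly by a row that starts with a quoted field would also match and be merged, so check your data before relying on this.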
First of all, you should set your column delimiter to ';', since the semicolon is not the default separator when parsing CSV files. This worked for me:
CSV.open('file.csv', :row_sep => :auto, :col_sep => ";") do |csv|
  csv.each { |a, b, c| puts "#{a},#{b},#{c}" }
end
From the 1.9.2 CSV documentation:
Auto-discovery reads ahead in the data looking for the next "\r\n", "\n", or "\r" sequence. A sequence will be selected even if it occurs in a quoted field, assuming that you would have the same line endings there.
In my case, I was importing a LinkedIn CSV and got this error.
I removed the blank lines like this:
def import
  csv_text = File.read('filepath', :encoding => 'ISO-8859-1')
  # remove blank lines from LinkedIn
  csv_text = csv_text.gsub(/^$\n/, '')
  @csv = CSV.parse(csv_text, :headers => true, :skip_blanks => true)
end
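Here is the same blank-line removal on an in-memory string, so you can try it without a file. The sample data is invented, and the sketch assumes plain "\n" line endings (a blank line ending in "\r\n" would not match this regex):

```ruby
require "csv"

# Hypothetical sample with blank lines between the records.
csv_text = "name,email\n\nann,ann@example.com\n\nbob,bob@example.com\n"

# /^$\n/ matches a line that is empty apart from its newline.
cleaned = csv_text.gsub(/^$\n/, "")

table = CSV.parse(cleaned, headers: true, skip_blanks: true)
names = table.map { |row| row["name"] }
p names
```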
In my case, the first row of the spreadsheet/CSV was a double-quoted bit of introduction text. The error I got was: /Users/.../.rvm/rubies/ruby-2.3.0/lib/ruby/2.3.0/csv.rb:1880:in `block (2 levels) in shift': Unquoted fields do not allow \r or \n (line 1). (CSV::MalformedCSVError)
I deleted the double-quoted introduction text so the .csv contained ONLY the CSV data, saved it, and my program ran with no errors.
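If you can't edit the file by hand, the same idea can be sketched in code by dropping the leading line before parsing. This assumes the introduction text takes up exactly one line; the sample data and variable names are made up:

```ruby
require "csv"

# Hypothetical export: one line of introduction text, then the real CSV.
raw = "\"Exported contacts, generated by a hypothetical tool\"\nname,age\nann,30\nbob,25\n"

# Drop the first line so the header row comes first.
body = raw.lines.drop(1).join

table = CSV.parse(body, headers: true)
ages = table.map { |row| row["age"] }
p ages
```

If the introduction spans several lines, you would need to drop more lines or find the header row some other way.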
Simpler solution if the CSV was touched or saved by any program that may have used weird formatting (such as Excel or Spreadsheet):