Ruby CSV read multiline fields

可紊 submitted on 2019-12-24 08:19:54

Question


I exported tables and queries from SQL, where some of the fields are multi-line.

The Ruby (1.9+) way to read CSV appears to be:

require 'csv'

CSV.foreach("exported_mysql_table.csv", :headers => true) do |row|
    puts row
end

Which works great if my data is like this:

"id","name","email","potato"
1,"Bob","bob@bob.bob","omnomnom"
2,"Charlie","char@char.com","andcheese"
4,"Doug","diggyd@diglet.com","usemeltattack"

(The first line is the header/attributes)

But if I have:

"id","name","address","email","potato"
1,"Bob","--- 
- 101 Cottage row
- Lovely Village
- \"\"
","bob@bob.bob","omnomnom"
2,"Charlie","--- 
- 102 Flame Street
- \"\"
- \"\"
","char@char.com","andcheese"
4,"Doug","--- 
- 103 Dark Cave
- Next to some geo dude
- So many bats
","diggyd@diglet.com","usemeltattack"

Then I get the error:

.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1894:in `block (2 levels) in shift': Missing or stray quote in line 2 (CSV::MalformedCSVError)

This seems to be because the end of the line doesn't have a close quote, as it spans several lines.

(I also tried FasterCSV, but that gem became the standard 'csv' library as of Ruby 1.9.)


Answer 1:


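Your problem is not the multi-line fields but malformed CSV: CSV escapes a quote inside a quoted field by doubling it (""), not with a backslash (\"), so the parser treats each \" as a stray quote.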
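To illustrate that point, here is a minimal sketch with made-up two-column data (not the exported table from the question). The first string has a multi-line field with doubled quotes and should parse; the second uses backslash-escaped quotes and should be rejected.

```ruby
require 'csv'

# Illustrative data only: a multi-line quoted field whose inner quotes are doubled.
good = "id,address\n1,\"line one\nsaid \"\"hi\"\"\"\n"
table = CSV.parse(good, :headers => true)
multi_line_value = table[0]["address"]

# The same field with backslash-escaped quotes instead of doubled quotes.
bad = "id,address\n1,\"line one\nsaid \\\"hi\\\"\"\n"
malformed = begin
  CSV.parse(bad, :headers => true)
  false
rescue CSV::MalformedCSVError
  true
end

puts multi_line_value
puts "backslash-escaped input rejected: #{malformed}"
```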

Replace the \" sequences and remove the trailing space after a quote at the end of a line, like this:

require 'csv' 

ml = %q{"id","name","address","email","potato" 
1,"Bob","---  
- 101 Cottage row 
- Lovely Village 
- \"\" 
","bob@bob.bob","omnomnom" 
2,"Charlie","---  
- 102 Flame Street 
- \"\" 
- \"\" 
","char@char.com","andcheese" 
4,"Doug","---  
- 103 Dark Cave 
- Next to some geo dude 
- So many bats 
","diggyd@diglet.com","usemeltattack"}

# gsub! returns nil when nothing was replaced, so chain the non-bang gsub instead
ml = ml.gsub(/" \n/, "\"\n").gsub(/\\"/, "__")

CSV.parse(ml, :headers => true) do |row|
  puts row
end

This gives:

"id","name","address","email","potato"
1,"Bob","---  
- 101 Cottage row 
- Lovely Village 
- ____
","bob@bob.bob","omnomnom"
etc

If you have no control over the program that produces the CSV, you have to open the file, read its contents, do the replacement and then parse the result. I use __ here, but you can use any other characters that do not conflict with your data.
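The read, replace, parse steps could be sketched roughly as follows. This is a variation on the approach above rather than the same code: instead of a placeholder such as __, it rewrites \" as "" (the doubled-quote escape), so the original quotes survive parsing. The helper name normalize_exported_csv is hypothetical, the inline sample stands in for File.read("exported_mysql_table.csv"), and the sketch assumes the export contains no other backslash escapes.

```ruby
require 'csv'

# Hypothetical helper (not part of the csv library): strips spaces between a
# quote and the end of a line, then turns backslash-escaped quotes into
# doubled quotes.
def normalize_exported_csv(raw)
  raw.gsub(/" +\n/, "\"\n").gsub('\\"', '""')
end

# Inline sample standing in for the contents of the exported file.
sample = <<'CSV_TEXT'
"id","name","address","email","potato"
1,"Bob","---
- 101 Cottage row
- Lovely Village
- \"\"
","bob@bob.bob","omnomnom"
2,"Charlie","---
- 102 Flame Street
- \"\"
- \"\"
","char@char.com","andcheese"
CSV_TEXT

rows = CSV.parse(normalize_exported_csv(sample), :headers => true)
rows.each do |row|
  puts "#{row['name']}: #{row['address'].inspect}"
end
```

Doubling the quotes keeps the data lossless, at the cost of assuming that every \" in the file really is an escaped quote.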



Source: https://stackoverflow.com/questions/12915383/ruby-csv-read-multiline-fields