merge rows csv by id ruby

淺唱寂寞╮ posted on 2019-12-08 05:16:59

Question


I have a .csv file that, for simplicity, has two fields: ID and comments. IDs are duplicated across rows because whenever a comment reached the maximum character length of the table it was generated from, the rest spilled into another row. I now need to merge the related comments together, producing one row for each unique ID, using Ruby.

To illustrate, I'm trying to turn this:

ID | COMMENT
1 | fragment 1
1 | fragment 2
2 | fragment 1
3 | fragment 1
3 | fragment 2
3 | fragment 3

into this:

ID | COMMENT
1 | fragment 1 fragment 2
2 | fragment 1
3 | fragment 1 fragment 2 fragment 3

I've come close to doing this using inject({}) and a hash, but I'm still working on getting all the data merged correctly. Meanwhile, my code seems to be getting too complicated, with multiple hashes and arrays just to merge selected rows.

What's the best/simplest way to achieve this type of row merge? Could it be done with just arrays?

I'd appreciate advice on how this is normally done in Ruby.


Answer 1:


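Keep the headers and group the rows by ID:

require 'csv'

# Read the file with its header row so fields can be accessed by name.
rows = CSV.read 'comment.csv', :headers => true
# Group rows by ID, then print each ID with its comments joined by spaces.
rows.group_by{|row| row['ID']}.values.each do |group|
  puts [group.first['ID'], group.map{|r| r['COMMENT']} * ' '] * ' | '
end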

You could index the columns by position (0 and 1) instead, but I think the header field names are clearer.
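If the merged rows need to go back out as CSV rather than be printed, the same grouping idea can feed CSV.generate. The following is only a sketch: the input is inlined so it runs on its own, it assumes a comma-separated file with an ID,COMMENT header row, and merge_comments is a hypothetical helper name, not part of the CSV library.

```ruby
require 'csv'

# Sample input inlined for illustration; in practice this text would come
# from a file such as comment.csv (assumed here to be comma-separated).
sample_csv = <<~TEXT
  ID,COMMENT
  1,fragment 1
  1,fragment 2
  2,fragment 1
  3,fragment 1
  3,fragment 2
  3,fragment 3
TEXT

# Hypothetical helper: returns CSV text with one row per unique ID.
def merge_comments(csv_text)
  rows = CSV.parse(csv_text, headers: true)
  CSV.generate do |out|
    out << rows.headers
    # group_by preserves the order in which each ID first appears.
    rows.group_by { |row| row['ID'] }.each do |id, group|
      out << [id, group.map { |r| r['COMMENT'] }.join(' ')]
    end
  end
end

merged_csv = merge_comments(sample_csv)
puts merged_csv
```

The resulting string could then be written to disk with File.write if a file is wanted.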




Answer 2:


With the following csv file, tmp.csv

1,fragment 11
1,fragment 21
2,fragment 21
2,fragment 22
3,fragment 31
3,fragment 32
3,fragment 33

Try this (demonstrated using irb)

irb> require 'csv'
  => true
irb> h = Hash.new
 => {} 
irb> CSV.foreach("tmp.csv") {|r| h[r[0]] = h.key?(r[0]) ? h[r[0]] + ' ' + r[1] : r[1]}
 => nil 
irb> h
 => {"1"=>"fragment 11 fragment 21", "2"=>"fragment 21 fragment 22", "3"=>"fragment 31 fragment 32 fragment 33"}
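On the question of whether plain arrays are enough: a possible sketch, assuming rows that share an ID are always adjacent in the file (as in the sample data), is to split the parsed rows into runs with Enumerable#chunk_while. The variable names below are illustrative, and the input is inlined instead of being read from tmp.csv.

```ruby
require 'csv'

# Same data as tmp.csv above, inlined so the sketch is self-contained.
tmp_text = <<~TEXT
  1,fragment 11
  1,fragment 21
  2,fragment 21
  2,fragment 22
  3,fragment 31
  3,fragment 32
  3,fragment 33
TEXT

rows = CSV.parse(tmp_text) # array of [id, comment] arrays

# Assumption: rows with the same ID are consecutive. chunk_while starts a
# new run whenever the ID changes between neighbouring rows.
merged_rows = rows
  .chunk_while { |a, b| a[0] == b[0] }
  .map { |run| [run.first[0], run.map { |r| r[1] }.join(' ')] }

merged_rows.each { |id, comment| puts "#{id} | #{comment}" }
```

If the same ID can reappear later in the file, this would produce separate rows for each run, so the hash or group_by approaches are the safer choice in that case.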


Source: https://stackoverflow.com/questions/10973182/merge-rows-csv-by-id-ruby
