How can I transform rows into repeated column based data?

Submitted by 。_饼干妹妹 on 2019-12-01 08:27:29

Question


I'm trying to take a dataset that looks like this:

And transform the records into this format:

The resulting format would have two columns, one for the old column names and one column for the values. If there are 10,000 rows then there should be 10,000 groups of data in the new format.

I'm open to any method: Excel formulas, SQL (MySQL), or plain Ruby code would all work for me. What is the best way to tackle this problem?


Answer 1:


Just for fun:

# Input file format is tab separated values

# name  search_term address code
# Jim jim jim_address 123
# Bob bob bob_address 124
# Lisa  lisa  lisa_address  126
# Mona  mona  mona_address  129


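infile = File.open("inputfile.tsv")

# The first line holds the column names
headers = infile.readline.chomp.split("\t")
puts headers.inspect  # debug output: show the parsed headers
of = File.new("outputfile.tsv", "w")
infile.each_line do |line|
  # chomp removes the line ending so it does not stick to the last cell
  row = line.chomp.split("\t")
  headers.each_with_index do |key, index|
    of.puts "#{key}\t#{row[index]}"
  end
end

infile.close
of.close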



# A nicer way; on my machine it processes 1.6M rows in about 17 seconds.
# The block form of File.open closes both files automatically.

File.open("inputfile.tsv") do |in_file|
  headers = in_file.readline.chomp.split("\t")
  File.open("outputfile.tsv", "w") do |out_file|
    in_file.each_line do |line|
      row = line.chomp.split("\t")
      headers.each_with_index do |key, index|
        # write one "header<TAB>value" pair per output line
        out_file << key << "\t" << row[index] << "\n"
      end
    end
  end
end
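
If the input may contain quoted fields, Ruby's standard `csv` library can do the parsing instead of `split`. The following is only a sketch under assumptions: `unpivot_tsv` is a hypothetical helper name, the sample data is made up for the demo, and the file is assumed to have a header row. Note that `CSV` still treats `"` as a quote character, so a TSV file with stray double quotes may raise `CSV::MalformedCSVError`.

```ruby
require "csv"
require "tmpdir"

# Stream a tab-separated file and write one "header<TAB>value" line per cell.
# `unpivot_tsv` is a hypothetical helper name, not part of the answer above.
def unpivot_tsv(in_path, out_path)
  File.open(out_path, "w") do |out|
    # headers: true makes CSV yield each data row as a CSV::Row,
    # whose #each yields [header, value] pairs.
    CSV.foreach(in_path, col_sep: "\t", headers: true) do |row|
      row.each { |header, value| out << header << "\t" << value << "\n" }
    end
  end
end

# Demo with made-up data in a temporary directory.
Dir.mktmpdir do |dir|
  in_path  = File.join(dir, "inputfile.tsv")
  out_path = File.join(dir, "outputfile.tsv")
  File.write(in_path, "name\tcode\nJim\t123\nBob\t124\n")
  unpivot_tsv(in_path, out_path)
  puts File.read(out_path)
end
```

`CSV.foreach` reads one row at a time, so this should stay memory-friendly on large files, though it will likely be slower than the plain `split` version.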



Answer 2:


You could add an ID column to the left of your data and use a Reverse PivotTable method.

  • Press Alt+D, then P, to open the PivotTable Wizard and work through these steps:

    1.  Multiple Consolidation Ranges
    2a. I will create the page fields
    2b. Range: eg. sheet1!A1:A4 
        How Many Page Fields: 0
    3.  Existing Worksheet: H1
    
  • In the PivotTable:

    Uncheck Row and Column in the Field List
    Double-click the Grand Total cell
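
For comparison with the scripted answers, here is a rough Ruby sketch of the shape this approach is expected to produce: one line per cell, keyed by the added ID. That the drill-down sheet lists row, column, and value is my understanding of the method rather than something stated above, and both the sample table and the `reverse_pivot` name are hypothetical.

```ruby
# Hypothetical table: the first column is the added ID, as the answer suggests.
table = [
  ["id", "name", "code"],
  [1, "Jim", "123"],
  [2, "Bob", "124"]
]

# Emit [row_id, column_name, value] for every data cell, skipping the ID column.
def reverse_pivot(table)
  header, *rows = table
  columns = header.drop(1) # column names without the ID header
  rows.flat_map do |id, *cells|
    columns.zip(cells).map { |column, value| [id, column, value] }
  end
end

triples = reverse_pivot(table)
triples.each { |triple| puts triple.join("\t") }
```

Dropping the first element of each triple gives the two-column format asked for in the question.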
    




Answer 3:


source_path = "inputfile.tsv"        # path to the source file (placeholder)
destination_path = "outputfile.tsv"  # path to the destination file (placeholder)

File.open(destination_path, 'a') do |d|    # open the destination file for appending
  File.open(source_path, 'r') do |s|       # open the source file for reading
    headers = s.readline.strip.split("\t") # grab the first row of the source file to use as headers
    s.each_line do |line|                  # iterate over each remaining line from the source
      current_line = line.strip.split("\t") # create an array from the current line
      current_line.each_with_index do |cell, count| # iterate over each cell, tracking its index
        # build each new line as one string: "header"<TAB>"value"
        final_new_line = "\"#{headers[count]}\"\t\"#{cell}\"\n"
        d.write(final_new_line)            # write the finished line to the destination file
      end
    end
  end
end
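
One caveat with building the quoted line by hand is that a value containing a double quote ends up unescaped. A possible alternative, sketched here under assumptions, is to let the standard `csv` library do the quoting via its `force_quotes` option. `quoted_pairs` is a hypothetical helper name and the sample data is illustrative only.

```ruby
require "csv"

# Hypothetical helper: turn parsed rows into quoted "header<TAB>value" lines.
def quoted_pairs(headers, rows)
  # force_quotes wraps every field in double quotes and doubles embedded quotes.
  CSV.generate(col_sep: "\t", force_quotes: true) do |csv|
    rows.each do |row|
      headers.zip(row).each { |header, value| csv << [header, value] }
    end
  end
end

# Illustrative data; the second value contains an embedded double quote.
headers = ["name", "note"]
rows = [["Jim", "said \"hi\""]]
output = quoted_pairs(headers, rows)
puts output
```

The same options can be passed to `CSV.open(path, "w", ...)` to write straight to a file instead of building a string.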


Source: https://stackoverflow.com/questions/11674964/how-can-i-transform-rows-into-repeated-column-based-data
