R combine duplicate rows by appending columns [duplicate]

橙三吉。 submitted on 2020-02-29 07:02:09

Question


I have a large data set with text comments and their ratings on different variables, like so:

df <- data.frame(
  comment = c("commentA","commentB","commentB","commentA","commentA","commentC"),
  sentiment=c(1,2,1,4,1,2), 
  tone=c(1,5,3,2,6,1)
)

Every comment appears between one and three times, since the same comment is sometimes rated by several people.

I'm looking to create a data frame where the "comment" column only has unique values, and the other columns are appended, so any one text comment has as many "sentiment" and "tone" columns as there are ratings (which will result in NA's for comments that have not been rated as often, but that's okay):

df <- data.frame(
  comment = c("commentA","commentB","commentC"),
  sentiment.1=c(1,2,2), 
  sentiment.2=c(4,1,NA), 
  sentiment.3=c(1,NA,NA), 
  tone.1=c(1,5,1),
  tone.2=c(2,3,NA),
  tone.3=c(6,NA,NA)
)

I've been trying to do this with reshape, going from long to wide:

reshape(df, 
  idvar = "comment",
  timevar = c("sentiment","tone"), 
  direction = "wide"
)

But that results in all possible combinations between sentiment and tone, rather than simply duplicating sentiment and tone independently.

I also tried gather, like so: df %>% gather(key, value, -comment), but that only gets me halfway there...

Could anyone please point me in the right direction?


Answer 1:


You need to create a variable to use as the numbers in the columns. rowid(comment) does the trick.
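To make the role of rowid concrete, here is a minimal sketch of what it returns for the comment column from the question. The vector is copied from the question's data; rowid is a data.table function that numbers each occurrence of a value in order of appearance.

```r
library(data.table)

# The comment column from the question's example data
comment <- c("commentA","commentB","commentB","commentA","commentA","commentC")

# rowid() gives the running count of each value: 1 for its first
# occurrence, 2 for its second, and so on
idx <- rowid(comment)
idx
# [1] 1 1 2 2 3 1
```

These counts are what become the _1, _2, _3 suffixes in the wide result.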

In dcast you put the row identifiers to the left of ~ and the column identifiers to the right. Then value.var is a character vector of all the columns you want to include in this long-to-wide transformation.

library(data.table)
setDT(df)

dcast(df, comment ~ rowid(comment), value.var = c('sentiment', 'tone'))

#     comment sentiment_1 sentiment_2 sentiment_3 tone_1 tone_2 tone_3
# 1: commentA           1           4           1      1      2      6
# 2: commentB           2           1          NA      5      3     NA
# 3: commentC           2          NA          NA      1     NA     NA
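The same idea can probably be carried back to the base R reshape call from the question: the attempt there went wrong because timevar was given the value columns, whereas it needs a single per-comment counter. The following is only a sketch under that assumption; the helper column name rating is made up for illustration and is not part of the original answer.

```r
df <- data.frame(
  comment = c("commentA","commentB","commentB","commentA","commentA","commentC"),
  sentiment = c(1,2,1,4,1,2),
  tone = c(1,5,3,2,6,1)
)

# Hypothetical helper column: number the ratings within each comment
# (base R counterpart of data.table's rowid)
df$rating <- ave(seq_along(df$comment), df$comment, FUN = seq_along)

# Use the counter as timevar; sentiment and tone are spread automatically
wide <- reshape(df, idvar = "comment", timevar = "rating", direction = "wide")
wide
```

With the default separator, reshape names the new columns sentiment.1, tone.1, sentiment.2 and so on, which matches the naming asked for in the question, and comments with fewer ratings get NA in the unused columns.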


Source: https://stackoverflow.com/questions/59310225/r-combine-duplicate-rows-by-appending-columns
