Changing Values from Wide to Long: 1) Group_By, 2) Spread/Dcast [duplicate]

十年热恋 submitted on 2019-11-29 14:55:23

You don't want to add a row number (an index over the whole data set); instead, add an index within each group. The helper function n() returns the number of observations in the current group of a grouped_df, so 1:n() numbers the rows within each Name. After that, the spreading goes smoothly:

df %>% group_by(Name) %>%
  mutate(group_index = 1:n() %>% paste0("phone_", .)) %>%
  spread(group_index, Phone_Number)

# A tibble: 4 x 4
# Groups:   Name [4]
      Name phone_1 phone_2 phone_3
    <fctr>  <fctr>  <fctr>  <fctr>
1 Jane Doe 0123451    <NA>    <NA>
2 Jill Doe    <NA>    <NA>    <NA>
3  Jim Doe 0123459 0123450    <NA>
4 John Doe 0123456 0123457 0123458
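
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). The same group-index idea might be written as below. This is only a sketch, assuming a recent dplyr and tidyr; the data frame is rebuilt from the question's values so the block runs on its own, and row_number() is swapped in for 1:n() (both number the rows within each group).

```r
library(dplyr)
library(tidyr)

# Sample data rebuilt from the question (strings kept as character, not factor)
df <- data.frame(
  Name = c("John Doe", "John Doe", "John Doe", "Jim Doe", "Jim Doe",
           "Jane Doe", "Jill Doe"),
  Phone_Number = c("0123456", "0123457", "0123458", "0123459", "0123450",
                   "0123451", NA),
  stringsAsFactors = FALSE
)

wide <- df %>%
  group_by(Name) %>%
  # Number the rows within each Name: phone_1, phone_2, ...
  mutate(group_index = paste0("phone_", row_number())) %>%
  ungroup() %>%
  # One column per group index, one row per Name
  pivot_wider(names_from = group_index, values_from = Phone_Number)

print(wide)
```

Unlike spread(), pivot_wider() keeps the rows in order of first appearance rather than sorting them by Name.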

For the sake of completeness, the rowid() function has a prefix parameter which gives a concise solution:

library(data.table)
dcast(setDT(df), Name ~ rowid(Name, prefix = "Phone_Number"))
       Name Phone_Number1 Phone_Number2 Phone_Number3
1: Jane Doe       0123451          <NA>          <NA>
2: Jill Doe          <NA>          <NA>          <NA>
3:  Jim Doe       0123459       0123450          <NA>
4: John Doe       0123456       0123457       0123458
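
To see why this works, it may help to look at what rowid() returns by itself. A small sketch, using a plain character vector (the variable names here are illustrative only) with the same names as the question's data:

```r
library(data.table)

# Same Name values as in the question, as a plain vector
name <- c("John Doe", "John Doe", "John Doe", "Jim Doe", "Jim Doe",
          "Jane Doe", "Jill Doe")

# Running count within each distinct value
ids <- rowid(name)
print(ids)
# 1 2 3 1 2 1 1

# With prefix, the count is pasted onto the prefix, giving ready-made column names
labels <- rowid(name, prefix = "Phone_Number")
print(labels)
# "Phone_Number1" "Phone_Number2" "Phone_Number3" "Phone_Number1"
# "Phone_Number2" "Phone_Number1" "Phone_Number1"
```

These labels are what dcast() uses on the right-hand side of the formula to name the new columns.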

Creating a row ID within each Name is enough:

library(dplyr)
library(tidyr)
library(data.table)

df <- setDT(data.frame(Name = c("John Doe", "John Doe", "John Doe", "Jim Doe", "Jim Doe", "Jane Doe", "Jill Doe" ), 
                 Phone_Number = c("0123456", "0123457","0123458", "0123459", "0123450","0123451", NA)))

df1 <- data.frame(Name = c("John Doe","Jim Doe", "Jane Doe", "Jill Doe" ), 
                  Phone_Number1 = c("0123456", "0123459", "0123451", NA),
                  Phone_Number2 = c("0123457", "0123450", NA, NA),
                  Phone_Number3 = c("0123458", NA, NA, NA))

df[, rowid := rowid(Name)]
dcast.data.table(df, Name ~ rowid, value.var = "Phone_Number")

       Name       1       2       3
1: Jane Doe 0123451      NA      NA
2: Jill Doe      NA      NA      NA
3:  Jim Doe 0123459 0123450      NA
4: John Doe 0123456 0123457 0123458

As was pointed out in the comments, there is no need to create a rowid variable for this task. The following code is simpler and neater:

df <- setDT(data.frame(Name = c("John Doe", "John Doe", "John Doe", "Jim Doe", "Jim Doe", "Jane Doe", "Jill Doe" ), 
                       Phone_Number = c("0123456", "0123457","0123458", "0123459", "0123450","0123451", NA)))

dcast.data.table(df, Name ~ paste0("Phone_Number", rowid(Name)), 
                 value.var = "Phone_Number")

       Name Phone_Number1 Phone_Number2 Phone_Number3
1: Jane Doe       0123451            NA            NA
2: Jill Doe            NA            NA            NA
3:  Jim Doe       0123459       0123450            NA
4: John Doe       0123456       0123457       0123458