pivot_wider based on condition of a 0 or 1

痴心易碎 submitted on 2019-12-11 16:28:09

Question


I am trying to use pivot_wider on my data. The data looks like:

       dates yes_no
1 2017-01-01      0
2 2017-01-02      1
3 2017-01-03      0
4 2017-01-04      1
5 2017-01-05      1

The expected output is:

       dates yes_no 2017-01-02_1 2017-01-04_1 2017-01-05_1
1 2017-01-01      0            0            0            0
2 2017-01-02      1            1            0            0
3 2017-01-03      0            0            0            0
4 2017-01-04      1            0            1            0
5 2017-01-05      1            0            0            1

That is, a new column is created for each date where the yes_no column is 1.

This doesn't work for me:

d %>% 
  mutate(value_for_one_hot = 1) %>%
  pivot_wider(names_from = dates, values_from = value_for_one_hot,
            names_prefix = "date_", values_fill = list(value_for_one_hot = 0)) 

Data:

d <- data.frame(
  dates = c("2017-01-01", "2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05"),
  yes_no = c(0, 1, 0, 1, 1) 
)

Answer 1:


Create a duplicate of the yes_no column and a second column holding the new column names (NA where yes_no is 0), then do a normal spread or pivot_wider:

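library(dplyr)
library(tidyr)

d %>%
  mutate(
    yes_no_dup = yes_no,  # copy of yes_no whose values fill the new columns
    cols = if_else(yes_no == 1, paste0(dates, "_1"), NA_character_)  # new column name, NA when yes_no is 0
  ) %>%
  spread(cols, yes_no_dup, fill = 0) %>%  # spread() takes a single fill value
  select(-`<NA>`)  # drop the column spread() creates for the NA keys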
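
The answer mentions pivot_wider but only shows spread. A minimal sketch of the same idea with pivot_wider follows, assuming a tidyr version of 1.0.0 or later and a tidyselect version that provides all_of(). The name `wide` is only illustrative. Unlike spread, pivot_wider names the column built from the NA keys "NA" rather than "<NA>", so that is the column dropped at the end.

```r
library(dplyr)
library(tidyr)

d <- data.frame(
  dates = c("2017-01-01", "2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05"),
  yes_no = c(0, 1, 0, 1, 1),
  stringsAsFactors = FALSE
)

wide <- d %>%
  mutate(
    yes_no_dup = yes_no,  # values that will fill the new columns
    cols = if_else(yes_no == 1, paste0(dates, "_1"), NA_character_)
  ) %>%
  pivot_wider(
    names_from = cols,
    values_from = yes_no_dup,
    values_fill = list(yes_no_dup = 0)  # named-list form, as in the question
  ) %>%
  select(-all_of("NA"))  # drop the column created for rows where yes_no is 0

wide
```

dates and yes_no are kept as identifier columns because they are not named in names_from or values_from.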



Answer 2:


Here's a data.table approach that does not actually reshape the data.

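library(data.table)
setDT(d)  # convert d to a data.table by reference

# rows where yes_no is non-zero, and the dates on those rows
ind <- d[['yes_no']] != 0
cols <- as.character(d[['dates']])[ind]

# add one column per selected date (named by the date, without a "_1" suffix), initialised to 0
d[, (cols) := 0L]
# on the selected rows, fill the new columns with an identity matrix
d[ind, (cols) := as.data.frame(diag(.N))]

## also valid
# set(d, which(ind), cols, as.data.frame(diag(length(cols))))

d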
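
The identity-matrix trick does not depend on data.table. As a rough sketch, the same steps can be written in base R on a plain data frame. Here the "_1" suffix is added to the column names to match the expected output in the question, which the data.table code above does not do.

```r
d <- data.frame(
  dates = c("2017-01-01", "2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05"),
  yes_no = c(0, 1, 0, 1, 1),
  stringsAsFactors = FALSE
)

ind  <- d$yes_no != 0               # rows where yes_no is non-zero
cols <- paste0(d$dates[ind], "_1")  # one new column per selected date

d[cols] <- 0                        # create the new columns, all zeros
d[ind, cols] <- diag(sum(ind))      # identity matrix: the k-th selected row gets a 1 in the k-th new column

d
```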


Source: https://stackoverflow.com/questions/59088378/pivot-wider-based-on-condition-of-a-0-or-1
