Question
I've got a data.frame with a key/value string column containing information about features and their values for a set of users. Something like this:
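# stringsAsFactors = FALSE keeps str as character (already the default from R 4.0 on)
data <- data.frame(id = 1:3, statid = c("s003e", "s093u", "s085t"),
                   str = c("a:1,7:2", "a:1,c:4", "a:3,b:5,c:33"), stringsAsFactors = FALSE)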
data
#   id statid          str
# 1  1  s003e      a:1,7:2
# 2  2  s093u      a:1,c:4
# 3  3  s085t a:3,b:5,c:33
What I'm trying to do is create a data.frame containing a column for every feature, with 0 where a user has no value for that feature. Like this:
data_after<-data.frame(id=1:3,statid=c("s003e","s093u","s085t"),
a=c(1,1,3),b=c(0,0,5),c=c(0,4,33),"7"=c(2,0,0))
data_after
#   id statid a b  c X7
# 1  1  s003e 1 0  0  2
# 2  2  s093u 1 0  4  0
# 3  3  s085t 3 5 33  0
I was trying to use str_split from the stringr package and then transform the elements of the resulting list into data.frames (to bind them later using, for example, rbind.fill from plyr), but couldn't get it to work. Any help will be appreciated!
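For reference, the split-then-bind idea described above can be sketched in base R without any packages. This is only an illustrative sketch, not code from the thread: the names pairs, keys, vals, all_keys, rows, wide and result are made up for the example, and it assumes every pair has the form key:value with a numeric value.

```r
# Example data from the question; str must be character for strsplit()
data <- data.frame(id = 1:3,
                   statid = c("s003e", "s093u", "s085t"),
                   str = c("a:1,7:2", "a:1,c:4", "a:3,b:5,c:33"),
                   stringsAsFactors = FALSE)

# Split each string into "key:value" pairs, then pull out keys and values
pairs <- strsplit(data$str, ",", fixed = TRUE)
keys  <- lapply(pairs, function(p) sub(":.*$", "", p))
vals  <- lapply(pairs, function(p) as.numeric(sub("^[^:]*:", "", p)))

# Every distinct key becomes a column
all_keys <- sort(unique(unlist(keys)))

# Build one named numeric vector per row, with 0 for keys that row lacks
rows <- Map(function(k, v) {
  out <- setNames(numeric(length(all_keys)), all_keys)
  out[k] <- v
  out
}, keys, vals)

# Stack the per-row vectors and attach them to the id columns
wide   <- do.call(rbind, rows)
result <- cbind(data[c("id", "statid")], wide)
result
```

Because cbind on a data.frame does not rewrite column names, the feature 7 should stay a column named 7 here rather than X7; access it with result[["7"]].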
Answer 1:
You can use dplyr and tidyr:
library(dplyr); library(tidyr)
# Split into "key:value" pairs (one row each), separate key from value, then spread wide
data %>%
  mutate(str = strsplit(str, ",")) %>% unnest(str) %>%
  separate(str, into = c("var", "val"), sep = ":") %>%
  spread(var, val, fill = 0)
#   id statid 7 a b  c
# 1  1  s003e 2 1 0  0
# 2  2  s093u 0 1 0  4
# 3  3  s085t 0 3 5 33
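In current tidyr releases spread is superseded by pivot_wider, so the same reshaping might be written as below. This is a hedged variant rather than part of the original answer: it assumes tidyr 1.1 or later (where values_fill accepts a single value), and the intermediate names long and wide are made up for the example. Passing convert = TRUE to separate should turn the values into integers instead of leaving them as strings.

```r
library(tidyr)

# Example data from the question
data <- data.frame(id = 1:3,
                   statid = c("s003e", "s093u", "s085t"),
                   str = c("a:1,7:2", "a:1,c:4", "a:3,b:5,c:33"),
                   stringsAsFactors = FALSE)

# One row per "key:value" pair
long <- separate_rows(data, str, sep = ",")

# Split each pair into key (var) and value (val); convert = TRUE parses val as a number
long <- separate(long, str, into = c("var", "val"), sep = ":", convert = TRUE)

# One column per key, filling missing key/row combinations with 0
wide <- pivot_wider(long, names_from = var, values_from = val, values_fill = 0)
wide
```

Note that pivot_wider orders the new columns by first appearance of each key, so the column order may differ from the spread output even though the values are the same.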
Answer 2:
We can use cSplit (from splitstackshape) to do this in a cleaner way. Convert the data to 'long' format by splitting at ",", then split at ":" and dcast from 'long' to 'wide'.
library(splitstackshape)
library(data.table)
dcast(cSplit(cSplit(data, "str", ",", "long"), "str", ":"),
id+statid~str_1, value.var="str_2", fill = 0)
#    id statid 7 a b  c
# 1:  1  s003e 2 1 0  0
# 2:  2  s093u 0 1 0  4
# 3:  3  s085t 0 3 5 33
Source: https://stackoverflow.com/questions/38144082/how-to-transform-a-key-value-string-into-separate-columns