Question
I have found other versions of this question, but I was not able to adapt the answers given there to my problem. Here is an older link:
The OP there had data consisting of only two columns, and the given answer handles that case nicely. But what about more than two columns? Is there a way to adapt the linked code snippet?
Here is an example:
ve <- rbind("4,2","3","1,2,3","5","6","7")
expl <- cbind(head(mtcars),ve)
          row.names  mpg cyl disp  hp drat    wt  qsec vs am gear carb    ve
1         Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   4,2
2     Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4     3
3        Datsun 710 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1 1,2,3
4    Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1     5
5 Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2     6
6           Valiant 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1     7
I would need:
          row.names  mpg cyl disp  hp drat    wt  qsec vs am gear carb ve
1         Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4  4
2         Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4  2
3     Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4  3
4        Datsun 710 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1  1
5        Datsun 710 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1  2
6        Datsun 710 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1  3
7    Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1  5
8 Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2  6
9           Valiant 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1  7
Thank you!
Answer 1:
Here's an attempt using base R only (which also preserves the row names, in a way at least):
ve <- strsplit(ve, ",")
Res <- expl[rep(seq_len(nrow(expl)), sapply(ve, length)), ]
Res$ve <- unlist(ve)
Res
#                    mpg cyl disp  hp drat    wt  qsec vs am gear carb ve
# Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4  4
# Mazda RX4.1       21.0   6  160 110 3.90 2.620 16.46  0  1    4    4  2
# Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4  3
# Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1  1
# Datsun 710.1      22.8   4  108  93 3.85 2.320 18.61  1  1    4    1  2
# Datsun 710.2      22.8   4  108  93 3.85 2.320 18.61  1  1    4    1  3
# Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1  5
# Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2  6
# Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1  7
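As a rough sketch, the same base R idea can be wrapped in a small reusable function. The name split_rows and its arguments are hypothetical (they do not come from any package), and the example data is rebuilt here with ve stored as a plain character column so the block runs on its own.

```r
# Rebuild the example data from the question (ve as a character column).
expl <- head(mtcars)
expl$ve <- c("4,2", "3", "1,2,3", "5", "6", "7")

# split_rows() is a hypothetical helper, not part of any package.
split_rows <- function(df, col, sep = ",") {
  # as.character() guards against factor columns (the default before R 4.0).
  parts <- strsplit(as.character(df[[col]]), sep, fixed = TRUE)
  # Repeat each row once per piece, then overwrite the column with the pieces.
  out <- df[rep(seq_len(nrow(df)), lengths(parts)), , drop = FALSE]
  out[[col]] <- unlist(parts)
  out
}

res <- split_rows(expl, "ve")
```

Because the rows are duplicated by index, R makes the repeated row names unique by appending suffixes such as .1 and .2, as in the output above.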
Or using data.table, one option is
library(data.table)
# Group by every column except ve; the split values land in a new column V1
setDT(expl)[,
  strsplit(as.character(ve), ","),
  by = c(names(expl)[-length(expl)])
]
Another option would be
setkey(expl, ve)[setDT(expl)[, strsplit(as.character(ve), ","), by = ve]]
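Note that setDT() drops the row names by default, and grouping by all other columns may misbehave if two rows happen to agree on every one of them. A minimal sketch that sidesteps both points, assuming the car names should be kept in a column, repeats the rows by index instead of grouping. The object names dt, parts and res are only illustrative.

```r
library(data.table)

# Rebuild the example data from the question (ve as a character column).
expl <- head(mtcars)
expl$ve <- c("4,2", "3", "1,2,3", "5", "6", "7")

# keep.rownames = TRUE stores the car names in a column called "rn".
dt <- as.data.table(expl, keep.rownames = TRUE)

# One character vector of pieces per original row.
parts <- strsplit(as.character(dt$ve), ",", fixed = TRUE)

# Repeat each row once per piece, then replace ve by reference.
res <- dt[rep(seq_len(nrow(dt)), lengths(parts))]
res[, ve := unlist(parts)]
```

The result keeps the original row order and column names, with ve holding one value per row.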
Answer 2:
Try unnest from the tidyr package. My example uses dplyr, but you can also accomplish this with base functions.
library(dplyr)
library(tidyr)
expl %>%
  mutate(ve = strsplit(as.character(ve), ",")) %>%
  unnest(ve)
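The split-then-unnest steps can probably be collapsed into one call with tidyr's separate_rows(), assuming an installed tidyr version that provides it. The sketch below also moves the row names into a regular column first with tibble's rownames_to_column(); the column name "model" is just an illustrative choice.

```r
library(tidyr)
library(tibble)

# Rebuild the example data from the question (ve as a character column).
expl <- head(mtcars)
expl$ve <- c("4,2", "3", "1,2,3", "5", "6", "7")

# Move the row names into a regular column so they are not lost.
with_names <- rownames_to_column(expl, var = "model")

# separate_rows() splits ve on the separator and emits one row per piece.
res <- separate_rows(with_names, ve, sep = ",")
```

Here ve stays a character column; passing convert = TRUE to separate_rows() would ask tidyr to convert it to a more specific type.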
Answer 3:
I would recommend cSplit from my "splitstackshape" package.
Since your example has rownames, I've converted your example data to a data.table with the keep.rownames = TRUE argument.
library(splitstackshape)
cSplit(as.data.table(expl, keep.rownames = TRUE), "ve", ",", "long")
#                   rn  mpg cyl disp  hp drat    wt  qsec vs am gear carb ve
# 1:         Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4  4
# 2:         Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4  2
# 3:     Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4  3
# 4:        Datsun 710 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1  1
# 5:        Datsun 710 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1  2
# 6:        Datsun 710 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1  3
# 7:    Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1  5
# 8: Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2  6
# 9:           Valiant 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1  7
Source: https://stackoverflow.com/questions/28285169/split-comma-separated-column-entry-into-rows