Question
I have a data frame with columns that, when concatenated (row-wise) as a string, would allow me to partition the data frame into a desired form.
> str(data)
'data.frame': 680420 obs. of 10 variables:
$ A : chr "2011-01-26" "2011-01-26" "2011-02-09" "2011-02-09" ...
$ B : chr "2011-01-26" "2011-01-27" "2011-02-09" "2011-02-10" ...
$ C : chr "2011-01-26" "2011-01-26" "2011-02-09" "2011-02-09" ...
$ D : chr "AAA" "AAA" "BCB" "CCC" ...
$ E : chr "A00001" "A00002" "B00002" "B00001" ...
$ F : int 9 9 37 37 37 37 191 191 191 191 ...
$ G : int NA NA NA NA NA NA NA NA NA NA ...
$ H : int 4 4 4 4 4 4 4 4 4 4 ...
For each row, I would like to concatenate the data in columns F, E, D, and C into a string (with the underscore character as separator). Below is my unsuccessful attempt at this:
data$id <- sapply(as.data.frame(cbind(data$F,data$E,data$D,data$C)), paste, sep="_")
And below is the undesired result:
> str(data)
'data.frame': 680420 obs. of 10 variables:
$ A : chr "2011-01-26" "2011-01-26" "2011-02-09" "2011-02-09" ...
$ B : chr "2011-01-26" "2011-01-27" "2011-02-09" "2011-02-10" ...
$ C : chr "2011-01-26" "2011-01-26" "2011-02-09" "2011-02-09" ...
$ D : chr "AAA" "AAA" "BCB" "CCC" ...
$ E : chr "A00001" "A00002" "B00002" "B00001" ...
$ F : int 9 9 37 37 37 37 191 191 191 191 ...
$ G : int NA NA NA NA NA NA NA NA NA NA ...
$ H : int 4 4 4 4 4 4 4 4 4 4 ...
$ id : chr [1:680420, 1:4] "9" "9" "37" "37" ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr "V1" "V2" "V3" "V4"
Any help would be greatly appreciated.
Answer 1:
Try
data$id <- paste(data$F, data$E, data$D, data$C, sep="_")
instead. The beauty of vectorized code is that you do not need row-by-row loops, or loop-equivalent *apply functions.
Edit: Even better is
data <- within(data, id <- paste(F, E, D, C, sep="_"))
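To make the point about vectorization concrete, here is a minimal sketch on a toy data frame. The values are made up and only the column names follow the question; `cols`, `id2`, and `m` are hypothetical names introduced for illustration. It also shows why the `sapply()` attempt produced a four-column matrix: `sapply()` walks over the columns one at a time, so `paste()` never sees the values of a row together.

```r
# Toy data frame; column names mirror the question, values are invented.
data <- data.frame(
  C = c("2011-01-26", "2011-02-09"),
  D = c("AAA", "BCB"),
  E = c("A00001", "B00002"),
  F = c(9L, 37L),
  stringsAsFactors = FALSE
)

# paste() is vectorized: a single call builds the id for every row.
data$id <- paste(data$F, data$E, data$D, data$C, sep = "_")
print(data$id)
# "9_A00001_AAA_2011-01-26"  "37_B00002_BCB_2011-02-09"

# Same result when the columns are supplied by name in a character vector.
cols <- c("F", "E", "D", "C")
id2 <- do.call(paste, c(data[cols], sep = "_"))

# The attempt from the question, in essence: paste() is applied to each
# column separately, so the result is a character matrix, not one string
# per row.
m <- sapply(data[cols], paste, sep = "_")
print(dim(m))  # 2 rows, 4 columns
```

The `do.call()` form may be handy when the set of columns is not known in advance, since only `cols` needs to change.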
Answer 2:
Use unite() from the tidyr package:
require(tidyr)
data <- data %>% unite(id, F, E, D, C, sep = '_')
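The first argument after the data is the name of the new column; the arguments that follow, up to sep, are the columns to concatenate. Note that unite() drops the input columns by default; pass remove = FALSE to keep them.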
Answer 3:
Either stringr::str_c() or paste() will work.
require(stringr)
data <- within(data, id <- str_c(F, E, D, C, sep="_"))
or else
data <- within(data, id <- paste(F, E, D, C, sep="_"))
(stringr performs better on large datasets)
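One behavioural difference may matter when choosing between the two: paste() turns a missing value into the literal text "NA", whereas str_c() is documented to return NA whenever any of its inputs is NA. The sketch below uses base R only and made-up values (`df`, `pasted`, and `id` are hypothetical names) to show the paste() behaviour and one possible way to keep the result missing instead.

```r
# Invented values; column G stands in for a column that contains NA.
df <- data.frame(F = c(9L, 37L), G = c(NA, 4L))

# paste() converts NA to the string "NA" rather than propagating it.
pasted <- paste(df$F, df$G, sep = "_")
print(pasted)  # "9_NA" "37_4"

# One way to get a missing id whenever any input in the row is missing.
id <- ifelse(complete.cases(df), pasted, NA_character_)
print(id)      # NA "37_4"
```

Whether "NA" inside an id is acceptable depends on how the ids are used afterwards, so this is a choice to make rather than a fix that is always needed.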
Source: https://stackoverflow.com/questions/6308933/concatenate-row-wise-across-specific-columns-of-dataframe