Question
I'm trying to figure out how to combine multiple columns into one, skipping empty values (empty strings in the data below).
Input dataframe:
data <- data.frame(
id = c(1:3),
Item1 = c("Egg", "", ""),
Item2 = c("Chicken", "Flour", ""),
Item3 = c("", "", "Bread"),
Item4 = c("", "Milk", "")
)
Desired dataframe:
desired <- data.frame(
id = c(1:3),
Item1 = c("Egg", "", ""),
Item2 = c("Chicken", "Flour", ""),
Item3 = c("", "", "Bread"),
Item4 = c("", "Milk", ""),
Combine = c("Egg, Chicken", "Flour, Milk", "Bread")
)
I have tried combining the values using the following code:
data$Combine = paste(data$Item1, data$Item2, data$Item3, data$Item4, sep=",")
The issue is that I'm getting results like this:
Egg,Chicken,,
,Flour,,Milk
,,Bread,
Answer 1:
Following the same approach as in the question, paste the columns with "," as the separator, then use gsub to remove the leading and trailing commas and to collapse any run of two or more commas into a single one:
data$Combine <- gsub(",{2,}", ",",
gsub("^,+|,+$", "", do.call(paste, c(data[-1], sep=","))))
data$Combine
#[1] "Egg,Chicken" "Flour,Milk" "Bread"
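The result above uses "," as the separator, while the desired output in the question uses ", ". A minimal sketch of the same regex steps, split out one at a time and emitting ", " instead; the intermediate names `pasted`, `trimmed`, and `combined` are illustrative and not part of the original answer:

```r
# Sample input mirroring the question's data (empty strings mark missing items).
data <- data.frame(
  id = 1:3,
  Item1 = c("Egg", "", ""),
  Item2 = c("Chicken", "Flour", ""),
  Item3 = c("", "", "Bread"),
  Item4 = c("", "Milk", ""),
  stringsAsFactors = FALSE
)

# Step 1: paste all item columns row-wise with "," as the separator.
pasted <- do.call(paste, c(data[-1], sep = ","))

# Step 2: strip commas at the start or end of each string.
trimmed <- gsub("^,+|,+$", "", pasted)

# Step 3: collapse each remaining run of commas into a single ", ".
combined <- gsub(",+", ", ", trimmed)

data$Combine <- combined
```

This assumes the item values themselves never contain a comma; if they might, a per-row filter is likely the safer route.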
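Another option is to paste with the default space separator, strip the leading and trailing whitespace with trimws, and then replace each run of whitespace (\\s+) with a comma using gsub. This only works when the values themselves contain no spaces: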
gsub("\\s+", ",", trimws(do.call(paste, data[-1])))
#[1] "Egg,Chicken" "Flour,Milk" "Bread"
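Both options in this answer clean up separators after pasting. As a hedged alternative that is not part of the original answer, the sketch below filters out empty values per row before pasting, so values containing spaces or commas survive intact. The helper name `combine_non_empty` and the value "Brown Bread" are hypothetical, chosen only for illustration:

```r
# Hypothetical helper: join the non-empty, non-NA values of the given columns per row.
combine_non_empty <- function(df, cols, sep = ", ") {
  unname(apply(df[cols], 1, function(row) {
    keep <- !is.na(row) & row != ""
    paste(row[keep], collapse = sep)
  }))
}

# The question's data, with one value changed to contain a space (illustrative).
data <- data.frame(
  id = 1:3,
  Item1 = c("Egg", "", ""),
  Item2 = c("Chicken", "Flour", ""),
  Item3 = c("", "", "Brown Bread"),
  Item4 = c("", "Milk", ""),
  stringsAsFactors = FALSE
)

data$Combine <- combine_non_empty(data, c("Item1", "Item2", "Item3", "Item4"))
```

Because `apply` works row by row, this is likely slower than the vectorized `gsub` options on large data frames; it trades speed for robustness.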
Data:
data <- structure(list(ID = 1:3, Item1 = c("Egg", "", ""), Item2 = c("Chicken",
"Flour", ""), Item3 = c("", "", "Bread"), Item4 = c("", "Milk",
"")), .Names = c("ID", "Item1", "Item2", "Item3", "Item4"),
class = "data.frame", row.names = c(NA, -3L))
Answer 2:
First, append ", " to each non-empty value.
data1 <- sapply(data[-1], function(x) ifelse(x != "", paste(x, " ", sep = ","), ""))
data1 <- data.frame(id = c(1:3), data1)
Then create the new column by concatenating the tagged values.
data$Combine <- paste0(data1$Item1, data1$Item2, data1$Item3, data1$Item4)
Finally, cut the trailing ", " (the last two characters) from each string. substr and nchar are vectorized, so no sapply is needed.
data$Combine <- substr(data$Combine, 1, nchar(data$Combine) - 2)
data
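The question mentions NA values, although the sample data uses empty strings. With real NA values, the test `x != ""` returns NA and the literal text "NA" can end up in the pasted result. A minimal sketch of the same three steps, assuming a hypothetical `data_na` frame in which the empty strings are replaced by NA:

```r
# Hypothetical variant of the question's data with NA instead of "".
data_na <- data.frame(
  id = 1:3,
  Item1 = c("Egg", NA, NA),
  Item2 = c("Chicken", "Flour", NA),
  Item3 = c(NA, NA, "Bread"),
  Item4 = c(NA, "Milk", NA),
  stringsAsFactors = FALSE
)

# Replace NA with "" in the item columns only, leaving id untouched.
items <- data_na[-1]
items[is.na(items)] <- ""

# Append ", " to non-empty values, concatenate per row, then drop the trailing ", ".
tagged <- lapply(items, function(x) ifelse(x != "", paste0(x, ", "), ""))
combined <- do.call(paste0, unname(tagged))
data_na$Combine <- sub(", $", "", combined)
```

Using `sub(", $", "", ...)` instead of cutting a fixed two characters leaves rows with no items as an empty string.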
Source: https://stackoverflow.com/questions/41523123/combine-multiple-columns-excluding-null-values