Apply different functions to data frame columns depending on the column names matching a pattern

Anonymous (unverified), submitted on 2019-12-03 02:29:01

Question:

Given a data frame:

l <- list()
l$`__a` <- data.frame(
  `__ID` = stringi::stri_rand_strings(10, 1),
  col = stringi::stri_rand_strings(10, 1),
  check.names = FALSE
)

And two supporting functions:

prefixColABC <- function(dfCol) {
  paste0("ABC_", dfCol)
}

prefixColDEF <- function(dfCol) {
  paste0("DEF_", dfCol)
}

How can I apply the first function to the data frame columns whose names start with __ and the second to all other columns?

To solve this problem, I thought I would first subset all columns with names starting with __ and apply prefixColABC to them, then subset all the others and apply prefixColDEF to them. Then I would use cbind() to put all of the columns back together into one data frame.

Here's some of my progress:

Here's how the first function can be applied to all columns:

as.data.frame( apply(l$`__a`, 2, prefixColABC) )

And here's how I can subset the columns whose names start with __:

l$`__a`[, grep(pattern = "^__", names(l$`__a`)), drop = FALSE]

I don't know how to subset all the other columns that don't match this pattern, and I don't know how to set up the condition inside the apply statement.

I think this question is similar to mine, but does not select the columns based on matching a pattern: R Applying different functions to different data frame columns

Answer 1:

Try this, assuming that the input data frame is called dd:

hasPrefix <- grepl("^__", names(dd))
dd[, hasPrefix] <- lapply(dd[, hasPrefix, drop = FALSE], prefixColABC)
dd[, !hasPrefix] <- lapply(dd[, !hasPrefix, drop = FALSE], prefixColDEF)

giving:

> dd
    __ID   col
1  ABC_G DEF_x
2  ABC_n DEF_U
3  ABC_c DEF_G
4  ABC_O DEF_X
5  ABC_p DEF_E
6  ABC_U DEF_j
7  ABC_M DEF_G
8  ABC_0 DEF_l
9  ABC_V DEF_i
10 ABC_B DEF_u
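If the same pattern-based dispatch is needed in more than one place, it could be wrapped in a small helper. The following is only a sketch: applyByNamePattern and the two-row input are hypothetical names and data made up for illustration, not part of the original question or answer.

```r
# The two supporting functions from the question.
prefixColABC <- function(dfCol) paste0("ABC_", dfCol)
prefixColDEF <- function(dfCol) paste0("DEF_", dfCol)

# Hypothetical helper (not from the original answer): apply fnMatch to the
# columns whose names match `pattern` and fnOther to all remaining columns.
applyByNamePattern <- function(df, pattern, fnMatch, fnOther) {
  matches <- grepl(pattern, names(df))
  # Map() walks the columns and their match flags in parallel;
  # df[] <- ... replaces the columns while keeping the data frame structure.
  df[] <- Map(function(column, isMatch) {
    if (isMatch) fnMatch(column) else fnOther(column)
  }, df, matches)
  df
}

# Small illustrative input (assumed data, not the dd shown in the answer).
input <- data.frame(`__ID` = c("a", "b"), col = c("x", "y"),
                    check.names = FALSE, stringsAsFactors = FALSE)

result <- applyByNamePattern(input, "^__", prefixColABC, prefixColDEF)
print(result)
```

Because the condition is evaluated once per column, the column order and names of the input are left unchanged.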

Note: The input dd, prior to modification, is:

dd <- structure(list(
  `__ID` = structure(c(4L, 6L, 3L, 7L, 8L, 9L, 5L, 1L, 10L, 2L),
    .Label = c("0", "B", "c", "G", "M", "n", "O", "p", "U", "V"),
    class = "factor"),
  col = structure(c(8L, 7L, 2L, 9L, 1L, 4L, 2L, 5L, 3L, 6L),
    .Label = c("E", "G", "i", "j", "l", "u", "U", "x", "X"),
    class = "factor")),
  .Names = c("__ID", "col"),
  row.names = c(NA, -10L),
  class = "data.frame")
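The plan described in the question (subset the matching columns, subset the rest, transform each group, then cbind() them) also works, as long as the column names rather than the data are matched. A minimal sketch of that route follows, using a made-up two-row input in which the __ column deliberately comes second so that the reordering step is visible.

```r
# The two supporting functions from the question.
prefixColABC <- function(dfCol) paste0("ABC_", dfCol)
prefixColDEF <- function(dfCol) paste0("DEF_", dfCol)

# Illustrative input (assumed data, not the dd shown above).
input <- data.frame(col = c("x", "y"), `__ID` = c("a", "b"),
                    check.names = FALSE, stringsAsFactors = FALSE)

hasPrefix <- grepl("^__", names(input))

# drop = FALSE keeps a single selected column as a data frame.
withPrefix <- input[, hasPrefix, drop = FALSE]
withoutPrefix <- input[, !hasPrefix, drop = FALSE]

# Transform each group with its own function.
withPrefix[] <- lapply(withPrefix, prefixColABC)
withoutPrefix[] <- lapply(withoutPrefix, prefixColDEF)

# cbind() would put the matching columns first, so index by the original
# names to restore the input's column order.
combined <- cbind(withPrefix, withoutPrefix)[, names(input), drop = FALSE]
print(combined)
```

Compared with assigning into dd in place, this route needs the extra reordering step, which is one reason the in-place assignment may be the simpler choice.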

