How to extract columns with same name but different identifiers in R

Question

Sorry if it is too basic, but I am not familiar with R.

I have a data frame in which several columns share the same name, so when it was imported into R, numeric suffixes were appended to make the names unique. Something like this:

A = c(2, 3, 5)
A.1 = c('aa', 'bb', 'cc')
A.2 = c(TRUE, FALSE, TRUE) 
B = c(1, 2, 5)
B.1 = c('bb', 'cc', 'dd')
B.2 = c(TRUE, TRUE, TRUE) 

df = data.frame(A, A.1, A.2, B, B.1, B.2) 

df
  A A.1   A.2 B  B.1   B.2
1 2  aa  TRUE 1   bb  TRUE
2 3  bb FALSE 2   cc  TRUE
3 5  cc  TRUE 5   dd  TRUE

I would like to extract all columns named A, regardless of the suffix, so that the result looks like this:

  A A.1   A.2 
1 2  aa  TRUE 
2 3  bb FALSE 
3 5  cc  TRUE

I know I can do this:

df2 = df[, c("A", "A.1", "A.2")]

But I have many columns of this type, so I do not want to type each name individually. I am sure there are smarter ways to do this.

Thanks!

Answer 1:

Try this to get all the columns whose names start with "A":

df2 = df[, grepl("^A", names(df))]

R's extraction function '[' accepts a logical vector as an index in its two-argument form, so the result of grepl can be used directly in the column position. You will find the regex functions in R very useful; I recommend reading ?regex, as well as looking for examples on SO and in the R-help archives by @G. Grothendieck.
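To make the logical indexing concrete, here is a minimal sketch that rebuilds the question's df and shows the intermediate logical vector. The drop = FALSE argument is an addition that is not in the answer above; it keeps the result a data frame even if only one column happens to match.

```r
# Rebuild the example data frame from the question
df <- data.frame(
  A   = c(2, 3, 5),
  A.1 = c("aa", "bb", "cc"),
  A.2 = c(TRUE, FALSE, TRUE),
  B   = c(1, 2, 5),
  B.1 = c("bb", "cc", "dd"),
  B.2 = c(TRUE, TRUE, TRUE)
)

# grepl() returns one TRUE/FALSE per column name
keep <- grepl("^A", names(df))
keep
# [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE

# Use the logical vector in the column position of `[`
df_a <- df[, keep, drop = FALSE]
names(df_a)
# [1] "A"   "A.1" "A.2"
```

Without drop = FALSE, selecting a single column this way would return a plain vector rather than a one-column data frame.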

Answer 2:

library(stringr)
A = c(2, 3, 5)
A.1 = c('aa', 'bb', 'cc')
A.2 = c(TRUE, FALSE, TRUE) 
B = c(1, 2, 5)
B.1 = c('bb', 'cc', 'dd')
B.2 = c(TRUE, TRUE, TRUE)  
df = data.frame(A, A.1, A.2, B) 
# '^' anchors the match to the start of the column name
df[, str_detect(names(df), '^A')]
  A A.1   A.2
1 2  aa  TRUE
2 3  bb FALSE
3 5  cc  TRUE



# If you want the columns whose names start with A or B:
A = c(2, 3, 5)
A.1 = c('aa', 'bb', 'cc')
A.2 = c(TRUE, FALSE, TRUE) 
B = c(1, 2, 5)
B.1 = c('bb', 'cc', 'dd')
F.2 = c(TRUE, TRUE, TRUE) 
df = data.frame(A, A.1, A.2, B,F.2) 
df[, str_detect(names(df), '^(A|B)')]
  A A.1   A.2 B
1 2  aa  TRUE 1
2 3  bb FALSE 2
3 5  cc  TRUE 5
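One thing to watch with str_detect is how the pattern is anchored, since an unanchored pattern matches anywhere in the name. The sketch below uses hypothetical column names (DATA and Age are not from the question) purely to show where the patterns differ; the strict pattern assumes the suffixes always have the form of a dot followed by digits.

```r
library(stringr)

# Hypothetical column names, chosen to show where the patterns differ
nms <- c("A", "A.1", "B", "DATA", "Age")

# Unanchored: TRUE for any name containing "A" (note "DATA")
loose <- str_detect(nms, "A")
loose
# [1]  TRUE  TRUE FALSE  TRUE  TRUE

# Anchored at the start: names beginning with "A" (still includes "Age")
anchored <- str_detect(nms, "^A")
anchored
# [1]  TRUE  TRUE FALSE FALSE  TRUE

# Strict: exactly "A", optionally followed by a dot and digits
strict <- str_detect(nms, "^A(\\.[0-9]+)?$")
strict
# [1]  TRUE  TRUE FALSE FALSE FALSE
```

The same patterns work with base R's grepl, with the arguments in the opposite order: grepl(pattern, names(df)).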

Answer 3:

If we are using the tidyverse, starts_with is one way:

library(tidyverse)
df %>%
     select(starts_with("A"))
#  A A.1   A.2
#1 2  aa  TRUE
#2 3  bb FALSE
#3 5  cc  TRUE
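If a prefix match is too broad, the select helper matches() takes a regular expression instead. This is a sketch of an alternative rather than part of the answer above; it loads only dplyr, and it sets ignore.case = FALSE because these helpers ignore case by default. The pattern assumes the suffixes are a dot followed by digits.

```r
library(dplyr)

# Rebuild the example data frame from the question
df <- data.frame(
  A   = c(2, 3, 5),
  A.1 = c("aa", "bb", "cc"),
  A.2 = c(TRUE, FALSE, TRUE),
  B   = c(1, 2, 5),
  B.1 = c("bb", "cc", "dd"),
  B.2 = c(TRUE, TRUE, TRUE)
)

# Keep "A" and "A.<digits>" only, matching case exactly
df_a <- df %>%
  select(matches("^A(\\.[0-9]+)?$", ignore.case = FALSE))

names(df_a)
# [1] "A"   "A.1" "A.2"
```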

Source: https://stackoverflow.com/questions/44121843/how-to-extract-columns-with-same-name-but-different-identifiers-in-r

Tags

extract