Question
I'm familiar with being able to extract columns from an R data frame (or matrix) like so:
df.2 <- df[, c("name1", "name2", "name3")]
But can one use a ! or other tool to select all but those listed columns?
For background, I have a data frame with quite a few column vectors and I'd like to avoid:
- Typing out the majority of the names when I could just remove a minority
- Using the much shorter
df.2 <- df[, c(1,3,5)]
because when my .csv file changes, my code goes to heck since the numbering isn't the same anymore. I'm new to R and think I've learned the hard way not to use number vectors for larger df's that might change.
I tried:
df.2 <- df[, !c("name1", "name2", "name3")]
df.2 <- df[, !=c("name1", "name2", "name3")]
And just as I was typing this, found out that this works:
df.2 <- df[, !names(df) %in% c("name1", "name2", "name3")]
Is there a better way than this last one?
Answer 1:
An alternative to grep is which:
df.2 <- df[, -which(names(df) %in% c("name1", "name2", "name3"))]
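One caveat worth checking with the -which() form: if none of the listed names is present, which() returns an empty vector, negating it still gives an empty index, and so no columns are selected instead of all of them. The logical form from the question does not have this problem. A small sketch on a made-up data frame (the column names and the drop vector are only for illustration):

```r
# Toy data frame; the column names are hypothetical
df <- data.frame(name1 = 1:3, name2 = 4:6, other = 7:9)

# Names to drop, none of which exist in df
drop <- c("missing1", "missing2")

# which() finds no matches, so the index is integer(0) and every column is lost
by_which <- df[, -which(names(df) %in% drop), drop = FALSE]

# Logical negation keeps every column when nothing matches
by_logical <- df[, !names(df) %in% drop, drop = FALSE]

ncol(by_which)    # 0
ncol(by_logical)  # 3
```

If the list of names to drop is built dynamically, the logical version is probably the safer default.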
Answer 2:
You can make a shorter call that is also more generalizable with negative-grep:
df.2 <- df[, -grep("^name[1-3]$", names(df))]
Since grep returns numeric positions, you can use negative vector indexing to remove columns. You could add further numbers or more complex patterns.
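On the pattern itself: a range inside a regular-expression character class is written with a hyphen, so [1-3] matches 1, 2 or 3, whereas [1:3] would match only the characters 1, : and 3. The sketch below uses a made-up data frame (all column names are assumptions for illustration) and also guards against the case where grep finds nothing, since a negated empty index would select no columns:

```r
# Toy data frame; the column names are hypothetical
df <- data.frame(name1 = 1, name2 = 2, name3 = 3, name10 = 4, other = 5)

# Positions of the columns called exactly name1, name2 or name3
idx <- grep("^name[1-3]$", names(df))

# Guard against an empty match: -integer(0) would select no columns at all
df.2 <- if (length(idx) > 0) df[, -idx, drop = FALSE] else df

names(df.2)  # "name10" "other"
```

The anchors ^ and $ are what keep name10 from being caught by the pattern.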
Answer 3:
dplyr::select() has several options for dropping specific columns:
library(dplyr)
drop_columns <- c('cyl','disp','hp')
mtcars %>%
select(-one_of(drop_columns)) %>%
head(2)
              mpg drat    wt  qsec vs am gear carb
Mazda RX4      21  3.9 2.620 16.46  0  1    4    4
Mazda RX4 Wag  21  3.9 2.875 17.02  0  1    4    4
You can also negate specific column names. The following drops the column "hp" and the columns from "qsec" through "gear":
mtcars %>%
select(-hp, -(qsec:gear)) %>%
head(2)
              mpg cyl disp drat    wt carb
Mazda RX4      21   6  160  3.9 2.620    4
Mazda RX4 Wag  21   6  160  3.9 2.875    4
You could also negate contains(), starts_with(), ends_with(), or matches():
mtcars %>%
select(-contains('t')) %>%
select(-starts_with('a')) %>%
select(-ends_with('b')) %>%
select(-matches('^m.+g$')) %>%
head(2)
              cyl disp  hp  qsec vs gear
Mazda RX4       6  160 110 16.46  0    4
Mazda RX4 Wag   6  160 110 17.02  0    4
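As a side note, in more recent dplyr releases (1.0.0 and later, as far as I know) one_of() is marked as superseded in favour of all_of() and any_of(). The following is a minimal sketch assuming such a version is installed; "not_a_column" is a deliberately non-existent name used only to show the difference between the two helpers:

```r
library(dplyr)

drop_columns <- c("cyl", "disp", "hp")

# all_of() is strict: it raises an error if any listed column is missing
strict <- mtcars %>% select(-all_of(drop_columns))

# any_of() is lenient: names that are not present are silently ignored
lenient <- mtcars %>% select(-any_of(c(drop_columns, "not_a_column")))

names(strict)
names(lenient)
```

Which one fits depends on whether a missing column should be treated as a mistake in the input file or as something to tolerate.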
Answer 4:
You could write a custom function for this if it is for your own data manipulation. I might do something like this:
rm.col <- function(df, ...) {
  # Capture the unevaluated column names passed via ...
  x <- substitute(...())
  # Convert each captured symbol to a character string
  z <- unlist(lapply(x, as.character))
  # Keep only the columns whose names were not listed
  df[, !names(df) %in% z, drop = FALSE]
}
rm.col(mtcars, hp, mpg)
The first argument is the data frame. The remaining arguments (...) are the unquoted names of any columns you wish to remove.
Answer 5:
Old thread, but here's another solution:
df.2 <- subset(df, select=-c(name1, name2, name3))
This was posted in another similar thread (though I can't find it right now). Should be sustainable code in the situation you describe, and is probably easier to read and edit than some of the other options.
Answer 6:
The easiest way that comes to my mind:
filtered_df <- df[, setdiff(names(df), c("name1", "name2"))]
Essentially, you are computing the set difference between the full list of column names and the subset you want to filter out (name1 and name2 above).
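Two details of the setdiff() approach may be worth seeing in a small example: names that do not exist in the data frame are simply ignored, and adding drop = FALSE keeps the result a data frame even when only one column is left. The data frame and the name "name9" below are made up for illustration:

```r
# Toy data frame; the column names are hypothetical
df <- data.frame(name1 = 1:3, name2 = 4:6, keep = 7:9)

# "name9" is not a column of df, and setdiff() just ignores it
keep_cols <- setdiff(names(df), c("name1", "name2", "name9"))

# drop = FALSE keeps a data frame even though a single column remains
filtered_df <- df[, keep_cols, drop = FALSE]

is.data.frame(filtered_df)  # TRUE
names(filtered_df)          # "keep"
```

Without drop = FALSE, df[, keep_cols] would return a plain integer vector here, which is an easy thing to trip over when the column list changes.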
Source: https://stackoverflow.com/questions/12208090/selecting-columns-in-r-data-frame-based-on-those-not-in-a-vector