Subset data frame based on character vector of column names [duplicate]

猫巷女王i

3 Answers

南旧

2021-01-07 02:22
You can define the names of the columns you want inside [ (see the help file ?Extract or help("[") for the subset operator [).
```
testdf[ names(testdf)[!names(testdf) %in% varnames] ]
## or
## testdf[, names(testdf)[!names(testdf) %in% varnames] , drop = FALSE]
```
Or, more concisely (thanks @Frank)
```
testdf[ setdiff(names(testdf), varnames)]
#   var3
# 1    1
# 2    1
# 3    1
# 4    1
```
where
```
names(testdf)
# [1] "var1" "var2" "var3"
varnames
# [1] "var1" "var2"
```
And so
```
names(testdf) %in% varnames
# [1]  TRUE  TRUE FALSE
```
And therefore
```
names(testdf)[!names(testdf) %in% varnames]
# [1] "var3"
```
Which is the same as
```
testdf[, "var3" ]
```
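Add drop = FALSE to stop the result 'dropping' to a vector when only one column is returned: testdf[, "var3"] on its own gives a plain vector, while testdf["var3"] and testdf[, "var3", drop = FALSE] keep the data frame.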
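To try the steps above end to end, here is a minimal sketch. The contents of testdf are an assumption, since the original data is not shown in this thread; only the column names and the var3 values are taken from the output above. The names keep, by_list, as_vector and as_df are made up for illustration.

```r
# Hypothetical data: only the column names and the var3 values come from
# the answer; the var1 and var2 values are invented for illustration.
testdf <- data.frame(var1 = 1:4, var2 = letters[1:4], var3 = rep(1, 4))
varnames <- c("var1", "var2")

# Columns to keep: everything in names(testdf) that is not in varnames.
keep <- setdiff(names(testdf), varnames)

# List-style indexing (no comma) always returns a data frame.
by_list <- testdf[keep]

# Matrix-style indexing drops a single column to a vector by default ...
as_vector <- testdf[, keep]

# ... unless drop = FALSE is given.
as_df <- testdf[, keep, drop = FALSE]

print(class(by_list))    # "data.frame"
print(class(as_vector))  # "numeric"
print(class(as_df))      # "data.frame"
```

If more than one column survives, all three forms return a data frame, so drop = FALSE only matters in the single-column case.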

Also, if you look at the help file for lapply(X, FUN, ...)
```
?lapply
```
lapply returns a list of the same length as X

This is why you're getting a list.
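The lapply call from the question is not reproduced here, so the following is only a sketch with a hypothetical input (cols) and toupper standing in for whatever function was applied. It contrasts lapply with vapply, which returns a vector directly.

```r
# Hypothetical input: a character vector of column names.
cols <- c("var1", "var2")

# lapply always returns a list with one element per element of its input.
as_list <- lapply(cols, toupper)
print(is.list(as_list))   # TRUE
print(length(as_list))    # 2

# vapply returns an atomic vector and checks that every result is a
# single character string, as declared by character(1).
as_vec <- vapply(cols, toupper, character(1), USE.NAMES = FALSE)
print(as_vec)             # "VAR1" "VAR2"
```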

> As a bonus - can someone tell me why it is ever useful for lapply to return this nested named list instead of simple vector? It seems very different than, for instance, Python. Thank you.

It is useful when you're working with a list and want the result to remain a list.
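As one illustration of that point (the input x below is hypothetical, not from the question): when the results differ in type or length, a list is the only container that holds them without coercion, and lapply carries the element names over.

```r
# Hypothetical input: a named list whose elements differ in type and length.
x <- list(nums = 1:3, chars = c("a", "b"))

# Reverse each element. The results cannot be combined into one atomic
# vector without coercion, so a list is the natural return type.
reversed <- lapply(x, rev)

print(names(reversed))   # "nums" "chars"
print(reversed$nums)     # 3 2 1
print(reversed$chars)    # "b" "a"
```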
南方客

2021-01-07 02:42
You can also use match, which returns the positions of varnames within names(testdf); negating those positions drops the matching columns.
```
testdf[-match(varnames,names(testdf))]


#   var3
#1    1
#2    1
#3    1
#4    1
```
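One caveat with negative indexing is worth checking before relying on it. The sketch below uses hypothetical data with the same column names as this thread and compares the match approach with setdiff when there is nothing to drop. A name that is not a column at all is a separate case: match() returns NA for it, which is not a usable column index, so it is probably safest to validate varnames first.

```r
# Hypothetical data with the column names used in this thread.
testdf <- data.frame(var1 = 1:4, var2 = 5:8, var3 = rep(1, 4))

# Normal case: every name in varnames is a column of testdf.
varnames <- c("var1", "var2")
idx <- match(varnames, names(testdf))   # positions 1 and 2
dropped <- testdf[-idx]                 # only var3 remains

# Edge case: with nothing to drop, match() returns integer(0), and
# indexing with -integer(0) selects no columns at all.
none <- character(0)
empty_result <- testdf[-match(none, names(testdf))]

# setdiff() keeps every column in the same situation.
full_result <- testdf[setdiff(names(testdf), none)]

print(names(dropped))       # "var3"
print(ncol(empty_result))   # 0
print(ncol(full_result))    # 3
```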
傲寒

2021-01-07 02:42

You can access individual elements with varnames[[1]] and so on, or convert the whole list into a vector if that is easier to work with.
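A minimal sketch of both options, assuming varnames is a list of single names such as lapply might return. The data and the names first, sublist, varnames_vec and result are hypothetical.

```r
# Hypothetical data and a hypothetical list of names.
testdf <- data.frame(var1 = 1:4, var2 = 5:8, var3 = rep(1, 4))
varnames <- list("var1", "var2")

first <- varnames[[1]]   # the element itself: "var1"
sublist <- varnames[1]   # single brackets return a list of length one

# Flatten the list to a character vector, then use it to drop columns.
varnames_vec <- unlist(varnames)
result <- testdf[setdiff(names(testdf), varnames_vec)]

print(varnames_vec)      # "var1" "var2"
print(names(result))     # "var3"
```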

Source: https://www.datacamp.com/community/tutorials/r-tutorial-apply-family

lapply applies the function to every top-level element of a list and returns the results in a list. An element can itself be a list; in that case the function receives the inner list as a whole, which is how the result can end up nested.
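To make the nesting concrete, here is a small sketch with a hypothetical nested list. It also uses base R's rapply, which is not mentioned in the answer, to show what recursing into the inner list looks like by comparison.

```r
# Hypothetical nested input: the second element is itself a list.
nested <- list(a = 1:3, b = list(x = 1, y = "z"))

# lapply visits only the top-level elements, so length() sees the inner
# list `b` as one object with two elements.
top_lengths <- lapply(nested, length)

# Returning each element unchanged reproduces the nested structure.
same <- lapply(nested, identity)

# rapply() recurses into inner lists and applies the function to the leaves.
leaf_classes <- rapply(nested, class, how = "list")

print(top_lengths$b)      # 2
print(leaf_classes$b$y)   # "character"
```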
