I have a data frame where all the variables are of character type. Many of the columns are completely empty, i.e. only the variable headers are there, but no values. Is there an easy way to remove these empty columns from the data frame?
A simple solution using the purrr package:
purrr::discard(my_data_frame, ~all(is.na(.)))
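For instance, with a made-up data frame (the names here are only for illustration), the all-NA columns are dropped:

my_data_frame <- data.frame(
  id    = c("a", "b", "c"),
  notes = NA_character_,      # completely empty column
  score = c("1", "2", "3"),
  flag  = NA_character_,      # completely empty column
  stringsAsFactors = FALSE
)
purrr::discard(my_data_frame, ~ all(is.na(.)))
# returns a data frame containing only `id` and `score`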
If you know the column indices, you can use
df[, -c(3, 5, 7)]
This will omit columns 3, 5, and 7.
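If you don't know the indices in advance, one way to compute them (a sketch I'm adding here, not part of the original answer) is:

empty_idx <- which(colSums(!is.na(df)) == 0)              # indices of all-NA columns
if (length(empty_idx) > 0) df <- df[, -empty_idx, drop = FALSE]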
I have a similar situation -- I'm working with a large public records database but when I whittle it down to just the date range and category that I need, there are a ton of columns that aren't in use. Some are blank and some are NA.
The selected answer (https://stackoverflow.com/a/17672737/233467) didn't work for me, but this did:
df[!sapply(df, function(x) all(is.na(x) | x == ""))]
It depends what you mean by empty: is it NA or "", or can it even be " "? Something like this might work:
df[, !apply(df, 2, function(x) all(gsub(" ", "", x) == "", na.rm = TRUE))]
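For example (toy data, not from the original answer), a column holding only spaces is dropped as well:

dat <- data.frame(id = c("x", "y"), blank = c(" ", "  "), val = c("1", "2"),
                  stringsAsFactors = FALSE)
dat[, !apply(dat, 2, function(x) all(gsub(" ", "", x) == "", na.rm = TRUE))]
# `blank` is removed even though its cells are neither NA nor ""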
You can do either of the following:
emptycols <- sapply(df, function (k) all(is.na(k)))
df <- df[!emptycols]
or:
emptycols <- colSums(is.na(df)) == nrow(df)
df <- df[!emptycols]
If by empty you mean they are "", the second approach can be adapted like so:
emptycols <- colSums(df == "") == nrow(df)
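Either way, emptycols is a named logical vector, so you can inspect what will be dropped before removing it (a small addition of mine, not part of the original answer):

names(df)[emptycols]   # names of the empty columns
sum(emptycols)         # how many there are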
Here is something that can be modified to exclude columns containing any of the values you specify.
newdf <- df[, apply(df, 2, function(x) !any(is.na(x) | x == "" | x == "-4"))]