I have a data frame where all the variables are of character type. Many of the columns are completely empty, i.e. only the variable headers are there, but no values. Is ther
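This can also be done with dplyr and `select_if`:

`select_if(df, function(x){any(!is.na(x))})`

Swap the `!is.na(x)` test for one based on `x == ""` (for example `x != ""`), depending on how empty values are defined in your data. Note that `is.null()` is not a suitable substitute here: it tests the whole column at once rather than each value, and a data frame column is never `NULL`.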
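As a minimal sketch of the `select_if` approach, assuming dplyr is installed and that "empty" means either `NA` or `""`. The data frame `df` and the helper `not_all_empty` are made up for illustration:

```r
library(dplyr)

# Hypothetical example data: `b` is all NA, `c` is all empty strings
df <- data.frame(
  a = c("x", "y"),
  b = c(NA_character_, NA_character_),
  c = c("", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if the column has at least one value
# that is neither NA nor the empty string
not_all_empty <- function(x) any(!is.na(x) & x != "")

# Keep only the columns for which the predicate returns TRUE
result <- select_if(df, not_all_empty)
names(result)
```

In dplyr 1.0.0 and later, `select(df, where(not_all_empty))` should be the equivalent spelling, since `select_if` is superseded there.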
If you're talking about columns where all values are `NA`, use `remove_empty("cols")` from the janitor package.

If you have character vectors where every value is the empty string `""`, you can first convert those values to `NA` throughout your data.frame with `na_if` from the dplyr package:
```r
dat <- data.frame(
  x = c("a", "b", "c"),
  y = c("", "", ""),
  z = c(NA, NA, NA),
  stringsAsFactors = FALSE
)

dat
#>   x y  z
#> 1 a   NA
#> 2 b   NA
#> 3 c   NA

library(dplyr)
library(janitor)

dat %>%
  # convert "" to NA in character columns only (z is logical and already NA)
  mutate(across(where(is.character), ~ na_if(.x, ""))) %>%
  remove_empty("cols")
#>   x
#> 1 a
#> 2 b
#> 3 c
```
If your empty columns are really empty character columns, something like the following should work. It will need to be modified if your "empty" character columns include, say, spaces.
Sample data:
```r
mydf <- data.frame(
  A = c("a", "b"),
  B = c("y", ""),
  C = c("", ""),
  D = c("", ""),
  E = c("", "z")
)
mydf
#   A B C D E
# 1 a y
# 2 b       z
```
Identifying and removing the "empty" columns.
```r
mydf[!sapply(mydf, function(x) all(x == ""))]
#   A B E
# 1 a y
# 2 b   z
```
Alternatively, as recommended by @Roland:

```r
mydf[, colSums(mydf != "") != 0]
#   A B E
# 1 a y
# 2 b   z
```
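For the caveat mentioned earlier about "empty" columns that contain spaces, one possible adjustment is to trim whitespace before comparing and to treat `NA` as empty too. This is only a sketch in base R; the data frame `mydf2` and the helper `is_empty_col` are hypothetical names introduced for illustration:

```r
# Hypothetical sample data: column C holds only spaces, column D only NA
mydf2 <- data.frame(
  A = c("a", "b"),
  B = c("y", ""),
  C = c(" ", "  "),
  D = c(NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: a column counts as empty when every value is NA
# or blank after trimming surrounding whitespace
is_empty_col <- function(x) all(is.na(x) | trimws(x) == "")

# Drop the columns flagged as empty
kept <- mydf2[!sapply(mydf2, is_empty_col)]
names(kept)
```

Here columns `A` and `B` are kept, because each has at least one non-blank value, while `C` and `D` are dropped.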