Question
This is a follow-up to this question. In the data.frame DATA, some columns are constant within each unique value of the first column, study.name. For example, the columns setting, prof and random are constant across all rows of Shin.Ellis, constant across all rows of Trus.Hsu, and so on. Including Shin.Ellis and Trus.Hsu, there are 10 unique study.name values.
How can I find the names of such constant columns?
A solution was provided (see NAMES below), but why does NAMES include "error", which is not constant within every study?
DATA <- read.csv("https://raw.githubusercontent.com/izeh/m/master/cc.csv")
DATA <- setNames(DATA, sub("\\.\\d+$", "", names(DATA)))
is_constant <- function(x) length(unique(x)) == 1L
(NAMES <- names(Filter(all, aggregate(.~study.name, DATA, is_constant)[-1])) )
# > [1] "setting" "prof" "error" "random" ## "error" is NOT a constant variable,
## so why is it returned here?
# Desired output:
# [1] "setting" "prof" "random"
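Answer 1:
We need to pass na.action = na.pass to keep the NA elements. Otherwise, the formula method of aggregate uses its default, na.action = na.omit, which drops every row that has an NA in any column before the groups are evaluated, so a column such as error can look constant in the rows that remain.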
names(Filter(all, aggregate(.~study.name, DATA, is_constant,
na.action = na.pass)[-1]))
#[1] "setting" "prof" "random"
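To see the effect in isolation, here is a minimal sketch on a made-up data frame. The object toy and its columns study, setting, error and other are hypothetical and are not taken from DATA; the sketch only assumes that an NA in one column shares a row with a differing value in another.

```r
# Hypothetical toy data (not the original DATA); column names are illustrative.
is_constant <- function(x) length(unique(x)) == 1L

toy <- data.frame(
  study   = c("A", "A", "B", "B"),
  setting = c(1, 1, 2, 2),    # constant within each study
  error   = c(1, 2, 3, 3),    # NOT constant within study A
  other   = c(NA, 5, 6, 7)    # the NA sits in the first row of study A
)

# Default na.action (na.omit): the first row is dropped entirely, so study A
# keeps a single row and every column in it looks constant.
names_default <- names(Filter(all, aggregate(. ~ study, toy, is_constant)[-1]))

# na.pass keeps the row, so both values of error in study A are compared.
names_pass <- names(Filter(all, aggregate(. ~ study, toy, is_constant,
                                          na.action = na.pass)[-1]))

print(names_default)
print(names_pass)
```

With the default, error is reported alongside setting. With na.pass, only setting remains, because unique() counts NA as a value of its own and the row holding the differing error value is no longer discarded.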
Source: https://stackoverflow.com/questions/59791480/find-the-names-of-constant-columns-in-an-r-data-frame