Find the names of constant columns in an R data.frame

Posted by 时光毁灭记忆、已成空白 on 2020-02-05 06:19:19

Question


This is a follow-up on this question. In the data.frame DATA, some columns are constant within each unique value of the first column, study.name. For example, the columns setting, prof and random are constant across all rows of Shin.Ellis, constant across all rows of Trus.Hsu, and so on. Including Shin.Ellis and Trus.Hsu, there are 10 unique study.name values.

How can I find the names of such constant columns?

A solution is given below (see NAMES), but why does NAMES include "error", which is not constant within every study?

DATA <- read.csv("https://raw.githubusercontent.com/izeh/m/master/cc.csv")
DATA <- setNames(DATA, sub("\\.\\d+$", "", names(DATA)))

is_constant <- function(x) length(unique(x)) == 1L 

(NAMES <- names(Filter(all, aggregate(.~study.name, DATA, is_constant)[-1])) )

# > [1] "setting" "prof"   "error"   "random"
# "error" is NOT a constant column, so why is it returned here?

# Desired output: 
# [1] "setting" "prof" "random"

Answer 1:


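The formula method of aggregate defaults to na.action = na.omit, which drops every row that contains an NA in any column before the groups are formed. The rows on which error varies are evidently among those dropped, so error looks constant within each study. Pass na.action = na.pass to keep those rows: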

names(Filter(all, aggregate(.~study.name, DATA, is_constant, 
            na.action = na.pass)[-1]))
#[1] "setting" "prof"    "random" 
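
To see the effect in isolation, here is a minimal sketch on a small made-up data frame (toy and its columns are hypothetical and are not taken from cc.csv). The only NA sits on the row where error differs, so the default na.omit hides the variation:

```r
# Hypothetical toy data, NOT the cc.csv file from the question
toy <- data.frame(
  study.name = c("A", "A", "A", "B", "B"),
  setting    = c(1, 1, 1, 2, 2),   # constant within each study
  error      = c(1, 2, 1, 3, 3),   # varies within study A (row 2)
  other      = c(5, NA, 5, 6, 6)   # NA on the same row where error varies
)

is_constant <- function(x) length(unique(x)) == 1L

# Default na.action = na.omit: row 2 is dropped before grouping,
# so every remaining column looks constant within each study
default_names <- names(Filter(all, aggregate(. ~ study.name, toy, is_constant)[-1]))

# na.action = na.pass: all rows reach is_constant
pass_names <- names(Filter(all, aggregate(. ~ study.name, toy, is_constant,
                                          na.action = na.pass)[-1]))

print(default_names)  # "setting" "error" "other"
print(pass_names)     # "setting"
```

Note that with na.pass, unique() treats NA as a value of its own, so a column such as other, whose values are identical apart from an NA, is reported as non-constant. Whether that is the desired behaviour depends on how missing values should be interpreted in the data.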


Source: https://stackoverflow.com/questions/59791480/find-the-names-of-constant-columns-in-an-r-data-frame
