Difference between subset and filter from dplyr

Happy的楠姐 2020-12-14 06:03

It seems to me that subset and filter (from dplyr) give the same result. My question is: is there a potential difference at some point, for example in speed, data sizes i

6 Answers
  •  遥遥无期
    2020-12-14 06:28

    Interesting. I was trying to see the difference in terms of the resulting dataset, and I couldn't find an explanation for why the "[" operator behaved differently (i.e., why it also returned NAs):

    library(dplyr)  # provides filter() and %>%

    # Subset for year = 2013 with filter() and the pipe
    sub <- brfss2013 %>% filter(iyear == "2013")
    dim(sub)
    #[1] 486088    330
    sum(is.na(sub$iyear))  # number of NA values in iyear
    #[1] 0

    # Same filter() call without the pipe
    sub2 <- filter(brfss2013, iyear == "2013")
    dim(sub2)
    #[1] 486088    330
    sum(is.na(sub2$iyear))
    #[1] 0

    # Base R "[" with a logical index
    sub3 <- brfss2013[brfss2013$iyear == "2013", ]
    dim(sub3)
    #[1] 486093    330
    sum(is.na(sub3$iyear))
    #[1] 5

    # Base R subset()
    sub4 <- subset(brfss2013, iyear == "2013")
    dim(sub4)
    #[1] 486088    330
    sum(is.na(sub4$iyear))
    #[1] 0
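
A likely explanation for the five extra rows: comparing NA with "2013" yields NA rather than FALSE, and "[" returns one all-NA row for every NA in a logical index, whereas subset() treats missing values in the condition as FALSE and filter() drops rows where the condition is NA. The sketch below reproduces this in base R on a small made-up data frame (`df` and its values are hypothetical stand-ins for brfss2013, not the real data):

```r
# Toy data frame standing in for brfss2013 (hypothetical values).
df <- data.frame(iyear = c("2013", "2014", NA, "2013"), x = 1:4)

# Comparing against NA yields NA, not FALSE.
idx <- df$iyear == "2013"
idx
#[1]  TRUE FALSE    NA  TRUE

# "[" keeps one all-NA row for every NA in the logical index.
by_bracket <- df[idx, ]
nrow(by_bracket)
#[1] 3
sum(is.na(by_bracket$iyear))
#[1] 1

# subset() treats NA in the condition as FALSE and drops those rows.
by_subset <- subset(df, iyear == "2013")
nrow(by_subset)
#[1] 2

# Wrapping the condition in which() makes "[" drop the NA rows as well.
by_which <- df[which(idx), ]
nrow(by_which)
#[1] 2
```

If that is what happens here, the five NA values in sub3 would correspond to five rows of brfss2013 where iyear is missing.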
    
