Eliminating NAs from a ggplot

暖寄归人 2020-11-30 09:40

Very basic question here as I'm just starting to use R, but I'm trying to create a bar plot of factor counts in ggplot2 and when plotting, get 14 little colored blips repr

6 answers
  • 2020-11-30 10:05

    Not sure whether you have already solved this. You can use filter() from the dplyr package: keep only the rows where the variable of interest is not NA, then build the plot from those filtered rows. My code is below; the data frame and variable names are copied from your question, and I assume you are familiar with the pipe operator.

    library(tidyverse)
    
    MyData %>%
      filter(!is.na(the_variable)) %>%   # drop rows where the_variable is NA
      ggplot(aes(x = the_variable, fill = the_variable)) +
      geom_bar()                         # default stat = "count" suits a factor
    

    You should be able to remove the annoying NAs on your plot. Hope this works :)

  • 2020-11-30 10:08

    Try remove_missing() instead, with vars = "the_variable" (a character vector of column names). It is very important to set the vars argument; otherwise remove_missing() removes every row that contains an NA in any column! Setting na.rm = TRUE suppresses the warning message.

    library(ggplot2)

    ggplot(data = remove_missing(MyData, na.rm = TRUE, vars = "the_variable"),
           aes(x = the_variable, fill = the_variable)) +
      geom_bar()
    
  • 2020-11-30 10:16

    You can use subset() inside the ggplot() call. Try this:

    library(ggplot2)
    
    data("iris")
    iris$Sepal.Length[5:10] <- NA # create some NAs for this example
    
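    # Sepal.Length is continuous, so it is binned; geom_histogram() is
    # equivalent to geom_bar(stat = "bin")
    ggplot(data = subset(iris, !is.na(Sepal.Length)), aes(x = Sepal.Length)) +
      geom_histogram()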
    
  • 2020-11-30 10:16

    From my point of view, the error "Error: Aesthetics must either be length one, or the same length as the data" refers to the arguments of aes(x, y). I tried na.omit() and it worked just fine for me.
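
    The answer gives no code for na.omit(), so here is a minimal base R sketch of how it might be applied. The data frame below is a hypothetical stand-in, since the question's MyData is not shown. The point it illustrates is that na.omit() on the whole data frame drops rows with an NA in any column, while restricting it to the plotted column keeps more rows.

    ```r
    # Hypothetical stand-in for the question's MyData (the real data is not shown).
    MyData <- data.frame(
      the_variable = factor(c("a", "b", NA, "a", NA)),
      other        = c(1, NA, 3, 4, 5)
    )

    # na.omit() on the whole data frame drops rows with an NA in ANY column.
    all_complete <- na.omit(MyData)

    # Selecting the plotted column first drops only rows where that column is NA.
    plot_data <- na.omit(MyData["the_variable"])

    nrow(all_complete)  # 2
    nrow(plot_data)     # 3
    ```

    Either result can then be passed as the data argument of ggplot().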

  • 2020-11-30 10:18

    Additionally, adding na.rm = TRUE to your geom_bar() can work. It is an argument of the layer, not an aesthetic, so it goes in geom_bar() rather than inside aes().

    ggplot(data = MyData, aes(x = the_variable, fill = the_variable)) +
      geom_bar(na.rm = TRUE)
    

    I ran into this issue with a loop over a time series, and this fixed it. The missing data is removed and the results are otherwise unaffected.

  • 2020-11-30 10:26

    Just an update to the answer of @rafa.pereira. Since ggplot2 is part of tidyverse, it makes sense to use the convenient tidyverse functions to get rid of NAs.

    library(tidyverse)

    airquality %>%
      drop_na(Ozone) %>%        # keep only rows where Ozone is not NA
      ggplot(aes(x = Ozone)) +
      geom_histogram()          # equivalent to geom_bar(stat = "bin")
    

    Note that you can also call drop_na() without specifying columns; then every row with an NA in any column is removed.
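
    The difference between the two calls can be checked by counting rows. This is a small sketch that assumes the tidyr package is installed; airquality comes from R's built-in datasets package, and the variable names ozone_only and all_cols are only illustrative.

    ```r
    library(tidyr)

    # In airquality, only the Ozone and Solar.R columns contain NAs.
    ozone_only <- drop_na(airquality, Ozone)  # drops rows where Ozone is NA
    all_cols   <- drop_na(airquality)         # drops rows with an NA in any column

    nrow(airquality)  # 153
    nrow(ozone_only)  # 116
    nrow(all_cols)    # 111
    ```

    For a plot of Ozone alone, the column-specific call keeps the five rows that are only missing Solar.R.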
