ggplot2 log transformation for data and scales

旧时难觅i 2020-12-18 14:31

This is a follow-up to my previous question, "Integrating ggplot2 with user-defined stat_function()", which I answered myself yesterday. My current problem

1 answer
  • 2020-12-18 15:25

    I have finally figured out the issues and removed my previous answer; my latest solution is below. The one thing I haven't solved is the legend for the components: it doesn't appear for some reason. For an EDA meant to demonstrate the presence of a mixture distribution, I think this is good enough. The complete reproducible solution follows. Thanks to everybody on SO who helped with this, directly or indirectly.

    library(ggplot2)
    library(scales)
    library(RColorBrewer)
    library(mixtools)
    
    NUM_COMPONENTS <- 2
    
    set.seed(12345) # for reproducibility
    
    data(diamonds, package='ggplot2')  # use built-in data
    myData <- diamonds$price
    
    
    # Density of a single mixture component, scaled by its mixing weight (lambda)
    calc.components <- function(x, mix, comp.number) {
    
      mix$lambda[comp.number] *
        dnorm(x, mean = mix$mu[comp.number], sd = mix$sigma[comp.number])
    }
    
    
    overlayHistDensity <- function(data, calc.comp.fun) {
    
      # extract 'k' components from mixed distribution 'data'
      mix.info <- normalmixEM(data, k = NUM_COMPONENTS,
                              maxit = 100, epsilon = 0.01)
      summary(mix.info)
    
      numComponents <- length(mix.info$sigma)
      message("Extracted number of component distributions: ",
              numComponents)
    
      # brewer.pal() needs n >= 3; for n = 2 it warns and returns 3 colours,
      # hence suppressWarnings()
      DISTRIB_COLORS <- 
        suppressWarnings(brewer.pal(NUM_COMPONENTS, "Set1"))
    
      # create (plot) histogram and ...
      g <- ggplot(as.data.frame(data), aes(x = data)) +
        geom_histogram(aes(y = ..density..),
                       binwidth = 0.01, alpha = 0.5) +
        theme(legend.position = 'top', legend.direction = 'horizontal')
    
      comp.labels <- lapply(seq(numComponents),
                            function (i) paste("Component", i))
    
      # ... fitted densities of components
      distComps <- lapply(seq(numComponents), function (i)
        stat_function(fun = calc.comp.fun,
                      args = list(mix = mix.info, comp.number = i),
                      size = 2, color = DISTRIB_COLORS[i]))
    
      legend <- list(scale_colour_manual(name = "Legend:",
                                         values = DISTRIB_COLORS,
                                         labels = unlist(comp.labels)))
    
      return (g + distComps + legend)
    }
    
    # fit the mixture on log10(price); pass the function itself rather than its name
    overlayPlot <- overlayHistDensity(log10(myData), calc.components)
    print(overlayPlot)
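
As a quick sanity check on `calc.components` from the code above, the sketch below confirms that each weighted component curve has an area equal to its mixing weight, so the curves together integrate to 1 and sit on the same scale as a density histogram. The parameter list `mix` is hypothetical (made-up values in the same `lambda`/`mu`/`sigma` shape that `normalmixEM()` returns), and `grid`, `dx` and `areas` are names introduced only for this illustration. It uses base R only.

```r
# Hypothetical mixture parameters (NOT fitted values), shaped like the
# lambda / mu / sigma elements of a normalmixEM() result
mix <- list(lambda = c(0.4, 0.6),
            mu     = c(2.9, 3.7),
            sigma  = c(0.15, 0.30))

# Same helper as in the answer: one component's density times its weight
calc.components <- function(x, mix, comp.number) {
  mix$lambda[comp.number] *
    dnorm(x, mean = mix$mu[comp.number], sd = mix$sigma[comp.number])
}

# Fine grid covering essentially all of the probability mass
dx   <- 0.001
grid <- seq(0, 7, by = dx)

# Riemann-sum approximation of the area under each weighted curve
areas <- sapply(seq_along(mix$lambda), function(i)
  sum(calc.components(grid, mix, i)) * dx)

print(round(areas, 4))  # should be close to the weights 0.4 and 0.6
print(sum(areas))       # should be close to 1
```
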
    

    Result (the plot produced by print(overlayPlot)):

    [image not available]
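
On the missing legend: ggplot2 builds legends only for aesthetics mapped inside `aes()`. In the answer's code the colour is passed to `stat_function()` as a fixed parameter (`color = DISTRIB_COLORS[i]`), so `scale_colour_manual()` has nothing to key. The following is a minimal sketch of one possible workaround, not the author's method: it precomputes the component curves into a long data frame and draws them with `geom_line()` instead of `stat_function()`, mapping colour to a `component` column. The mixture parameters and the simulated data are hypothetical stand-ins for a real `normalmixEM()` fit, the names `curves`, `dens` and `comp.colors` are introduced here for illustration, and the code assumes ggplot2 >= 3.4.0 (for `after_stat()` and `linewidth`).

```r
library(ggplot2)  # assumption: ggplot2 >= 3.4.0

set.seed(12345)

# Hypothetical mixture parameters standing in for a normalmixEM() fit
mix <- list(lambda = c(0.4, 0.6), mu = c(2.9, 3.7), sigma = c(0.15, 0.30))

# Simulated data drawn from that mixture (illustration only)
comp <- sample(1:2, 2000, replace = TRUE, prob = mix$lambda)
x    <- rnorm(2000, mean = mix$mu[comp], sd = mix$sigma[comp])

# Same helper as in the answer
calc.components <- function(x, mix, comp.number) {
  mix$lambda[comp.number] *
    dnorm(x, mean = mix$mu[comp.number], sd = mix$sigma[comp.number])
}

# Precompute every weighted component curve in long format
grid   <- seq(min(x), max(x), length.out = 200)
curves <- do.call(rbind, lapply(seq_along(mix$lambda), function(i)
  data.frame(x = grid,
             dens = calc.components(grid, mix, i),
             component = paste("Component", i))))

# Named colours (the first two colours of RColorBrewer's Set1)
comp.colors <- c("Component 1" = "#E41A1C", "Component 2" = "#377EB8")

p <- ggplot(data.frame(x = x), aes(x = x)) +
  geom_histogram(aes(y = after_stat(density)),
                 binwidth = 0.05, alpha = 0.5) +
  # colour is mapped inside aes(), so the manual scale can build a legend
  geom_line(data = curves, aes(y = dens, colour = component),
            linewidth = 2) +
  scale_colour_manual(name = "Legend:", values = comp.colors) +
  theme(legend.position = "top", legend.direction = "horizontal")

print(p)
```

Keeping `stat_function()` and moving the colour into `aes(colour = ...)` with a constant label per layer is another commonly used option; the data-frame route is shown here because the colour mapping is explicit in the data.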
