Using plotly with Zeppelin in Scala

≯℡__Kan透↙ submitted on 2020-02-04 02:14:05

Question


I want to display my results as a histogram in Zeppelin, and I came across plotly. My code is in Scala, and I would like to know the steps to incorporate plotly into Zeppelin using Scala. Alternatively, are there better libraries for drawing a histogram in Zeppelin with Scala?


Answer 1:


If you have a DataFrame called plotTemp with columns "id" and "degree", you can do the following:

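  1. In a Scala paragraph, register the DataFrame as a temporary table (on Spark 2.0 and later, registerTempTable is deprecated in favor of createOrReplaceTempView):

    plotTemp.registerTempTable("plotTemp")

  2. Then switch to the SQL interpreter in a new paragraph: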

    %sql
    select degree, count(1) nInBin
    from plotTemp
    group by degree
    order by degree
    

You can then click the bar chart icon, and you should see the histogram you are looking for.

[Image: example of a distribution plot done in Zeppelin]
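
For reference, the aggregation performed by the SQL query above can be sketched in plain Scala. This is a minimal sketch and not part of the original answer: DegreeHistogram and its methods are hypothetical names, and it assumes the degree values have already been collected into a local Seq[Int]. The output uses Zeppelin's %table display syntax (tab-separated columns, one row per line), which Zeppelin renders with the same chart options as a %sql result.

```scala
// Hypothetical helper for illustration; not part of Zeppelin or Spark.
object DegreeHistogram {

  /** Counts how often each distinct value occurs, sorted ascending by value.
    * This mirrors "select degree, count(1) ... group by degree order by degree". */
  def countPerValue(degrees: Seq[Int]): Seq[(Int, Int)] =
    degrees
      .groupBy(identity)
      .map { case (degree, occurrences) => (degree, occurrences.size) }
      .toSeq
      .sortBy(_._1)

  /** Formats the counts as Zeppelin %table output:
    * a header line, then one tab-separated row per bin. */
  def toZeppelinTable(counts: Seq[(Int, Int)]): String = {
    val header = "%table degree\tnInBin"
    val rows = counts.map { case (degree, n) => s"$degree\t$n" }
    (header +: rows).mkString("\n")
  }
}
```

In a Scala paragraph, println(DegreeHistogram.toZeppelinTable(DegreeHistogram.countPerValue(degrees))) would print the table, where degrees is a hypothetical local collection. For large data the %sql approach is preferable, since the grouping then runs in Spark rather than on the driver.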




Answer 2:


After trying basically every available solution, I eventually settled on vegas-viz. On the project's GitHub page, the authors describe it as "The Missing MatPlotLib for Scala + Spark". That sounds a little exaggerated to me at the moment, but the library does its job and does it well.

This is the procedure I suggest for drawing a bar chart (which is basically what you need for a histogram) in Zeppelin's Spark interpreter:

  1. Import the dependencies (please check the vegas Maven repository for the latest versions):

    %dep  
    z.load("org.vegas-viz:vegas_2.11:0.3.11")
    z.load("org.vegas-viz:vegas-spark_2.11:0.3.11")
    

Note that vegas-spark is needed only if you want to draw directly from a DataFrame (see below).

  2. Import the packages:

    import vegas._
    import vegas.render.WindowRenderer._

  3. Draw the chart:

    val plot = Vegas("Sample Column Chart")
      .withData(
        Seq(
          Map("country" -> "USA", "population" -> 314),
          Map("country" -> "UK", "population" -> 64),
          Map("country" -> "DK", "population" -> 80)
        )
      )
      .encodeX("country", Nom)
      .encodeY("population", Quant)
      .mark(Bar)
    plot.show
    

The result should be a bar chart showing the population of each country.

  4. You can even draw a chart directly from a DataFrame if you have added vegas-spark to the dependencies (see the first step), but you also need an extra import for this to work:

    import vegas.sparkExt._
    
    val df = Seq(
      ("USA", 314),
      ("UK", 64),
      ("DK", 80)
    ).toDF("country", "population")
    
    val plot = Vegas("Sample Column Chart", width=600, height=320)
      .withDataFrame(df)
      .encodeX("country", Nom)
      .encodeY("population", Quant)
      .mark(Bar)
    plot.show
    

The result should be the same as above.
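
The examples above plot one bar per category. For a histogram of a continuous column, the values first have to be grouped into bins. The following is a minimal sketch in plain Scala under stated assumptions: FixedWidthBins, binStart and count are hypothetical names, the bin width is chosen by the caller, and the values are assumed to fit in driver memory. It produces rows in the same Seq-of-Map shape that withData receives in the bar chart example above.

```scala
// Hypothetical helper for illustration; not part of vegas-viz.
object FixedWidthBins {

  /** Assigns each value to the bin [k * width, (k + 1) * width) and counts
    * the values per bin. Returns one row per non-empty bin, sorted by the
    * bin's lower edge, with the keys "binStart" and "count". */
  def histogram(values: Seq[Double], width: Double): Seq[Map[String, Any]] = {
    require(width > 0, "bin width must be positive")
    values
      .groupBy(v => math.floor(v / width) * width) // lower edge of the bin
      .toSeq
      .sortBy(_._1)
      .map { case (start, inBin) =>
        Map[String, Any]("binStart" -> start, "count" -> inBin.size)
      }
  }
}
```

The resulting rows could then be passed to .withData(...) and encoded with encodeX("binStart", ...) and encodeY("count", Quant), in the same way as the country and population rows. Empty bins are omitted in this sketch, so gaps in the data do not appear as zero-height bars.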




Answer 3:


I just released spark-highcharts. With the following code, you can create a histogram:

    import com.knockdata.spark.highcharts._
    import com.knockdata.spark.highcharts.model._

    // bank is assumed to be a DataFrame with an "age" column
    highcharts(bank
        .series("x" -> "age", "y" -> count("*"))
        .orderBy(col("age"))
      )
      .chart(Chart.column)
      .plotOptions(new plotOptions.Column().groupPadding(0).pointPadding(0).borderWidth(0))
      .plot()



Source: https://stackoverflow.com/questions/38323164/using-plotly-with-zeppellin-in-scala
