apache-zeppelin

Field “features” does not exist. SparkML

…衆ロ難τιáo~ submitted on 2019-12-05 01:47:11
I am trying to build a model in Spark ML with Zeppelin. I am new to this area and would like some help. I think I need to set the correct datatypes for the columns and set the first column as the label. Any help would be greatly appreciated, thank you.

```scala
val training = sc.textFile("hdfs:///ford/fordTrain.csv")
val header = training.first
val inferSchema = true
val df = training.toDF

val lr = new LogisticRegression()
  .setMaxIter(10)
  .setRegParam(0.3)
  .setElasticNetParam(0.8)
val lrModel = lr.fit(df)

// Print the coefficients and intercept for multinomial logistic regression
println(s"Coefficients:
```
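The error in the title is what `LogisticRegression` raises when the input DataFrame has no vector column named `features`: converting the raw text file with `toDF` yields a single string column, not a label/features pair. A minimal sketch of the usual fix, assuming the CSV has a header, the label in its first column, and numeric features in the rest (the column handling here is an assumption, not taken from the question):

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.VectorAssembler

// Read the CSV with header and inferred types instead of sc.textFile.
val raw = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("hdfs:///ford/fordTrain.csv")

// Assumption: first column is the label, the rest are numeric features.
val df = new VectorAssembler()
  .setInputCols(raw.columns.tail)
  .setOutputCol("features") // the column name the estimator looks for
  .transform(raw.withColumnRenamed(raw.columns.head, "label"))

val lr = new LogisticRegression()
  .setMaxIter(10)
  .setRegParam(0.3)
  .setElasticNetParam(0.8)
val lrModel = lr.fit(df) // now finds both "label" and "features"
```

On older Spark 1.x setups, `sqlContext.read` replaces `spark.read`.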

Hello world in zeppelin failed

孤人 submitted on 2019-12-05 01:47:09
I just installed Apache Zeppelin (built from the latest source in the git repo) and successfully saw it up and running on port 10008. I created a new notebook with a single line of code:

```scala
val a = "Hello World!"
```

When I run this paragraph I see the error below:

```
java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java
```
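A "Connection refused" at this point usually means the Zeppelin server could not reach the interpreter process it launched for the paragraph, i.e. the interpreter JVM died on startup. A few places to look (paths are assumptions for a default source build):

```shell
# The interpreter log usually contains the real cause
# (bad SPARK_HOME, JAVA_HOME, port conflicts, ...).
tail -n 50 ~/zeppelin/logs/zeppelin-interpreter-spark-*.log

# Confirm the web server itself is listening on the configured port.
netstat -tln | grep 10008
```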

Is it possible to set global variables in a Zeppelin Notebook?

柔情痞子 submitted on 2019-12-04 22:30:05
I'm trying to create a multi-paragraph dashboard using a Zeppelin notebook. I'd like people using the dashboard to only have to enter certain parameters once. E.g. if I'm making a dashboard with information about different websites, the dashboard user should only have to select the particular website they want information about once, and the whole multi-paragraph dashboard will update. Is this possible? How do I set global variables like this in a notebook? To clarify, the parameter input that I
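One commonly suggested approach (a sketch, not the only option): render the input form in a single paragraph, store the chosen value with ZeppelinContext's `z.put`, and read it back with `z.get` in every other paragraph of the same interpreter group. The form name and options below are illustrative:

```scala
// Paragraph 1: ask for the parameter once and publish it.
val site = z.select("Website", Seq(("siteA", "Site A"), ("siteB", "Site B")))
z.put("site", site.toString)

// Any later paragraph: read the shared value instead of re-rendering a form.
val chosen = z.get("site").toString
println(s"Building dashboard panel for $chosen")
```

Note that `z.put`/`z.get` share values across paragraphs (and across interpreters in the same group), but downstream paragraphs still have to be re-run to pick up a new selection.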

apache zeppelin is started but there is a connection error at localhost:8080

非 Y 不嫁゛ submitted on 2019-12-04 20:29:51
After successfully building Apache Zeppelin on Ubuntu 14, I start Zeppelin and it says it started successfully, but when I go to localhost:8080 Firefox shows an "unable to connect" error, as if it didn't start. When I check the Zeppelin status from the terminal it says it is running. I just copied the config file templates, so the config files are the defaults.

Update: I changed the port to 8090; here is the config file, but no change in the result:

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>zeppelin.server.addr</name>
    <value>0.0.0.0</value>
```
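For reference, the port itself is controlled by `zeppelin.server.port` in `conf/zeppelin-site.xml`; only `zeppelin.server.addr` is visible in the excerpt above. A fragment of what the file would contain after moving to 8090 (a sketch based on the default template, not the poster's full file):

```xml
<property>
  <name>zeppelin.server.addr</name>
  <value>0.0.0.0</value>
</property>
<property>
  <name>zeppelin.server.port</name>
  <value>8090</value>
</property>
```

After editing, restart with `bin/zeppelin-daemon.sh restart` and check the files under `logs/` if the browser still cannot connect.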

apache zeppelin throwing NullPointerException error

柔情痞子 submitted on 2019-12-04 18:44:06
I am new to Zeppelin and trying to set it up on my system. So far I have done the following steps:

- Downloaded Zeppelin from here
- Set JAVA_HOME in my system environment variables
- Went to zeppelin-0.7.3-bin-all\bin and ran zeppelin.cmd
- Was able to see the Zeppelin UI at http://localhost:8090

When I try to run the "load data into table" program mentioned in the Zeppelin tutorial -> Basic Features (Spark), it throws the following error:

```
java.lang.NullPointerException
	at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
	at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
	at org
```
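In reports like this, the NullPointerException in `Utils.invokeMethod` typically means the Spark interpreter failed to create its SparkContext, and the trace shown in the notebook hides the real cause. Two things commonly checked (assumptions, not a confirmed diagnosis for this case; shown with Unix syntax, but the same files exist under a Windows install):

```shell
# 1. Zeppelin 0.7.x expects a Java 8 JAVA_HOME; newer JDKs are known
#    to break Spark interpreter startup in that version.
"$JAVA_HOME/bin/java" -version

# 2. The interpreter log holds the underlying exception:
tail -n 100 zeppelin-0.7.3-bin-all/logs/zeppelin-interpreter-spark-*.log
```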

How to use dependencies from S3 in Zeppelin?

不想你离开。 submitted on 2019-12-04 15:13:17
Is there a way to add jars that are in an S3 bucket as a dependency of Zeppelin? I tried z.load(s3n://...) and z.addRepo(some_name).url(s3n://...) but they don't seem to do the job.

You could download the jars from S3 and put them on the local FS. It can be done inside the %dep interpreter like this:

```scala
%dep
import com.amazonaws.services.s3.AmazonS3Client
import java.io.File
import java.nio.file.{Files, StandardCopyOption}

val dest = "/tmp/dependency.jar"
val s3 = new AmazonS3Client()
val stream = s3.getObject("bucketname", "path.jar").getObjectContent
Files.copy(stream, new File(dest).toPath,
```
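The truncated snippet can be completed along these lines (a sketch: it assumes the AWS SDK is already on the interpreter classpath and that credentials come from the environment or an instance role; bucket and key names are placeholders):

```scala
%dep
import com.amazonaws.services.s3.AmazonS3Client
import java.io.File
import java.nio.file.{Files, StandardCopyOption}

val dest = "/tmp/dependency.jar"
val s3 = new AmazonS3Client()
val stream = s3.getObject("bucketname", "path.jar").getObjectContent

// Copy the jar to the local filesystem, then load it as a dependency.
Files.copy(stream, new File(dest).toPath, StandardCopyOption.REPLACE_EXISTING)
z.load(dest)
```

Keep in mind that %dep paragraphs must run before the Spark interpreter starts, so restart the interpreter if it is already running.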

Moving Spark DataFrame from Python to Scala within Zeppelin

我怕爱的太早我们不能终老 submitted on 2019-12-04 13:00:23
I created a Spark DataFrame in a Python paragraph in Zeppelin:

```python
sqlCtx = SQLContext(sc)
spDf = sqlCtx.createDataFrame(df)
```

and df is a pandas dataframe:

```python
print(type(df))
# <class 'pandas.core.frame.DataFrame'>
```

What I want to do is move spDf from the Python paragraph to another Scala paragraph. A reasonable way to do it looks to be using z.put:

```python
z.put("spDf", spDf)
```

and I got this error:

```
AttributeError: 'DataFrame' object has no attribute '_get_object_id'
```

Any suggestion to fix the error? Or any
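`z.put` goes through Py4J and cannot serialize the PySpark DataFrame wrapper, which is where the `_get_object_id` error comes from. A workaround sketch that avoids `z.put` entirely: register the DataFrame as a temp table in the Python paragraph and read it back in the Scala one (the table name here is made up):

```python
%pyspark
spDf.registerTempTable("spDf_shared")
```

```scala
%spark
val spDf = sqlContext.table("spDf_shared")
spDf.show()
```

Another commonly cited trick is to put the underlying Java object instead, `z.put("spDf", spDf._jdf)`, and cast it on the Scala side with `z.get("spDf").asInstanceOf[org.apache.spark.sql.DataFrame]`.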

Zeppelin Dynamic Form Drop Down value in SQL

拥有回忆 submitted on 2019-12-04 12:44:22
I have a dropdown element in my Zeppelin notebook:

```scala
val instrument = z.select("Select Item", Seq(("A", "1"), ("B", "2"), ("C", "3")))
```

I want to use the value of this variable instrument in my SQL. For example, my next paragraph in the notebook contains:

```sql
%sql
select * from table_name where item='<<instrument selected above>>'
```

Is this possible? If yes, what would the syntax look like?

This is completely possible, and here is an example with both the %spark and %sql interpreters.

cell 1:

```scala
val df = Seq((1,2,"A"),(3,4,"B"),(3,2,"B")).toDF("x","y","item")
df.registerTempTable("table_name")
val instrument = z
```
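Besides reading the value through ZeppelinContext, Zeppelin's SQL paragraphs have their own dynamic-form template syntax, so the dropdown can live directly in the %sql cell (form name and options below are illustrative):

```sql
%sql
select * from table_name where item = "${item=A,A|B|C}"
```

Here `item` is the form's label, `A` its default, and the `|`-separated list the dropdown choices; Zeppelin substitutes the selected value into the query before running it.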

converting pandas dataframes to spark dataframe in zeppelin

元气小坏坏 submitted on 2019-12-04 08:23:01
I am new to Zeppelin. I have a use case wherein I have a pandas dataframe, and I need to visualize the collections using Zeppelin's built-in charts. I do not have a clear approach here. My understanding is that with Zeppelin we can visualize the data if it is in RDD format. So I wanted to convert the pandas dataframe into a Spark dataframe, then do some querying (using SQL), which I will visualize. To start with, I tried to convert the pandas dataframe to Spark's, but I failed:

```python
%pyspark
import pandas as pd
from
```
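A minimal sketch of the conversion, assuming a Zeppelin %pyspark paragraph where `sc` and `z` are already defined (the sample data is made up):

```python
%pyspark
import pandas as pd
from pyspark.sql import SQLContext

pdf = pd.DataFrame({"item": ["A", "B", "B"], "count": [1, 3, 2]})

sqlCtx = SQLContext(sc)
sdf = sqlCtx.createDataFrame(pdf)   # pandas -> Spark DataFrame
sdf.registerTempTable("pdf_table")  # now queryable from a %sql paragraph

z.show(sdf)  # renders the DataFrame with Zeppelin's built-in charts
```

On Spark 2.x the same thing can go through `spark.createDataFrame(pdf)` directly.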

Configure Zeppelin's Spark Interpreter on EMR when starting a cluster

风流意气都作罢 submitted on 2019-12-04 07:30:55
I am creating clusters on EMR and configuring Zeppelin to read the notebooks from S3. To do that I am using a JSON object that looks like this:

```json
[
  {
    "Classification": "zeppelin-env",
    "Properties": {},
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "ZEPPELIN_NOTEBOOK_STORAGE": "org.apache.zeppelin.notebook.repo.S3NotebookRepo",
          "ZEPPELIN_NOTEBOOK_S3_BUCKET": "hs-zeppelin-notebooks",
          "ZEPPELIN_NOTEBOOK_USER": "user"
        },
        "Configurations": []
      }
    ]
  }
]
```

I am pasting this object in the Software configuration page of EMR. My question is: how/where can I configure the Spark interpreter
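The zeppelin-env classification only sets environment variables; it does not touch interpreter settings. For Spark behavior, one option that works at cluster creation (a sketch; the properties shown are illustrative) is the spark-defaults classification, which the Spark interpreter on EMR picks up through SPARK_HOME:

```json
[
  {
    "Classification": "spark-defaults",
    "Properties": {
      "spark.executor.memory": "4g",
      "spark.dynamicAllocation.enabled": "true"
    }
  }
]
```

Settings that exist only in Zeppelin's own interpreter.json generally still require a bootstrap action or a manual edit after the cluster is up.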