scala

Spark Streaming - java.lang.NoSuchMethodError Error

我的梦境 submitted on 2021-01-28 20:00:30
Question: I am trying to access streaming tweets with Spark Streaming. This is the software configuration: Ubuntu 14.04.2 LTS; scala -version reports "Scala code runner version 2.11.7 -- Copyright 2002-2013, LAMP/EPFL"; spark-submit --version reports "Spark version 1.6.0". Following is the code: object PrintTweets { def main(args: Array[String]) { // Configure Twitter credentials using twitter.txt setupTwitter() // Set up a Spark streaming context named "PrintTweets" that runs locally using // all CPU cores and one …
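
A NoSuchMethodError in this setup commonly points at a binary mismatch (for example, Spark 1.6 artifacts built against a different Scala version than the local 2.11.7), though the excerpt does not show the full stack trace. For reference, here is a minimal sketch of the kind of program the question describes, assuming the spark-streaming-twitter artifact is on the classpath and that setupTwitter() loads twitter4j OAuth properties from twitter.txt (both assumptions, since the full source is cut off above):

```scala
import scala.io.Source

import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter.TwitterUtils

object PrintTweets {

  // Assumption: twitter.txt holds lines of the form "consumerKey <value>",
  // exposed to twitter4j as system properties.
  def setupTwitter(): Unit = {
    for {
      line <- Source.fromFile("twitter.txt").getLines
      fields = line.split(" ") if fields.length == 2
    } System.setProperty("twitter4j.oauth." + fields(0), fields(1))
  }

  def main(args: Array[String]): Unit = {
    setupTwitter()

    // Local streaming context named "PrintTweets" using all CPU cores
    // and a one-second batch interval.
    val ssc = new StreamingContext("local[*]", "PrintTweets", Seconds(1))

    // Print the text of tweets from the public sample stream.
    TwitterUtils.createStream(ssc, None).map(_.getText).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

If the error persists, checking that every Spark and twitter4j dependency targets the same Scala binary version as the local compiler is a reasonable first step.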

How do you create a table in Cassandra using phantom for Scala?

允我心安 submitted on 2021-01-28 18:11:12
Question: I am trying to run the example at https://github.com/websudos/phantom/blob/develop/phantom-example/src/main/scala/com/websudos/phantom/example/basics/SimpleRecipes.scala, so I created a Recipe and tried to insert it using insertNewRecord(myRecipe), and got the following exception: ....InvalidQueryException: unconfigured columnfamily my_custom_table. I checked using cqlsh: the keyspace was created but the table was not. So my question is, how do I create the table using phantom? This is …
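
The excerpt ends before the asker's schema code, but with phantom the table has to be created explicitly before the first insert. A minimal sketch follows, assuming the example's Recipes table object is in scope with its connector's implicit session and keyspace, and that the phantom version in use exposes create.ifNotExists() on table objects (all of these are assumptions, not confirmed by the excerpt):

```scala
import scala.concurrent.Await
import scala.concurrent.duration._

// Sketch only: Recipes is assumed to be the table object from the linked
// example (a CassandraTable mixed in with the example's connector). The call
// below issues "CREATE TABLE IF NOT EXISTS ..." against the keyspace so that
// the subsequent insertNewRecord call can succeed.
object CreateRecipesSchema {
  def createTable(): Unit =
    Await.result(Recipes.create.ifNotExists().future(), 10.seconds)
}
```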

Spark: Dataframe action really slow when upgraded from 2.1.0 to 2.2.1

人走茶凉 submitted on 2021-01-28 17:56:20
Question: I just upgraded from Spark 2.1.0 to Spark 2.2.1. Has anyone seen extremely slow behavior on dataframe.filter(…).collect(), specifically a collect operation preceded by a filter? dataframe.collect seems to run okay, but dataframe.filter(…).collect() takes forever, even though the dataframe contains only 2 records and this is in a unit test. When I go back to Spark 2.1.0, it is back to normal speed. I have looked at the thread dump and could not find an obvious cause. I have made an effort to make sure all the libraries I …
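
For context, a minimal local-mode sketch of the pattern being timed (hypothetical data and column names, since the unit test itself is not shown in the excerpt):

```scala
import org.apache.spark.sql.SparkSession

// Reproduces the two calls compared in the question: a plain collect and
// a filter followed by collect, on a two-row dataframe.
object FilterCollectRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("filter-collect-repro")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(("a", 1), ("b", 2)).toDF("key", "value")

    // Reported as fine on both versions.
    df.collect()

    // Reported as very slow after the 2.2.1 upgrade.
    df.filter($"value" > 0).collect()

    spark.stop()
  }
}
```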

java.lang.NumberFormatException: For input string: “0.000” [duplicate]

主宰稳场 submitted on 2021-01-28 14:38:29
Question: This question already has answers here: What is a NumberFormatException and how can I fix it? (9 answers). Closed 1 year ago. I am trying to create a UDF that takes two strings as parameters; one in DD-MM-YYYY format (e.g. "14-10-2019") and the other in float format (e.g. "0.000"). I want to convert the float-like string to an int, add it to the date, and return the resulting date as a string. def getEndDate = udf{ (startDate: String, no_of_days: String) => val num_days = …
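
The exception itself comes from parsing: "0.000".toInt throws NumberFormatException because the string is not an integer literal. Below is a sketch of one way the UDF could be written, parsing the float-like string as a Double first and then truncating; the names follow the question, but the body is an assumption since the original UDF is cut off in the excerpt:

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

import org.apache.spark.sql.functions.udf

object EndDateUdf {
  val getEndDate = udf { (startDate: String, no_of_days: String) =>
    val fmt = DateTimeFormatter.ofPattern("dd-MM-yyyy")
    // "0.000".toInt throws; going through Double and truncating does not.
    val numDays = no_of_days.toDouble.toInt
    // Parse "14-10-2019", shift by numDays, and format back to dd-MM-yyyy.
    LocalDate.parse(startDate, fmt).plusDays(numDays).format(fmt)
  }
}
```

A call such as df.withColumn("end_date", EndDateUdf.getEndDate(col("start_date"), col("no_of_days"))) would then yield the shifted date in the same dd-MM-yyyy format (the column names here are hypothetical).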

Transforming an iterator into an iterator of chunks of duplicates

耗尽温柔 submitted on 2021-01-28 13:47:03
Question: Suppose I am writing a function foo: Iterator[A] => Iterator[List[A]] to transform a given iterator into an iterator of chunks of duplicates: def foo[A](it: Iterator[A]): Iterator[List[A]] = ??? foo("abbbcbbe".iterator).toList.map(_.mkString) // List("a", "bbb", "c", "bb", "e") In order to implement foo I want to reuse the function splitDupes: Iterator[A] => (List[A], Iterator[A]) that splits an iterator into a prefix of duplicates and the rest (thanks a lot to Kolmar, who suggested it here) …
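
A sketch of one way foo can be layered on top of splitDupes; the splitDupes body below is an assumption with the quoted signature, since the linked implementation is not part of the excerpt:

```scala
// Splits off the leading run of equal elements and returns it together with
// the remaining iterator.
def splitDupes[A](it: Iterator[A]): (List[A], Iterator[A]) = {
  if (!it.hasNext) (Nil, Iterator.empty)
  else {
    val head = it.next()
    // Iterator#span requires the leading iterator to be consumed before the
    // trailing one is used; toList below does exactly that.
    val (dupes, rest) = it.span(_ == head)
    (head :: dupes.toList, rest)
  }
}

// Repeatedly peel off one chunk of duplicates until the source is exhausted.
def foo[A](it: Iterator[A]): Iterator[List[A]] =
  new Iterator[List[A]] {
    private var rest: Iterator[A] = it
    def hasNext: Boolean = rest.hasNext
    def next(): List[A] = {
      val (chunk, tail) = splitDupes(rest)
      rest = tail
      chunk
    }
  }

// foo("abbbcbbe".iterator).toList.map(_.mkString)
// => List("a", "bbb", "c", "bb", "e")
```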

Flink Table API & SQL and map types (Scala)

怎甘沉沦 submitted on 2021-01-28 12:42:23
Question: I am using Flink's Table API and/or Flink's SQL support (Flink 1.3.1, Scala 2.11) in a streaming environment. I'm starting with a DataStream[Person], and Person is a case class that looks like: Person(name: String, age: Int, attributes: Map[String, String]). All is working as expected until I start to bring attributes into the picture. For example: val result = streamTableEnvironment.sql( """ |SELECT |name, |attributes['foo'], |TUMBLE_START(rowtime, INTERVAL '1' MINUTE) |FROM myTable |GROUP …
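
For reference, a minimal sketch of the setup the question describes, using Flink 1.3-style APIs; the field list passed to registerDataStream and the sample data are assumptions, and the windowed/rowtime part of the original query is omitted because it is cut off in the excerpt:

```scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.table.api.TableEnvironment
import org.apache.flink.table.api.scala._

// Case class from the question.
case class Person(name: String, age: Int, attributes: Map[String, String])

object PersonTableSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val streamTableEnvironment = TableEnvironment.getTableEnvironment(env)

    val people: DataStream[Person] = env.fromElements(
      Person("alice", 30, Map("foo" -> "bar")),
      Person("bob", 25, Map("foo" -> "baz")))

    // Register the stream as "myTable"; the rowtime attribute used by
    // TUMBLE_START in the original query is left out of this sketch.
    streamTableEnvironment.registerDataStream("myTable", people, 'name, 'age, 'attributes)

    // Same bracket syntax as in the question for reading a map-typed column.
    val result = streamTableEnvironment.sql(
      "SELECT name, attributes['foo'] FROM myTable")

    // Inspect the derived schema; converting back to a DataStream and
    // attaching a sink is omitted here.
    result.printSchema()
  }
}
```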

sbt/sbt : no such file or directory error

爱⌒轻易说出口 submitted on 2021-01-28 12:42:12
Question: I'm trying to install Spark on my Ubuntu machine. I have installed sbt and Scala and am able to view their versions. But when I try to build Spark using the 'sbt/sbt assembly' command, I get the error 'bash: sbt/sbt: No such file or directory'. Can you please let me know where I am making a mistake? I have been stuck here since yesterday. Thank you for the help in advance. Answer 1: You may have downloaded the pre-built version of Spark. If it is a pre-built distribution, you don't need to run the build tool …