scala

Iterate each row in a dataframe, store it in val and pass as parameter to Spark SQL query

Submitted by 浪尽此生 on 2021-02-07 08:44:35
Question: I am trying to fetch rows from a lookup table (3 rows and 3 columns), iterate over it row by row, and pass the values in each row to a Spark SQL query as parameters.

    DB | TBL   | COL
    ----------------
    db | txn   | ID
    db | sales | ID
    db | fee   | ID

I tried this in the spark-shell for one row and it worked, but I am finding it difficult to iterate over the rows.

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val db_name: String = "db"
    val tbl_name: String = "transaction"
    val unique_col: String = "transaction_number"
    val
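One way to do this, sketched under the assumption that the 3x3 lookup table is small enough to collect to the driver: pull it down and fire one query per row. The registered table name "lookup" and the COUNT(DISTINCT ...) query below are placeholders for illustration, not from the question.

    // Assumption: the 3x3 lookup table is registered as "lookup".
    val lookup = sqlContext.table("lookup")

    lookup.collect().foreach { row =>
      val db  = row.getAs[String]("DB")
      val tbl = row.getAs[String]("TBL")
      val col = row.getAs[String]("COL")

      // Interpolate the row's values into the SQL text as parameters.
      val result = sqlContext.sql(s"SELECT COUNT(DISTINCT $col) FROM $db.$tbl")
      result.show()
    }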

Scala - parameter of type T or => T

Submitted by 拜拜、爱过 on 2021-02-07 08:16:20
Question: Is there any difference between the following

    def foo(s: String) = { ... }

and

    def foo(s: => String) { ... }

Both definitions accept "sss" as a parameter.

Answer 1: A String argument is a by-value parameter; => String is a by-name parameter. In the first case the string itself is passed in; in the second, a so-called thunk is passed, which evaluates to a String whenever it is used.

    def stringGen: String = util.Random.nextInt().toString
    def byValue(s: String) = println("We have a '" + s + "' and a '" + s + "'")
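Extending the answer's snippet with a by-name counterpart makes the difference observable; the byName definition and the two calls below are an addition for illustration, not part of the original excerpt.

    def stringGen: String = util.Random.nextInt().toString

    // Restating the answer's by-value version, plus a by-name counterpart:
    def byValue(s: String) = println("We have a '" + s + "' and a '" + s + "'")
    def byName(s: => String) = println("We have a '" + s + "' and a '" + s + "'")

    byValue(stringGen)  // stringGen ran once before the call: both uses of s print the same number
    byName(stringGen)   // the thunk runs again at each use of s: two different numbers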

Mocking SparkSession for unit testing

Submitted by 本小妞迷上赌 on 2021-02-07 08:15:24
Question: I have a method in my Spark application that loads data from a MySQL database. The method looks something like this:

    trait DataManager {
      val session: SparkSession
      def loadFromDatabase(input: Input): DataFrame = {
        session.read.jdbc(input.jdbcUrl, s"(${input.selectQuery}) T0",
          input.columnName, 0L, input.maxId, input.parallelism,
          input.connectionProperties)
      }
    }

The method does nothing other than execute the jdbc method to load data from the database. How can I test this method? The
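One possible approach, since DataManager exposes session as an abstract val: mix in a Mockito mock and stub the session.read.jdbc chain. A minimal sketch under the assumption that Input is a plain, mockable class with the fields used above; all test values here are made up for illustration.

    import java.util.Properties
    import org.apache.spark.sql.{DataFrame, DataFrameReader, SparkSession}
    import org.mockito.ArgumentMatchers.{any, anyInt, anyLong, anyString}
    import org.mockito.Mockito.{mock, when}

    val mockSession = mock(classOf[SparkSession])
    val mockReader  = mock(classOf[DataFrameReader])
    val mockFrame   = mock(classOf[DataFrame])

    // Stub the chain so session.read.jdbc(...) yields a canned frame.
    when(mockSession.read).thenReturn(mockReader)
    when(mockReader.jdbc(anyString(), anyString(), anyString(),
                         anyLong(), anyLong(), anyInt(), any(classOf[Properties])))
      .thenReturn(mockFrame)

    // Hypothetical Input stub; field names follow the snippet above.
    val input = mock(classOf[Input])
    when(input.jdbcUrl).thenReturn("jdbc:mysql://localhost/db")
    when(input.selectQuery).thenReturn("select * from txn")
    when(input.columnName).thenReturn("ID")
    when(input.maxId).thenReturn(100L)
    when(input.parallelism).thenReturn(4)
    when(input.connectionProperties).thenReturn(new Properties())

    // Mix the mocked session into the trait and check the wiring.
    val manager = new DataManager { val session: SparkSession = mockSession }
    assert(manager.loadFromDatabase(input) eq mockFrame)

This only verifies the plumbing around the jdbc call; an integration test against an in-memory database would be needed to exercise the actual load.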

How are nested functions and lexical scope compiled in JVM languages?

Submitted by 梦想与她 on 2021-02-07 08:12:28
Question: As a concrete example for my question, here's a snippet in Python (which should be readable to the broadest number of people and which has a JVM implementation anyway):

    def memo(f):
        cache = {}
        def g(*args):
            if args not in cache:
                cache[args] = f(*args)
            return cache[args]
        return g

How do industrial-strength languages compile a definition like this in order to realize static scope? What if we only have nested definitions but no higher-order function-value parameters or return values, à la Pascal
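For a JVM-language point of comparison, here is the same memoizer in Scala. The captured cache does not live on memo's stack frame: the compiler makes the returned function object hold a reference to it (via an anonymous class, or an invokedynamic lambda since Scala 2.12), which is how the binding outlives the enclosing call. A minimal sketch:

    import scala.collection.mutable

    // g closes over cache; the generated function object keeps the captured
    // reference as a field, so the map survives after memo returns.
    def memo[A, B](f: A => B): A => B = {
      val cache = mutable.Map.empty[A, B]
      def g(a: A): B = cache.getOrElseUpdate(a, f(a))
      g
    }

    val slowSquare: Int => Int = n => { Thread.sleep(100); n * n }
    val fastSquare = memo(slowSquare)
    fastSquare(12)  // computed, ~100 ms
    fastSquare(12)  // answered from the captured cache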

How to solve “Can't assign requested address: Service 'sparkDriver' failed after 16 retries” when running spark code?

Submitted by 半世苍凉 on 2021-02-07 07:49:50
Question: I am learning Spark + Scala with IntelliJ and started with the small piece of code below:

    import org.apache.spark.{SparkConf, SparkContext}

    object ActionsTransformations {
      def main(args: Array[String]): Unit = {
        // Create a SparkContext to initialize Spark
        val conf = new SparkConf()
        conf.setMaster("local")
        conf.setAppName("Word Count")
        val sc = new SparkContext(conf)
        val numbersList = sc.parallelize(1.to(10000).toList)
        println(numbersList)
      }
    }

When trying to run it, I get the exception below: Exception in
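The question is cut off before the full stack trace, but this particular error usually means the driver cannot bind to the address the machine's hostname resolves to. One common fix (an assumption about this specific environment) is to pin the driver to the loopback address explicitly:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setMaster("local[*]")
      .setAppName("Word Count")
      // Pin the driver to loopback so Spark does not depend on the
      // machine's hostname resolving to a bindable address.
      .set("spark.driver.bindAddress", "127.0.0.1")
      .set("spark.driver.host", "127.0.0.1")

    val sc = new SparkContext(conf)

Setting SPARK_LOCAL_IP=127.0.0.1 in the run configuration's environment variables is an equivalent fix.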

Curried function in scala

Submitted by 泄露秘密 on 2021-02-07 07:14:20
Question: I have the following method definitions:

    def add1(x: Int, y: Int) = x + y
    def add2(x: Int)(y: Int) = x + y

The second one is a curried version of the first. If I want to partially apply the second method I have to write val res2 = add2(2) _, and everything is fine. Next I want add1 to be curried, so I write val curriedAdd = (add1 _).curried. Am I right that curriedAdd is similar to add2? But when I try to partially apply curriedAdd in the same way, val resCurried = curriedAdd(4) _, I get a
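The question is truncated before the error text, but presumably the compiler rejects the trailing underscore. A sketch of the distinction, hedged since the original error message is missing: add2 is a method, so partial application needs eta-expansion via _, while curriedAdd is already a function value, so applying one argument simply returns the next function.

    def add1(x: Int, y: Int) = x + y
    def add2(x: Int)(y: Int) = x + y

    // add2 is a method: the trailing _ asks the compiler to eta-expand
    // the partially applied method into a function value.
    val res2: Int => Int = add2(2) _

    // (add1 _).curried is already a function value of type Int => Int => Int,
    // so one application directly yields the next function; no _ is needed
    // (and none is allowed, since _ only follows methods).
    val curriedAdd: Int => Int => Int = (add1 _).curried
    val resCurried: Int => Int = curriedAdd(4)

    res2(3)       // 5
    resCurried(5) // 9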