
PySpark UDF optimization challenge using a dictionary of regexes (Scala?)

ぐ巨炮叔叔 submitted on 2021-02-18 17:09:50
Question: I am trying to optimize the code below (a PySpark UDF). It gives me the desired result (based on my data set), but it is too slow on very large datasets (approx. 180M). Its results (accuracy) are better than those of the available Python modules (e.g. geotext, hdx-python-country), so I'm not looking for another module. DataFrame: df = spark.createDataFrame([ ["3030 Whispering Pines Circle, Prosper Texas, US","John"], ["Kalverstraat Amsterdam","Mary"], ["Kalverstraat Amsterdam, Netherlands","Lex"] ]).toDF(
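Since the title asks whether Scala would help, the core of such a UDF can be sketched in Scala: compile the regex dictionary once, outside the per-row path, so each row only pays for matching. The map below (country codes and patterns) is purely illustrative, not the asker's actual dictionary:

```scala
import scala.util.matching.Regex

// Hypothetical regex dictionary, precompiled once rather than per row.
val countryPatterns: Map[String, Regex] = Map(
  "US" -> """(?i)\b(US|USA|Texas)\b""".r,
  "NL" -> """(?i)\b(Netherlands|Amsterdam)\b""".r
)

// Return the first dictionary entry whose pattern occurs in the address.
def matchCountry(address: String): Option[String] =
  countryPatterns.collectFirst {
    case (code, re) if re.findFirstIn(address).isDefined => code
  }
```

Wrapped in a Spark UDF (or used from a Dataset `map`), this avoids recompiling patterns for each of the ~180M records, which is usually the dominant cost of a naive regex-dictionary UDF.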

Scala play json combinators for validating equality

和自甴很熟 submitted on 2021-02-18 13:18:00
Question: I'm using Play 2.2.0 Reads for validating incoming requests in my application. I'm trying to implement a very simple thing with a JSON API. I have JSON like this: { "field1": "some value", "field2": "some another value" } I already have Reads that check for other things, like minimal length: case class SomeObject(field1: String, field2: String) implicit val someObjectReads = ( (__ \ "field1").read(minLength[String](3)) ~ (__ \ "field2").read(minLength[String](3)) )(SomeObject) I want to create a
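In Play JSON itself, an equality check is typically composed onto a field's Reads with `filter` (or, in newer versions, `Reads.verifying`). The combinator idea can be shown without the Play dependency; the names below (`Validated`, `minLength`, `equalTo`, `both`) are illustrative stand-ins, not Play API:

```scala
case class SomeObject(field1: String, field2: String)

// A validation either succeeds with the value or fails with an error key.
type Validated[A] = Either[String, A]

def minLength(n: Int)(s: String): Validated[String] =
  if (s.length >= n) Right(s) else Left(s"error.minLength($n)")

// The equality validator the asker wants to add.
def equalTo(expected: String)(s: String): Validated[String] =
  if (s == expected) Right(s) else Left(s"error.expected($expected)")

// Chain two validations on one field, in the spirit of Play's `keepAnd`.
def both[A](v1: A => Validated[A], v2: A => Validated[A])(a: A): Validated[A] =
  v1(a).flatMap(v2)
```

In real Play code the same shape would look roughly like `(__ \ "field1").read(minLength[String](3).filter(ValidationError("error.expected"))(_ == expected))`, but check the exact signature against the Play version in use.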

scala case class copy implementation

前提是你 submitted on 2021-02-18 11:40:14
Question: I can't find how copy is implemented for case classes in Scala. Can I check it somehow? I thought IntelliJ could point me to the implementation, but it doesn't want to jump there and I have no idea why :/ Answer 1: You can inspect the compiler output for a Scala case class using scalac -print ClassName.scala, as copy is actually a compiler-generated method. Here's an example: case class Test(s: String, i: Int) This is the output for copy after filtering out noise: case class Test extends Object with Product with
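The generated method is simpler than the `-print` output suggests: conceptually, `copy` is an ordinary method whose parameters default to the current field values. A hand-written sketch of roughly what the compiler synthesizes for `case class Test(s: String, i: Int)`:

```scala
// Plain class standing in for the case class; the synthesized `copy`
// defaults every parameter to the corresponding existing field.
class Test(val s: String, val i: Int) {
  def copy(s: String = this.s, i: Int = this.i): Test = new Test(s, i)
}

// Only the overridden field changes; the rest carry over.
val t = new Test("a", 1).copy(i = 2)
```

This is why `t.copy(i = 2)` keeps `s` intact: the unsupplied argument falls back to `this.s` via a default-argument method, not via any reflection.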

Zip elements with odd and even indices in a list

你离开我真会死。 submitted on 2021-02-18 11:08:41
Question: I want to zip the even- and odd-indexed elements of a list to make a list of pairs, like this: ["A", "B", "C", "D", "E", "F"] -> [("A", "B"), ("C", "D"), ("E", "F")] What is the most concise way to do this in an elegant, functional style? Answer 1: In 2.8, you'd probably use methods: scala> val a = "ABCDEF".toList.map(_.toString) a: List[java.lang.String] = List(A, B, C, D, E, F) scala> a.grouped(2).partialMap{ case List(a,b) => (a,b) }.toList res0: List[(java.lang.String, java.lang.String)] = List((A,B),
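Note that `partialMap` was a 2.8 pre-release name; on current Scala the same method is `collect`. The equivalent modern one-liner:

```scala
val xs = List("A", "B", "C", "D", "E", "F")

// group into chunks of 2, then keep only complete pairs
// (a trailing odd element would simply be dropped by `collect`)
val pairs: List[(String, String)] =
  xs.grouped(2).collect { case List(a, b) => (a, b) }.toList
```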

Why is the main function not running in the REPL?

我的梦境 submitted on 2021-02-18 10:43:53
Question: This is a simple program. I expected main to run in interpreted mode, but the presence of another object caused it to do nothing. If QSort were not present, the program would have executed. Why is main not called when I run this in the REPL? object MainObject{ def main(args: Array[String])={ val unsorted = List(8,3,1,0,4,6,4,6,5) print("hello" + unsorted toString) //val sorted = QSort(unsorted) //sorted foreach println } } //this must not be present object QSort{ def apply(array: List[Int
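In the REPL (and in script mode), an object definition is just a definition; nothing designates `main` as an entry point, so it must be invoked explicitly. A minimal sketch of the fix, with an explicit dot call to sidestep postfix-syntax surprises like `unsorted toString`:

```scala
object MainObject {
  def main(args: Array[String]): Unit = {
    val unsorted = List(8, 3, 1, 0, 4, 6, 4, 6, 5)
    // explicit `.toString` avoids the postfix-operator parse of `unsorted toString`
    println("hello " + unsorted.toString)
  }
}

// Defining a second object (like QSort) is harmless; neither object runs
// until main is called explicitly:
MainObject.main(Array.empty)
```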

Can I set a timeout and number of retries on a specific pipeline request?

为君一笑 submitted on 2021-02-18 10:34:41
Question: When using spray's pipelining to make an HTTP request like this: val urlpipeline = sendReceive ~> unmarshal[String] urlpipeline { Get(url) } is there a way to specify a timeout for the request, and the number of times it should be retried, for that specific request? All the documentation I've found only covers doing it in a config file (and even then I can't seem to get it to work). Thanks. Answer 1: With the configuration file I use Spray 1.2.0 in an Akka system. Inside my actor, I import the existing Akka
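For the configuration-file route, a hedged sketch of the relevant spray-can settings in application.conf is below; the exact keys and defaults should be checked against the reference.conf of the spray version in use, and note these apply per host connector, not per individual request:

```hocon
spray.can {
  client {
    # upper bound on how long spray waits for a response
    request-timeout = 20 s
  }
  host-connector {
    # how many times a failed idempotent request is retried
    max-retries = 3
  }
}
```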

Aliasing this in scala with self =>

|▌冷眼眸甩不掉的悲伤 submitted on 2021-02-18 10:27:05
Question: Some Scala APIs alias this to self; for example: trait Function1[-T1, +R] extends AnyRef { self => I know how this aliasing works in general, but I don't see how traits such as Function1 benefit from it. Function1 does not use self anywhere in its definition except for the initial mention, so what is its purpose here? Variants of this question have been asked previously, but the answers are not directly applicable. Answers have discussed self types and inner classes, but I don't see how that
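The alias pays off whenever a nested class or anonymous instance shadows `this`; in Function1's own body it may simply be a convention (or a leftover from an earlier revision). A minimal sketch of the mechanism, with an illustrative trait rather than Function1 itself:

```scala
trait Greeter { self =>
  def name: String = "outer"

  class Inner {
    def name: String = "inner"
    // inside Inner, `this.name` is Inner's; `self.name` reaches the
    // enclosing Greeter instance, which plain `this` cannot name here
    def both: (String, String) = (this.name, self.name)
  }
}
```

Without the alias, the inner class would have to write `Greeter.this.name`, so `self =>` is mostly a readability device that costs nothing when unused.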