scala

What is execution context in Scala?

淺唱寂寞╮ submitted on 2020-12-01 09:23:47
Question: I am new to Scala and was trying to use some parallel constructs (Future in particular). I found there is an implicit parameter of type ExecutionContext. In my opinion, it is something similar to (and perhaps more abstract than) the concept of a thread pool. I have tried to learn about it through the documentation, but I cannot find any clear and detailed introduction. Could anyone please explain what exactly an execution context is in Scala? And what is the purpose of introducing execution context to the
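In short, an ExecutionContext decides *where* a Future's body and callbacks run; it is usually backed by a thread pool. A minimal sketch using the standard library's global context (object and value names here are illustrative):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
// The global context is backed by a ForkJoinPool sized to the number of
// available processors; it schedules the Future's body and its callbacks.
import scala.concurrent.ExecutionContext.Implicits.global

object ExecutionContextDemo {
  def main(args: Array[String]): Unit = {
    // Both Future(...) and .map pick up the implicit ExecutionContext in scope
    val f: Future[Int] = Future(21).map(_ * 2)
    println(Await.result(f, 5.seconds)) // prints 42
  }
}
```

Passing the context implicitly is what lets library code stay agnostic about which pool the caller wants its asynchronous work to run on.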

Why do `Left` and `Right` have two type parameters?

落爺英雄遲暮 submitted on 2020-12-01 03:23:57
Question: I understand it would be difficult to change now without breaking existing code, but I'm wondering why it was done that way in the first place. Why not just:

```scala
sealed trait Either[+A, +B]
case class Left[A](x: A) extends Either[A, Nothing]
case class Right[B](x: B) extends Either[Nothing, B]
```

Is there some drawback here that I'm failing to see...? Answer 1: Not sure how relevant this answer really is to Scala, but it certainly is in Haskell, which is evidently where Scala's Either was borrowed from
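The difference can be seen side by side. A sketch of the question's proposed encoding next to the real one; the names `Or`, `Fst`, `Snd` are made up here only to avoid clashing with `scala.Either` in the standard library:

```scala
object EitherDemo {
  // The question's single-parameter encoding, under assumed names
  sealed trait Or[+A, +B]
  case class Fst[A](x: A) extends Or[A, Nothing]
  case class Snd[B](x: B) extends Or[Nothing, B]

  def main(args: Array[String]): Unit = {
    // Standard library: Left carries both parameters (Left[String, Int]),
    // so the "other" side's type is fixed at construction.
    val e: Either[String, Int] = Left("boom")

    // Single-parameter encoding: Fst("boom") is an Or[String, Nothing],
    // and only widens to Or[String, Int] via covariance at the ascription.
    val o: Or[String, Int] = Fst("boom")

    println(e.isLeft) // prints true
  }
}
```

Both compile in Scala thanks to covariance and `Nothing` as a bottom type; Haskell, which Either came from, has neither feature, which is one reason the two-parameter form was chosen.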


Spark RDDs, and Converting a DataSet or DataFrame to an RDD

你说的曾经没有我的故事 submitted on 2020-12-01 01:46:42
1. What is an RDD

RDD is short for resilient distributed dataset: a fault-tolerant collection of elements that can be operated on in parallel. What does "operated on in parallel" mean? Take an array of 4 elements, 1, 2, 3, 4. To double every element, a typical Java implementation iterates over the array and multiplies each element by 2, so the elements are processed one after another. In Spark, the array can instead be converted into a distributed RDD, and all of its elements are operated on at the same time.

2. Creating an RDD

Spark provides two ways to create an RDD. First run the spark-shell command to open a Scala terminal. (We use the Spark bundled with HDP; you can also install Apache Spark yourself.)

1) Parallelizing an existing collection

For example, convert an array into an RDD:

```scala
val data = Array(1, 2, 3, 4, 5)
val distData = sc.parallelize(data)
```

Run the commands above in the shell window. The parallelize function takes a second, optional parameter that sets the RDD's number of partitions, for example:

```scala
scala> val distDataP = sc.parallelize(data, 3)
```
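Outside the shell there is no predefined `sc`, so a standalone program has to build its own SparkContext. A minimal sketch under assumed settings (the `local[*]` master, the app name, and the partition count are choices made up for this example):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ParallelizeDemo {
  def main(args: Array[String]): Unit = {
    // local[*] runs Spark in-process, one worker thread per core
    val conf = new SparkConf().setAppName("ParallelizeDemo").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val data = Array(1, 2, 3, 4, 5)
    // The second argument is the number of partitions to split the data into
    val distData = sc.parallelize(data, 3)

    // Double every element in parallel, then collect the results to the driver
    println(distData.map(_ * 2).collect().mkString(", "))
    sc.stop()
  }
}
```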

How to display a KeyValueGroupedDataset in Spark?

痞子三分冷 submitted on 2020-11-30 06:46:30
Question: I am trying to learn datasets in Spark. One thing I can't figure out is how to display a KeyValueGroupedDataset, as show doesn't work for it. Also, what is the equivalent of map for a KeyValueGroupedDataset? I would appreciate it if someone could give some examples. Answer 1: OK, I got the idea from the examples given here and here. Below is a simple example that I've written.

```scala
val x = Seq(("a", 36), ("b", 33), ("c", 40), ("a", 38), ("c", 39)).toDS
x: org.apache.spark.sql.Dataset[(String, Int)] = [_1:
```
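A sketch of the usual workaround, assuming a local SparkSession (the app name and master are made up for this example): a KeyValueGroupedDataset has no `show`, so you turn it back into an ordinary Dataset with `mapGroups` (the per-group counterpart of `map`) and show that instead:

```scala
import org.apache.spark.sql.SparkSession

object KVGroupedDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("KVGroupedDemo").master("local[*]").getOrCreate()
    import spark.implicits._

    val x = Seq(("a", 36), ("b", 33), ("c", 40), ("a", 38), ("c", 39)).toDS()

    // groupByKey yields a KeyValueGroupedDataset[String, (String, Int)]
    val grouped = x.groupByKey(_._1)

    // mapGroups collapses each group into one row of a regular Dataset,
    // which does support show
    grouped.mapGroups((key, rows) => (key, rows.map(_._2).sum)).show()
    spark.stop()
  }
}
```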

Windows Eclipse Scala: Writing a WordCount Program

旧时模样 submitted on 2020-11-30 03:01:00
1) There is no need to start Hadoop, because we are using a local file. As before, create an ordinary Scala project and a Scala Object, but make sure the Scala version is 2.10.6, because the default one does not work; change it by right-clicking the project / Properties / Scala Compiler.

2) Import the jars exactly as in the Java version of the Spark WordCount project (imported the same way as in an ordinary Java project).

Example 5.1:

```scala
package com

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object WordCount {
  def main(args: Array[String]) {
    val conf = new SparkConf()
    conf.setAppName("First Spark scala App!")
    conf.setMaster("local")
    val sc = new SparkContext(conf)
    val lines = sc.textFile("E://temp//input//friend.txt", 1)
    // The source is cut off after the next line; the remaining standard
    // WordCount steps are filled in below to make the example runnable.
    val words = lines.flatMap { line => line.split(" ") }
    val pairs = words.map(word => (word, 1))
    val counts = pairs.reduceByKey(_ + _)
    counts.foreach(println)
    sc.stop()
  }
}
```

Scala Arrays, Maps, and Collections + a WordCount Program

十年热恋 submitted on 2020-11-30 02:06:53
Arrays

1. Fixed-length and variable-length arrays

```scala
package cn.gec.scala

import scala.collection.mutable.ArrayBuffer

object ArrayDemo {
  def main(args: Array[String]) {
    // Initialize a fixed-length array of length 8; all elements are 0
    val arr1 = new Array[Int](8)
    // Printing a fixed-length array directly prints the array's hashcode
    println(arr1)
    // Convert the array to an array buffer to see the original contents
    // (toBuffer converts an array into an array buffer)
    println(arr1.toBuffer)
    // Note: without new, this calls the array's apply method,
    // which assigns the values directly.
    // This initializes a fixed-length array of length 1:
    val arr2 = Array[Int](10)
    println(arr2.toBuffer)
    // Define a fixed-length array of length 3
    val arr3 = Array("hadoop", "storm", "spark")
    // Use () to access elements
    println(arr3(2))
    //////////////////////////////////////////////////
    // Variable-length arrays (array buffers)
    // To use an array buffer, you need to import
```
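The section cuts off right where array buffers begin; the import it mentions is `scala.collection.mutable.ArrayBuffer`. A minimal self-contained sketch of the ArrayBuffer usage it is presumably leading up to:

```scala
import scala.collection.mutable.ArrayBuffer

object ArrayBufferDemo {
  def main(args: Array[String]): Unit = {
    val buf = ArrayBuffer[Int]()
    buf += 1                    // append a single element
    buf += 2
    buf ++= Seq(3, 4, 5)        // append a whole collection
    buf.insert(0, 0)            // insert 0 at index 0
    buf.remove(buf.length - 1)  // drop the last element
    println(buf.mkString(", ")) // prints 0, 1, 2, 3, 4
  }
}
```

Unlike a fixed-length Array, an ArrayBuffer grows and shrinks in place, which is why it lives in scala.collection.mutable.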

Covid Death Predictions gone wrong [closed]

邮差的信 submitted on 2020-11-30 02:01:04
Question (closed 3 days ago: needs debugging details and is not currently accepting answers): I'm attempting to write code that will predict fatalities in Toronto due to Covid-19, with no luck. I'm sure this has an easy fix that I'm overlooking, but I'm too new to Spark to know what that is. Does anyone have any insight on making this code runnable? The data set is here

The Three Main Scala Collection Families (Seq, Set, Map): List Sequences

|▌冷眼眸甩不掉的悲伤 submitted on 2020-11-30 01:52:39
Seq sequences:

```scala
scala> val list1 = List(1,2,3)
list1: List[Int] = List(1, 2, 3)

scala> val list2 = 0 :: list1
list2: List[Int] = List(0, 1, 2, 3)

// The following forms all have the same effect: :: and +:
scala> val list3 = list1.::(0)
list3: List[Int] = List(0, 1, 2, 3)

scala> val list4 = 0 +: list1
list4: List[Int] = List(0, 1, 2, 3)

scala> val list5 = list1.+:(0)
list5: List[Int] = List(0, 1, 2, 3)

scala> val list6 = list1 :+ 4
list6: List[Int] = List(1, 2, 3, 4)

scala> val list7 = List(5,6,7)
list7: List[Int] = List(5, 6, 7)

scala> val
```
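The transcript cuts off after list7, presumably just before list concatenation. A small self-contained recap of the operators shown above, plus `:::` and `++` for joining two lists, runnable as a plain Scala program:

```scala
object ListOpsDemo {
  def main(args: Array[String]): Unit = {
    val list1 = List(1, 2, 3)

    // ::, .::(x), +:, and .+:(x) all prepend; :+ appends
    assert((0 :: list1) == List(0, 1, 2, 3))
    assert((0 +: list1) == List(0, 1, 2, 3))
    assert((list1 :+ 4) == List(1, 2, 3, 4))

    // ::: and ++ both concatenate two lists
    val list7 = List(5, 6, 7)
    assert((list1 ::: list7) == List(1, 2, 3, 5, 6, 7))
    assert((list1 ++ list7) == List(1, 2, 3, 5, 6, 7))

    println("all list operations behaved as expected")
  }
}
```

A mnemonic for :: vs :+ is that the colon always points at the list: `elem :: list` and `elem +: list` prepend, while `list :+ elem` appends.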