scala

How to detect Parquet files?

家住魔仙堡 submitted on 2021-02-20 02:13:40
问题 (Question): I have a script I am writing that will take either plain text or Parquet files. If the input is a Parquet file, the script reads it in using a DataFrame and does a few other things. On the cluster I am working on, the easiest first solution was to check whether the file's extension was .parquet:

    if (parquetD(1) == "parquet") {
      if (args.length != 2) {
        println(usage2)
        System.exit(1)
        println(args)
      }
    }

and if so, read it in with the DataFrame. The problem is that I have a bunch of files some people have created with no …
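
Since an extension check breaks on files without an extension, a more robust test is to read the file's magic bytes: Parquet files begin (and end) with the ASCII sequence PAR1. Below is a minimal sketch of that idea, assuming the files sit on a Hadoop-compatible filesystem; the function name looksLikeParquet and the pathStr parameter are hypothetical, not from the original post.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    // Returns true if the file starts with the Parquet magic bytes "PAR1".
    def looksLikeParquet(pathStr: String): Boolean = {
      val path = new Path(pathStr)
      val fs   = FileSystem.get(path.toUri, new Configuration())
      val in   = fs.open(path)
      try {
        val magic = new Array[Byte](4)
        in.readFully(0, magic)                    // positioned read of the first 4 bytes
        new String(magic, "US-ASCII") == "PAR1"
      } catch {
        case _: Exception => false                // too short or unreadable: not Parquet
      } finally {
        in.close()
      }
    }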

How to skip the first and last line of a .dat file and make it into a DataFrame using Scala in Databricks

老子叫甜甜 submitted on 2021-02-19 08:59:30
问题 (Question): The input file looks like this:

    H|*|D|*|PA|*|BJ|*|S|*|2019.05.27 08:54:24|##|
    H|*|AP_ATTR_ID|*|AP_ID|*|OPER_ID|*|ATTR_ID|*|ATTR_GROUP|*|LST_UPD_USR|*|LST_UPD_TSTMP|##|
    779045|*|Sar|*|SUPERVISOR HIERARCHY|*|Supervisor|*|2|*|128|*|2019.05.14 16:48:16|##|
    779048|*|KK|*|SUPERVISOR HIERARCHY|*|Supervisor|*|2|*|116|*|2019.05.14 16:59:02|##|
    779054|*|Nisha - A|*|EXACT|*|CustomColumnRow120|*|2|*|1165|*|2019.05.15 12:11:48|##|
    T|*||*|2019.05.27 08:54:28|##|

The file name is PA.dat. I need to skip the first line and also the last line of the …
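
One common way to do this (a sketch, not the accepted answer) is to index the lines with zipWithIndex, drop the first and last, and split the rest on the |*| delimiter. It assumes the spark session that Databricks provides; the file path and column names are placeholders.

    import spark.implicits._

    val raw   = spark.sparkContext.textFile("dbfs:/FileStore/tables/PA.dat")   // placeholder path
    val total = raw.count()

    val df = raw
      .zipWithIndex()
      .filter { case (_, idx) => idx != 0 && idx != total - 1 }   // skip first and last line
      .map { case (line, _) => line.stripSuffix("|##|").split("\\|\\*\\|", -1) }
      .map(f => (f(0), f(1), f(2), f(3), f(4), f(5), f(6)))
      .toDF("AP_ATTR_ID", "AP_ID", "OPER_ID", "ATTR_ID", "ATTR_GROUP", "LST_UPD_USR", "LST_UPD_TSTMP")

    // Note: the sample's second line (the H|*|AP_ATTR_ID|... header row) survives this filter;
    // it can be dropped the same way, or by filtering out lines that start with "H|*|".
    df.show(false)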

Error with Spark Row.fromSeq for a text file

久未见 submitted on 2021-02-19 08:25:07
问题 (Question):

    import org.apache.log4j.{Level, Logger}
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._
    import org.apache.spark._
    import org.apache.spark.sql.types._
    import org.apache.spark.sql._

    object fixedLength {
      def main(args: Array[String]) {
        // Build a Row of four fixed-width string columns from one input line.
        def getRow(x: String): Row = {
          val columnArray = new Array[String](4)
          columnArray(0) = x.substring(0, 3)
          columnArray(1) = x.substring(3, 13)
          columnArray(2) = x.substring(13, 18)
          columnArray(3) = x.substring(18, 22)
          Row.fromSeq(columnArray)
        }
        …
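
The snippet is cut off here. For context, this is how Rows built with Row.fromSeq are usually paired with an explicit schema (a hedged sketch continuing inside main and reusing the imports and getRow above; the input path, column names, and all-String schema are assumptions, not the poster's original continuation). A common cause of errors at this point is a schema whose field types don't match the values in the Row: getRow produces only Strings, so every field must be StringType unless the values are cast first.

    val spark = SparkSession.builder().appName("fixedLength").master("local[*]").getOrCreate()

    val schema = StructType(Seq(
      StructField("col1", StringType, nullable = true),
      StructField("col2", StringType, nullable = true),
      StructField("col3", StringType, nullable = true),
      StructField("col4", StringType, nullable = true)
    ))

    val rowRDD = spark.sparkContext.textFile("/path/to/fixed_width.txt").map(getRow)
    val df     = spark.createDataFrame(rowRDD, schema)
    df.show(false)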

Scala tail recursive method has a divide and remainder error

旧街凉风 submitted on 2021-02-19 07:47:28
问题 (Question): I'm currently computing the binomial coefficient of two natural numbers by writing a tail-recursive method in Scala, but something is wrong with how I divide the numbers: integer division by k, the way I do it, leaves a non-zero remainder and therefore introduces rounding errors. Could anyone help me figure out how to fix it?

    def binom(n: Int, k: Int): Int = {
      require(0 <= k && k <= n)
      def binomtail(n: Int, k: Int, ac: Int): Int = {
        if (n == k || k == 0) ac
        else binomtail(n - 1, k - 1, …
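
The code is cut off here. One standard way around the rounding problem (a hedged sketch, not the original poster's solution) is to multiply before dividing, so every intermediate value is itself a binomial coefficient and each division is exact:

    def binom(n: Int, k: Int): Int = {
      require(0 <= k && k <= n)
      val m = math.min(k, n - k)                 // C(n, k) == C(n, n - k)
      @annotation.tailrec
      def loop(i: Int, acc: Int): Int =
        if (i > m) acc
        else loop(i + 1, acc * (n - m + i) / i)  // exact: acc * (n - m + i) is divisible by i
      loop(1, 1)
    }

    // binom(4, 2) == 6, binom(5, 2) == 10; Int overflow is still possible for large n.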

How to register a Java Spark UDF in the Spark shell?

三世轮回 submitted on 2021-02-19 07:35:34
问题 (Question): Below is my Java UDF code:

    package com.udf;

    import org.apache.spark.sql.api.java.UDF1;

    public class SparkUDF implements UDF1<String, String> {
        @Override
        public String call(String arg) throws Exception {
            if (validateString(arg))
                return arg;
            return "INVALID";
        }

        public static boolean validateString(String arg) {
            // short-circuit || so a null arg is never dereferenced
            if (arg == null || arg.length() != 11)
                return false;
            else
                return true;
        }
    }

I am building the jar with this class as SparkUdf-1.0-SNAPSHOT.jar. I have a table named sample in Hive …
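
A hedged sketch of how such a UDF1 is commonly registered from the Scala spark-shell (the jar path, the SQL name validate, and the column name are placeholders, not from the original post):

    // Start the shell with the jar on the classpath:
    //   spark-shell --jars /path/to/SparkUdf-1.0-SNAPSHOT.jar

    import org.apache.spark.sql.types.StringType

    // Register the Java UDF1 under a name that SQL can call.
    spark.udf.register("validate", new com.udf.SparkUDF(), StringType)

    // Use it against the Hive table mentioned in the question.
    spark.sql("SELECT validate(some_column) FROM sample").show()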

Set current project to default-6c6f02 (in build file:/home/user_name/Videos/ [duplicate]

可紊 submitted on 2021-02-19 07:21:37
问题 (Question): This question already has an answer here: Why is sbt current project name "default" in 0.10? (1 answer). Closed 6 years ago.

What does it mean when running the sbt command on the command line for a Scala project prints

    Set current project to default-6c6f02 (in build file:/home/user_name/Videos/

What should I set after this statement?

回答1 (Answer 1): This happens if you call the sbt command in a folder where you don't have a build.sbt or project/Build.scala; as I understand it, in your case that is /home/user_name/Videos/. And because …
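
The answer is cut off above; for context, giving the project an explicit name usually means creating a build.sbt in the directory you run sbt from. A minimal sketch with placeholder values:

    // build.sbt in the project root (values are placeholders)
    name := "my-project"

    version := "0.1.0"

    scalaVersion := "2.11.8"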

How to set a type parameter bound in Scala to make a generic function for numerics?

非 Y 不嫁゛ submitted on 2021-02-19 07:12:58
问题 (Question): I want to make a sum function that works with all Numeric types. This works:

    object session {
      def mapReduce[A](f: A => A, combine: (A, A) => A, zero: A, inc: A)
                      (a: A, b: A)
                      (implicit num: Numeric[A]): A = {
        def loop(acc: A, a: A) =
          if (num.gt(a, b)) acc
          else combine(f(a), mapReduce(f, combine, zero, inc)(num.plus(a, inc), b))
        loop(zero, a)
      }

      def sum(f: Int => Int)(a: Int, b: Int): Int = {
        mapReduce(f, (x: Int, y: Int) => x + y, 0, 1)(a, b)
      }

      sum(x => x)(3, 4)                         //> res0: Int = 7

      def product(f: …
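
The snippet is cut off at product. For the sum part of the question, one way to make it generic over Numeric as well (a hedged sketch, not the accepted answer) is shown below; it reuses the mapReduce above inside object session, and the name sumGeneric is hypothetical, chosen to avoid clashing with the existing sum:

      // Generic over any Numeric type: the instance supplies zero, one, plus, and the ordering.
      def sumGeneric[A](f: A => A)(a: A, b: A)(implicit num: Numeric[A]): A = {
        import num._
        mapReduce(f, (x: A, y: A) => x + y, zero, one)(a, b)
      }

      // sumGeneric((x: Int) => x)(3, 4)        //> 7
      // sumGeneric((x: Double) => x)(3.0, 4.0) //> 7.0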

Enable macro paradise to expand macro annotations

孤人 submitted on 2021-02-19 06:42:38
问题 (Question): I wanted to try some examples with annotations in macro paradise, and I am getting the error described in this example. I have linked the projects, and the other Scala macros (the ones without annotations) work very well. I have included the library paradise_2.11.6-2.1.0-M5 (in both projects as well :( ). I just do not understand what is meant by *to enable* it. By the way, I am using Scala IDE in Eclipse.

回答1 (Answer 1): By enable, I meant adding it as a compiler plugin, as e.g. in https://github.com …
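
The link is cut off; for sbt builds, enabling it typically means adding the plugin in build.sbt, as in the sketch below, which uses the milestone version mentioned in the question. In Scala IDE the equivalent is passing the paradise jar to scalac via the -Xplugin option in the project's compiler settings.

    // build.sbt: load macro paradise as a compiler plugin so macro annotations expand
    addCompilerPlugin("org.scalamacros" % "paradise" % "2.1.0-M5" cross CrossVersion.full)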