Question
I want to select a few columns, derive a few new ones by adding or dividing existing columns, include some columns that contain only space padding, and store them all under new names (aliases). In SQL, this would be something like:
select " " as col1, b as b1, c+d as e from table
How can I achieve this in Spark?
Answer 1:
You can use the native DataFrame functions. For example, given:
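import org.apache.spark.sql.functions._
import spark.implicits._  // provides toDF; already imported in spark-shell

// Sample data with columns a, b, c, d
val df1 = Seq(
  ("A", 1, 5, 3),
  ("B", 3, 4, 2),
  ("C", 4, 6, 3),
  ("D", 5, 9, 1)).toDF("a", "b", "c", "d")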
Select the columns as follows:
df1.select(
  lit(" ").as("col1"),             // constant single-space column
  col("b").as("b1"),               // rename b to b1
  (col("c") + col("d")).as("e")    // sum of c and d
).show()
gives you the expected result:
+----+---+---+
|col1| b1|  e|
+----+---+---+
|    |  1|  8|
|    |  3|  6|
|    |  4|  9|
|    |  5| 10|
+----+---+---+
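The question also asks about dividing columns and space padding, which the example above does not cover. The following is a minimal sketch of one way to do both, assuming a local SparkSession and the same sample data; the aliases `a_padded` and `ratio`, the padding width of 5, and the app name are illustrative choices, not part of the original answer.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, rpad}

// Assumption: a local session for illustration; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("alias-sketch")
  .getOrCreate()
import spark.implicits._

val df1 = Seq(
  ("A", 1, 5, 3),
  ("B", 3, 4, 2),
  ("C", 4, 6, 3),
  ("D", 5, 9, 1)).toDF("a", "b", "c", "d")

val result = df1.select(
  rpad(col("a"), 5, " ").as("a_padded"),  // right-pad column a with spaces to width 5
  (col("c") / col("d")).as("ratio"))      // `/` on integer columns produces a double

result.show()
```

Use `lpad` instead of `rpad` if the padding should go on the left.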
Answer 2:
With Spark SQL, you can do the same thing.
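import org.apache.spark.sql.functions._
import spark.implicits._  // provides toDF; already imported in spark-shell

// Sample data with columns a, b, c, d
val df1 = Seq(
  ("A", 1, 5, 3),
  ("B", 3, 4, 2),
  ("C", 4, 6, 3),
  ("D", 5, 9, 1)).toDF("a", "b", "c", "d")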
df1.createOrReplaceTempView("table")
df1.show()
// Keep the DataFrame in df2; show() returns Unit, so call it separately.
val df2 = spark.sql("select ' ' as col1, b as b1, c+d as e from table")
df2.show()
Input:
+---+---+---+---+
|  a|  b|  c|  d|
+---+---+---+---+
|  A|  1|  5|  3|
|  B|  3|  4|  2|
|  C|  4|  6|  3|
|  D|  5|  9|  1|
+---+---+---+---+
Output:
+----+---+---+
|col1| b1|  e|
+----+---+---+
|    |  1|  8|
|    |  3|  6|
|    |  4|  9|
|    |  5| 10|
+----+---+---+
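A possible middle ground between the two answers is `selectExpr`, which accepts SQL expression strings directly on the DataFrame, so no temporary view has to be registered. This is a sketch under the same assumptions as the answers (a local SparkSession and the same sample data); the variable name `viaExpr` and the app name are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for illustration; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("select-expr-sketch")
  .getOrCreate()
import spark.implicits._

val df1 = Seq(
  ("A", 1, 5, 3),
  ("B", 3, 4, 2),
  ("C", 4, 6, 3),
  ("D", 5, 9, 1)).toDF("a", "b", "c", "d")

// Each string is parsed as a SQL expression, including its alias.
val viaExpr = df1.selectExpr("' ' as col1", "b as b1", "c + d as e")

viaExpr.show()
```

This keeps the SQL-style expressions from the question while staying in the DataFrame API, which can be convenient when the expressions are built as strings at runtime.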
Source: https://stackoverflow.com/questions/52538943/spark-select-and-add-columns-with-alias