How to rename an existing Spark SQL function

Submitted by 和自甴很熟 on 2019-12-13 02:47:35

Question


I am using Spark to call functions on data submitted by the user.

How can I make an existing function available under a different name, for example REGEXP_REPLACE as REPLACE?

I tried the following code:

ss.udf.register("REPLACE", REGEXP_REPLACE)           // This doesn't work
ss.udf.register("sum_in_all", sumInAll)
ss.udf.register("mod", mod)
ss.udf.register("average_in_all", averageInAll)

Answer 1:


Import it with an alias:

import org.apache.spark.sql.functions.{regexp_replace => replace}
df.show
+---+
| id|
+---+
|  0|
|  1|
|  2|
|  3|
|  4|
|  5|
|  6|
|  7|
|  8|
|  9|
+---+

df.withColumn("replaced", replace($"id", "(\\d)", "$1+1")).show

+---+--------+
| id|replaced|
+---+--------+
|  0|     0+1|
|  1|     1+1|
|  2|     2+1|
|  3|     3+1|
|  4|     4+1|
|  5|     5+1|
|  6|     6+1|
|  7|     7+1|
|  8|     8+1|
|  9|     9+1|
+---+--------+
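
One point worth spelling out: the import alias only renames the function for Scala code in the importing file. Strings parsed by Spark itself (for example through expr or sql) still need the registered name regexp_replace. The sketch below illustrates this under a few assumptions: a local SparkSession, a three-row id column, and the hypothetical object and method names AliasScope, viaAlias and viaExpr. The column is cast to string explicitly so the example does not depend on implicit type coercion.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, expr, regexp_replace => replace}

object AliasScope {
  // Scala-side call: the compiler resolves `replace` through the import alias.
  def viaAlias(df: DataFrame): DataFrame =
    df.withColumn("replaced", replace(col("id").cast("string"), "([0-9])", "$1+1"))

  // SQL expression string: Spark's parser resolves the name, so the alias
  // is invisible here and the registered name `regexp_replace` is used.
  def viaExpr(df: DataFrame): DataFrame =
    df.withColumn("replaced",
      expr("""regexp_replace(cast(id as string), '([0-9])', '$1+1')"""))

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("alias-scope")
      .getOrCreate()
    val df = spark.range(3).toDF("id")
    viaAlias(df).show()
    viaExpr(df).show()
    spark.stop()
  }
}
```

Both methods should produce the same column; the difference is only in where the function name gets resolved.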

To do the same in Spark SQL, you'll have to re-register the function in Hive under a different name:

// A multi-line string needs triple quotes in Scala.
sqlContext.sql("""create temporary function replace
                  as 'org.apache.hadoop.hive.ql.udf.UDFRegExpReplace'""")

sqlContext.sql(""" select replace("a,b,c", "," ,".") """).show
+-----+
|  _c0|
+-----+
|a.b.c|
+-----+
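
If Hive support is not available, one possible alternative is to register a small Scala wrapper with udf.register, which is the API the question was already using. The following is a minimal sketch, not the answer author's method: the object RenameViaUdf, the helper regexReplace and the SQL name re_replace are hypothetical, and the wrapper relies on java.lang.String.replaceAll rather than on Spark's own regexp_replace implementation. A distinct name is used because newer Spark versions ship a built-in replace function.

```scala
import org.apache.spark.sql.SparkSession

object RenameViaUdf {
  // Null-safe wrapper; regex and replacement follow java.util.regex rules
  // because the work is delegated to String.replaceAll.
  def regexReplace(s: String, pattern: String, replacement: String): String =
    if (s == null || pattern == null || replacement == null) null
    else s.replaceAll(pattern, replacement)

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("rename-via-udf")
      .getOrCreate()

    // Register the wrapper under the new SQL-visible name (hypothetical name).
    spark.udf.register("re_replace", regexReplace _)

    spark.sql("""SELECT re_replace('a,b,c', ',', '.') AS replaced""").show()
    spark.stop()
  }
}
```

The trade-off is that a UDF is opaque to Spark's optimizer, so for large data sets the built-in regexp_replace is likely to perform better than this wrapper.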


Source: https://stackoverflow.com/questions/47747980/how-to-rename-an-existing-spark-sql-function
