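When working with Hive, you sometimes need to define custom functions to meet business requirements. The steps for writing a user-defined function (UDF) are as follows.
1. Create a new Maven project and add the following dependency to the project's pom.xml: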
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>3.1.2</version>
</dependency>
2. Create a class that extends UDF and implement an evaluate() method. The example below adds a prefix to a field value; the implementation is as follows:
package com.wxx.bigdata.hive.udf;

import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;

import java.util.Random;

@Description(
        name = "add_prefix",
        value = "_FUNC_(expr) - add a number and '_' before the expr"
)
public class AddPrefixUDF extends UDF {

    // Called by Hive for each row; returns "<digit>_<input>".
    public String evaluate(String input) {
        Random random = new Random();
        int num = random.nextInt(10); // random digit from 0 to 9
        return num + "_" + input;
    }

    // Quick local check; not used by Hive.
    public static void main(String[] args) {
        AddPrefixUDF addPrefixUDF = new AddPrefixUDF();
        String result = addPrefixUDF.evaluate("test");
        System.out.println(result);
    }
}
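One thing the evaluate() method above does not handle is a NULL column value: string concatenation turns a null input into text ending in "_null". The following is a minimal sketch, not the author's code, that pulls the prefix logic into plain Java so it can be tried without Hive on the classpath. The class name AddPrefixSketch and the choice to return null for a null input are assumptions made for illustration.

```java
import java.util.Random;

// Hypothetical helper (not part of the original post): the prefix logic of
// AddPrefixUDF as plain Java, so it can be run and tested without Hive.
final class AddPrefixSketch {
    // One Random per instance instead of a new one on every call.
    private final Random random;

    AddPrefixSketch(Random random) {
        this.random = random;
    }

    // Returns "<digit>_<input>", or null when the input is null.
    // Assumption: a SQL NULL should stay NULL instead of becoming "<digit>_null".
    String addPrefix(String input) {
        if (input == null) {
            return null;
        }
        int num = random.nextInt(10); // random digit from 0 to 9
        return num + "_" + input;
    }

    public static void main(String[] args) {
        AddPrefixSketch sketch = new AddPrefixSketch(new Random());
        System.out.println(sketch.addPrefix("test")); // a digit, then "_test"
        System.out.println(sketch.addPrefix(null));   // prints "null"
    }
}
```

In a real UDF the same null check would go at the top of evaluate(). Passing the Random in through the constructor is only there to make the behaviour easy to test with a fixed seed.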
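3. Package the project with Maven and upload the resulting jar to a directory on the Linux server.
4. In the Hive CLI, add the jar and create the function: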
hive (ruozedata_ba)> add jar /home/hadoop/lib/hadoop-project-1.0.jar;
Added [/home/hadoop/lib/hadoop-project-1.0.jar] to class path
Added resources: [/home/hadoop/lib/hadoop-project-1.0.jar]
hive (ruozedata_ba)> create TEMPORARY function add_prefix as 'com.wxx.bigdata.hive.udf.AddPrefixUDF';
OK
Time taken: 0.101 seconds
hive (ruozedata_ba)> show functions;
OK
tab_name
!
!=
%
&
*
+
-
/
<
<=
<=>
<>
=
==
>
>=
^
abs
acos
add_months
add_prefix
...
hive (ruozedata_ba)> select add_prefix(platform) from platform_stat;
OK
_c0
4_Android
6_MAC os
1_WIN
4_iOS
8_windows mobile
6_windows phone
Time taken: 0.332 seconds, Fetched: 6 row(s)
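5. The steps above register a temporary function, which is valid only in the current Hive session. If you open a new session and run show functions; the temporary function is no longer listed.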
6. Add a permanent function. Upload the jar to HDFS, then create the function with a USING JAR clause that points to it:
[hadoop@hadoop000 lib]$ hdfs dfs -mkdir /lib
[hadoop@hadoop000 lib]$ hdfs dfs -put /home/hadoop/lib/hadoop-project-1.0.jar /lib
[hadoop@hadoop000 lib]$ hdfs dfs -ls /lib
Found 1 items
-rw-r--r--   1 hadoop supergroup      50187 2019-09-25 17:37 /lib/hadoop-project-1.0.jar
[hadoop@hadoop000 lib]$
CREATE FUNCTION add_prefix_new AS "com.wxx.bigdata.hive.udf.AddPrefixUDF"
USING JAR "hdfs://hadoop000:8020/lib/hadoop-project-1.0.jar";

CREATE FUNCTION remove_prefix_new AS "com.wxx.bigdata.hive.udf.RemovePrefixUDF"
USING JAR "hdfs://hadoop000:8020/lib/hadoop-project-1.0.jar";
hive (ruozedata_ba)> select add_prefix_new(platform) from platform_stat;
OK
_c0
6_Android
7_MAC os
1_WIN
9_iOS
6_windows mobile
3_windows phone
Time taken: 0.658 seconds, Fetched: 6 row(s)
hive (ruozedata_ba)>
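Step 6 also registers remove_prefix_new from a RemovePrefixUDF class whose source is not shown here. The sketch below is only a guess at what its core logic might look like, assuming it undoes add_prefix by stripping everything up to and including the first underscore. The class name RemovePrefixSketch and the handling of null or prefix-less input are assumptions, and the logic is written as plain Java so it runs without Hive.

```java
// Hypothetical sketch: RemovePrefixUDF is not shown in the post, so this only
// illustrates logic that would reverse the "<digit>_" prefix from add_prefix.
final class RemovePrefixSketch {

    // Strips everything up to and including the first '_'.
    // Assumption: null and inputs without '_' are returned unchanged.
    static String removePrefix(String input) {
        if (input == null) {
            return null;
        }
        int idx = input.indexOf('_');
        return idx < 0 ? input : input.substring(idx + 1);
    }

    public static void main(String[] args) {
        System.out.println(removePrefix("6_Android"));       // prints "Android"
        System.out.println(removePrefix("3_windows phone")); // prints "windows phone"
    }
}
```

In an actual UDF this method body would sit inside evaluate(String) of a class extending UDF, packaged in the same jar as AddPrefixUDF.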
Source: 51CTO
Author: muyingmiao
Link: https://blog.csdn.net/muyingmiao/article/details/101375234