udf

A First-Time User's Experience with MaxCompute

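瘦欲@ posted on 2019-12-11 15:11:09
As a first-time MaxCompute user, I came away with strong impressions. MaxCompute works out of the box and has an integrated interface, so you do not need to worry about building, configuring, or operating a cluster. With a few mouse clicks you can upload data into MaxCompute, analyze it, and get the results. As a fast, fully managed data warehouse solution for TB/PB-scale data, MaxCompute offers not only the traditional command-line interface but also a rich web interface. It works well for data development, testing, release, data flows, and data permission management. It supports UDFs written in Python and Java, supports traditional MapReduce for queries with complex logic, and also supports a range of machine learning algorithms. MaxCompute provides unified project management. In practice each team has its own project and manages it independently, and project isolation effectively prevents data and jobs from being modified or deleted by other teams. Unless a job in a pro project fails, jobs in other business lines are unaffected, which minimizes the impact the business lines have on each other. The big data development suite is also closely tied to MaxCompute: it gives MaxCompute one-stop data synchronization, job development, data workflow development, data management, and data operations. When the volume of data to be processed becomes very large and the data becomes sufficiently complex, it often has to be processed in different modes. Beyond that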

Hadoop Hive UDF with external library

可紊 posted on 2019-12-11 13:47:37
Question: I'm trying to write a UDF for Hadoop Hive that parses User Agents. The following code works fine on my local machine, but on Hadoop I'm getting: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public java.lang.String MyUDF .evaluate(java.lang.String) throws org.apache.hadoop.hive.ql.metadata.HiveException on object MyUDF@64ca8bfb of class MyUDF with arguments {All Occupations:java.lang.String} of size 1', Code: import java.io.IOException; import org.apache.hadoop.hive

Slight adaptation of a User Defined Function

别说谁变了你拦得住时间么 posted on 2019-12-11 08:58:58
Question: I would like to extract a combination of text and numbers from a larger string located within a column in Excel. The constants I have to work with are that each text string will • either start with an A, C or S, and • will always be 7 characters long • the position of the string I would like to extract varies. The code I have been using, which has been working efficiently, is: Public Function Xtractor(r As Range) As String Dim a, ary ary = Split(r.Text, " ") For Each a In ary If Len(a) = 7 And a
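The matching rule described in the question (a space-separated token that is exactly 7 characters long and starts with A, C or S) can be sketched outside Excel as follows. This is only an illustration in plain Java, not the asker's VBA function. The class name `TokenExtractor`, the method `extract`, the sample input, and the choice to treat the first letter as case-sensitive are all assumptions, since the original code is cut off before its second condition.

```java
import java.util.Optional;

final class TokenExtractor {

    // Returns the first space-separated token that is exactly 7 characters
    // long and starts with an uppercase A, C or S (case sensitivity is an
    // assumption). Returns an empty Optional if no token qualifies.
    static Optional<String> extract(String text) {
        for (String token : text.split(" ")) {
            if (token.length() == 7 && "ACS".indexOf(token.charAt(0)) >= 0) {
                return Optional.of(token);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical cell content used only for demonstration.
        System.out.println(extract("order ref A123456 shipped").orElse(""));
    }
}
```

Because the length check comes first, empty tokens produced by repeated spaces never reach `charAt(0)`.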

Excel VBA UDF for concatenating is giving an Error message

馋奶兔 posted on 2019-12-11 07:17:13
Question: I'm trying to write a User Defined Function (UDF) in Excel that will take the values in a range of cells, and concatenate them in a certain way. Specifically, I want to concatenate them in a way that the resulting string could be pasted into a SQL "in" function - i.e. if I have a range in Excel that contains: apples oranges pears I want the UDF to result in 'apples', 'oranges', 'pears' (i.e. no comma after the last value). This is my code - it compiles OK in the VBA window, but when I use it
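The target format (each value quoted, values separated by a comma and a space, no trailing comma) is easiest to get by joining with a separator rather than appending a comma after every item. A minimal sketch in plain Java, not the asker's VBA: the class name `SqlInList` and method `toInList` are hypothetical, and doubling embedded single quotes is an extra precaution that the question does not mention.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class SqlInList {

    // Wraps each value in single quotes and joins the results with ", ".
    // Joining places the separator only between items, so there is no
    // trailing comma to strip. Embedded single quotes are doubled
    // (an assumption beyond what the question asks for).
    static String toInList(List<String> values) {
        return values.stream()
                .map(v -> "'" + v.replace("'", "''") + "'")
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        System.out.println(toInList(Arrays.asList("apples", "oranges", "pears")));
    }
}
```

The same idea carries over to VBA: build an array of quoted values and combine it with a single join call instead of concatenating inside the loop.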

Phoenix udf not working

烂漫一生 posted on 2019-12-11 06:17:25
Question: I am trying to run a custom UDF in Apache Phoenix but am getting an error. Please help me figure out the issue. The following is my function class: package co.abc.phoenix.customudfs; import org.apache.hadoop.hbase.io.ImmutableBytesWritable; import org.apache.phoenix.expression.Expression; import org.apache.phoenix.expression.function.ScalarFunction; import org.apache.phoenix.parse.FunctionParseNode.Argument; import org.apache.phoenix.parse.FunctionParseNode.BuiltInFunction; import org.apache.phoenix

Passing a list of tuples as a parameter to a spark udf in scala

只愿长相守 posted on 2019-12-10 16:17:08
Question: I am trying to pass a list of tuples to a UDF in Scala. I am not sure how exactly to define the datatype for this. I tried to pass it as a whole row but it can't really resolve it. I need to sort the list based on the first element of the tuple and then send n elements back. I have tried the following definitions for the UDF: def udfFilterPath = udf((id: Long, idList: Array[structType[Long, String]] ) def udfFilterPath = udf((id: Long, idList: Array[Tuple2[Long, String]] ) def
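The logic the UDF is meant to perform (sort the pairs by their first element, then return the first n) can be sketched on its own, independent of how Spark maps the column type. The example below is plain Java with no Spark dependency and does not address the UDF signature problem itself. The names `TopNPairs`, `pair` and `firstN` are hypothetical, and `Map.Entry<Long, String>` merely stands in for the question's (Long, String) tuple.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class TopNPairs {

    // Convenience factory for an (id, label) pair.
    static Map.Entry<Long, String> pair(long id, String label) {
        return new SimpleEntry<>(id, label);
    }

    // Sorts the pairs by their first element (the key) in ascending order
    // and returns at most n of them. The input list is not modified.
    static List<Map.Entry<Long, String>> firstN(List<Map.Entry<Long, String>> pairs, int n) {
        return pairs.stream()
                .sorted(Map.Entry.comparingByKey())
                .limit(Math.max(0, n)) // guard against negative n
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map.Entry<Long, String>> input =
                Arrays.asList(pair(3, "c"), pair(1, "a"), pair(2, "b"));
        System.out.println(firstN(input, 2));
    }
}
```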

C# clr udf for Active Directory group membership

≯℡__Kan透↙ posted on 2019-12-08 18:48:27
My problem is as follows: I need a clr udf (in C#) to query with a given ad-usr the ad-group membership using System; using System.Data; using System.Data.SqlClient; using System.Data.SqlTypes; using Microsoft.SqlServer.Server; using System.DirectoryServices.AccountManagement; public partial class UserDefinedFunctions { [Microsoft.SqlServer.Server.SqlFunction] public static SqlInt32 check_user_is_part_of_ad_grp(SqlString ad_usr, SqlString ad_grp) { bool bMemberOf = false; // set up domain context PrincipalContext ctx = new PrincipalContext(ContextType.Domain); // find the group in question

C# clr udf for Active Directory group membership

烈酒焚心 posted on 2019-12-08 05:38:04
Question: My problem is as follows: I need a clr udf (in C#) to query with a given ad-usr the ad-group membership using System; using System.Data; using System.Data.SqlClient; using System.Data.SqlTypes; using Microsoft.SqlServer.Server; using System.DirectoryServices.AccountManagement; public partial class UserDefinedFunctions { [Microsoft.SqlServer.Server.SqlFunction] public static SqlInt32 check_user_is_part_of_ad_grp(SqlString ad_usr, SqlString ad_grp) { bool bMemberOf = false; // set up domain

Trying to turn a blob into multiple columns in Spark

帅比萌擦擦* posted on 2019-12-08 00:05:35
Question: I have a serialized blob and a function that converts it into a Java Map. I have registered the function as a UDF and tried to use it in Spark SQL as follows: sqlCtx.udf.register("blobToMap", Utils.blobToMap) val df = sqlCtx.sql(""" SELECT mp['c1'] as c1, mp['c2'] as c2 FROM (SELECT *, blobToMap(payload) AS mp FROM t1) a """) I do succeed in doing it, but for some reason the very heavy blobToMap function runs twice for every row, and in reality I extract 20 fields and it runs 20 times for

Spark SQL: How to call UDF from DataFrame operation using JAVA

白昼怎懂夜的黑 posted on 2019-12-06 11:21:32
Question: I would like to know how to call a UDF from a function of the domain-specific language (DSL) in Spark SQL using Java. I have a UDF (just for example): UDF2 equals = new UDF2<String, String, Boolean>() { @Override public Boolean call(String first, String second) throws Exception { return first.equals(second); } }; I've registered it with sqlContext: sqlContext.udf().register("equals", equals, DataTypes.BooleanType); When I run the following query, my UDF is called and I get a result.