user-defined-functions

Bond CF for loop and if else loop

Submitted by 跟風遠走 on 2021-02-11 07:32:54

Question: I am trying to add the last cash flow back into the par value using an if/else inside a for loop, but I can't seem to do it. How do I target a specific item in the range? I am trying to make it so that if the index > 10, it will add the par value back in.

```python
par = 1000
coupon_rate = 3
T = 5
freq = 2

def cf_calculator(par, r, T, freq):
    for i in range(T * freq):
        if (T) < (T * freq):
            coupon = (r / 100) * par / freq
            print(coupon)
        else:
            coupon = (r / 100) * par / freq + par
            print(coupon)

print(cf_calculator(1000, 3, 5
```
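The condition `T < T * freq` compares two constants and is true on every iteration (whenever freq > 1), so the else branch never runs. One way to fix it, assuming the intent is to add the par value back only in the final period, is to compare the loop index against the last period and return the cash flows as a list:

```python
def cf_calculator(par, r, T, freq):
    """Return a bond's cash flows: a coupon each period, plus par at maturity."""
    n_periods = T * freq
    coupon = (r / 100) * par / freq
    flows = []
    for i in range(n_periods):
        if i < n_periods - 1:
            flows.append(coupon)          # ordinary coupon period
        else:
            flows.append(coupon + par)    # final period: coupon plus par value
    return flows

print(cf_calculator(1000, 3, 5, 2))  # prints: [15.0, 15.0, ..., 15.0, 1015.0]
```

With par 1000, a 3% annual rate, and semiannual payments, each coupon is 15.0 and the tenth (final) cash flow is 1015.0.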

Logger is not working inside spark UDF on cluster

Submitted by 陌路散爱 on 2021-02-10 15:54:51

Question: I have placed log.info statements inside my UDF, but it fails on the cluster. Locally it works fine. Here is the snippet:

```scala
def relType = udf((colValue: String, relTypeV: String) => {
  var relValue = "NA"
  val relType = relTypeV.split(",").toList
  val relTypeMap = relType.map { col =>
    val split = col.split(":")
    (split(0), split(1))
  }.toMap
  // val keySet = relTypeMap
  relTypeMap.foreach { x =>
    if ((x._1 != null || colValue != null || x._1.trim() != "" || colValue.trim() != "") && colValue
```
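A common cause is that a logger created on the driver and captured in the UDF closure is not initialized on the executors, so the UDF either fails to serialize or writes to executor logs rather than the driver console. The usual workaround is to obtain the logger inside the function body so it is constructed on the worker. A minimal Python sketch of that pattern (the function name and map-parsing logic mirror the Scala snippet loosely; they are illustrative, not the original code):

```python
import logging

def rel_type(col_value, rel_type_v):
    # Fetch the logger inside the function so it is created on the worker
    # process, instead of being captured from the driver in the closure.
    log = logging.getLogger("relType")
    if col_value is None or rel_type_v is None:
        log.info("null input: %r, %r", col_value, rel_type_v)
        return "NA"
    # Parse "key:value,key:value" into a map, as the Scala UDF does.
    rel_type_map = dict(pair.split(":", 1) for pair in rel_type_v.split(","))
    log.info("parsed map: %s", rel_type_map)
    return rel_type_map.get(col_value, "NA")

print(rel_type("a", "a:1,b:2"))  # prints: 1
```

Note that executor-side log output still lands in the executor logs (visible through the Spark UI or `yarn logs`), not in the driver's stdout.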

Calculate UDF once

Submitted by 匆匆过客 on 2021-02-08 10:00:12

Question: I want a UUID column in a PySpark dataframe that is calculated only once, so that I can select the column in a different dataframe and have the UUIDs be the same. However, the UDF for the UUID column is recalculated every time I select the column. Here's what I'm trying to do:

```python
>>> uuid_udf = udf(lambda: str(uuid.uuid4()), StringType())
>>> a = spark.createDataFrame([[1, 2]], ['col1', 'col2'])
>>> a = a.withColumn('id', uuid_udf())
>>> a.collect()
[Row(col1=1, col2=2, id='5ac8f818-e2d8-4c50
```
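The underlying issue is that a Spark plan is lazy: every action re-executes the plan, which re-runs the UDF and generates fresh UUIDs. The usual fix is to materialize the plan once (e.g. `a = a.withColumn('id', uuid_udf()).cache()` or a checkpoint/write-and-read) so later reads reuse the computed ids. A pure-Python sketch of the difference between re-evaluating and materializing (the helper names are illustrative):

```python
import uuid

rows = [{"col1": 1, "col2": 2}]

def with_id(rows):
    # Like an uncached Spark plan: the UUID "UDF" runs again on every call.
    return [dict(r, id=str(uuid.uuid4())) for r in rows]

first = with_id(rows)
second = with_id(rows)
# Each evaluation produces different ids (with overwhelming probability).
assert first[0]["id"] != second[0]["id"]

# Materialize once, like df.cache() or writing to storage, then reuse.
cached = with_id(rows)
read_a = cached
read_b = cached
assert read_a[0]["id"] == read_b[0]["id"]  # both reads see the same ids
```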

How to find ASCII of every character in string using DB2 Function?

Submitted by ♀尐吖头ヾ on 2021-02-08 08:52:20

Question: I have written a DB2 function that computes the ASCII value of records in a particular column. I would like help checking the ASCII value of every single character in a string and returning 'Y' if any character's value is greater than 127.

```sql
BEGIN ATOMIC
  DECLARE POS, LEN INT;
  IF INSTR IS NULL THEN
    RETURN NULL;
  END IF;
  SET (POS, LEN) = (1, LENGTH(INSTR));
  WHILE POS <= LEN DO
    IF ASCII(SUBSTR(INSTR, POS, 1)) > 128 THEN
      RETURN 'Y';
    END IF;
    SET POS = POS + 1;
  END WHILE;
  RETURN 'N';
```

Answer 1: Why to
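Note the prose asks for "greater than 127" while the posted SQL compares `> 128`, which would miss code point 128. The per-string logic, using the 127 threshold from the prose, can be sketched in Python (the function name is illustrative):

```python
def has_high_ascii(s):
    """Return 'Y' if any character's code point exceeds 127, 'N' otherwise,
    and None for a null input, mirroring the DB2 function's contract."""
    if s is None:
        return None
    return 'Y' if any(ord(c) > 127 for c in s) else 'N'

print(has_high_ascii("hello"))  # prints: N
print(has_high_ascii("café"))   # prints: Y
```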

Merge Maps in scala dataframe

Submitted by 南楼画角 on 2021-02-08 08:32:30

Question: I have a dataframe with columns col1, col2, col3. col1 and col2 are strings. col3 is a Map[String,String] with the schema below:

```
|-- col3: map (nullable = true)
|    |-- key: string
|    |-- value: string (valueContainsNull = true)
```

I have grouped by col1, col2 and aggregated using collect_list to get an array of maps, stored in col4:

```scala
df.groupBy($"col1", $"col2").agg(collect_list($"col3").as("col4"))
```

```
|-- col4: array (nullable = true)
|    |-- element: map (containsNull = true)
|    |    |-- key: string
|    |    |-- value:
```
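Merging an array of maps into a single map per row comes down to folding the maps together with some policy for duplicate keys. A minimal Python sketch of that per-row logic, assuming later maps should win on key conflicts (the source does not state a conflict policy):

```python
from functools import reduce

def merge_maps(maps):
    """Fold a list of maps into one map; on duplicate keys, later maps win."""
    return reduce(lambda acc, m: {**acc, **m}, maps, {})

col4 = [{"a": "1"}, {"b": "2"}, {"a": "3"}]
print(merge_maps(col4))  # prints: {'a': '3', 'b': '2'}
```

In Spark this fold could be applied as a UDF over col4, or (on Spark 3+) with the built-in higher-order `aggregate` plus `map_concat`, though `map_concat` raises on duplicate keys unless `spark.sql.mapKeyDedupPolicy` is set to `LAST_WIN`.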

Identity column in SQL CLR Split UDF

Submitted by 扶醉桌前 on 2021-02-07 20:45:49

Question: How can I return an identity column from a standard SQL CLR Split UDF? For example, the code below returns a table with the string value split by a delimiter; I need to somehow return an identity column as well.

```vb
<SqlFunction(FillRowMethodName:="FillRow", TableDefinition:="value nvarchar(4000)")> _
Public Shared Function GetStrings(ByVal str As SqlString, ByVal delimiter As SqlString) As IEnumerable
    If (str.IsNull OrElse delimiter.IsNull) Then
        Return Nothing
    Else
        Return str.Value.Split(CChar
```
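A table-valued function cannot use an actual identity column, but the same effect is achieved by pairing each split token with a running index as the rows are produced. The core idea, sketched in Python (names illustrative):

```python
def split_with_id(s, delimiter):
    """Split s on delimiter and return (id, value) pairs with a 1-based index,
    or None for null input, mirroring the CLR function's null handling."""
    if s is None or delimiter is None:
        return None
    return list(enumerate(s.split(delimiter), start=1))

print(split_with_id("a,b,c", ","))  # prints: [(1, 'a'), (2, 'b'), (3, 'c')]
```

In the CLR version the equivalent change would be yielding (index, token) pairs from `GetStrings`, extending `TableDefinition` to something like `"id int, value nvarchar(4000)"`, and filling both output parameters in `FillRow`.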

Table Valued Function and Entity Framework

Submitted by 时光毁灭记忆、已成空白 on 2021-02-06 08:47:48

Question: I'm trying to execute a TVF with Entity Framework and for some reason it just doesn't work. Maybe someone out there can help me see the problem. Here are the code samples. That's the function:

```sql
CREATE FUNCTION [dbo].[udf_profileSearch] (@keywords NVARCHAR(3000))
RETURNS @results TABLE (
    [Id] [int] NULL,
    [SubCategoryId] [int] NULL,
    [UserId] [int] NULL,
    [SmallDescription] [nvarchar](250) NULL,
    [DetailedDescription] [nvarchar](500) NULL,
    [Graduation] [nvarchar](140) NULL,
    [Experience] [nvarchar]
```

How to solve pyspark `org.apache.arrow.vector.util.OversizedAllocationException` error by increasing spark's memory?

Submitted by 半腔热情 on 2021-02-04 18:59:28

Question: I'm running a job in PySpark where at one point I use a grouped aggregate Pandas UDF. This results in the following (abbreviated here) error:

```
org.apache.arrow.vector.util.OversizedAllocationException: Unable to expand the buffer
```

I'm fairly sure this is because one of the groups the Pandas UDF receives is huge; if I reduce the dataset and remove enough rows, I can run my UDF with no problems. However, I want to run with my original dataset, and even if I run this Spark job on a machine with
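A grouped aggregate Pandas UDF materializes each whole group as a single Arrow batch, and an individual Arrow buffer has a hard size cap (on the order of 2 GB), so one oversized group overflows no matter how much executor memory is available. A common workaround, assuming the aggregation can be computed in two passes, is to salt the grouping key so each huge group splits into smaller subgroups, aggregate per (key, salt), then combine. A pure-Python sketch of the salting step (names illustrative):

```python
import random
from collections import defaultdict

def salted_groups(records, key, n_salts=4, seed=0):
    """Split each group into up to n_salts smaller subgroups by adding
    a random salt component to the grouping key."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for rec in records:
        groups[(rec[key], rng.randrange(n_salts))].append(rec)
    return groups

records = [{"k": "big", "v": i} for i in range(1000)]
subgroups = salted_groups(records, "k")
# One group of 1000 rows becomes up to 4 subgroups of roughly 250 rows each.
assert all(len(g) < 1000 for g in subgroups.values())
```

In Spark the salt would be a literal column (e.g. `F.rand()` bucketed into n_salts values) added before `groupBy`, followed by a second, cheap aggregation over the partial results.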
