user-defined-functions

How can I pass extra parameters to UDFs in Spark SQL?

…衆ロ難τιáo~ submitted on 2019-12-17 04:01:28
Question: I want to parse the date columns in a DataFrame, and for each date column the date resolution may change (e.g. 2011/01/10 => 2011/01 if the resolution is set to "Month"). I wrote the following code: def convertDataFrame(dataframe: DataFrame, schema: Array[FieldDataType], resolution: Array[DateResolutionType]): DataFrame = { import org.apache.spark.sql.functions._ val convertDateFunc = udf{(x: String, resolution: DateResolutionType) => SparkDateTimeConverter.convertDate(x,
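
One common way to give a UDF an extra parameter is to build the function with a closure, so the extra value is fixed before the function is registered. The sketch below shows only that closure step in plain Python (the question itself uses Scala). The names make_date_converter and _FORMATS, the "Year" and "Day" resolutions, and the yyyy/MM/dd input format are assumptions for illustration; they are not taken from the question's SparkDateTimeConverter.

```python
from datetime import datetime
from typing import Callable, Optional

# Hypothetical mapping from resolution name to output format.
# Only "Month" appears in the question; the others are assumptions.
_FORMATS = {"Year": "%Y", "Month": "%Y/%m", "Day": "%Y/%m/%d"}


def make_date_converter(resolution: str) -> Callable[[Optional[str]], Optional[str]]:
    """Return a one-argument function with `resolution` baked in."""
    out_format = _FORMATS[resolution]  # raises KeyError for an unknown resolution

    def convert(value: Optional[str]) -> Optional[str]:
        # Pass nulls through unchanged, as a column may contain missing dates.
        if value is None:
            return None
        # Assumes input dates look like 2011/01/10.
        return datetime.strptime(value, "%Y/%m/%d").strftime(out_format)

    return convert


to_month = make_date_converter("Month")
print(to_month("2011/01/10"))  # 2011/01
```

In a Spark job, the function returned by make_date_converter could then be wrapped with udf and applied per column, one converter per resolution. That wiring is not shown here.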

Multi-statement Table Valued Function vs Inline Table Valued Function

…衆ロ難τιáo~ submitted on 2019-12-17 02:27:11
Question: A few examples to show, just in case: Inline Table Valued CREATE FUNCTION MyNS.GetUnshippedOrders() RETURNS TABLE AS RETURN SELECT a.SaleId, a.CustomerID, b.Qty FROM Sales.Sales a INNER JOIN Sales.SaleDetail b ON a.SaleId = b.SaleId INNER JOIN Production.Product c ON b.ProductID = c.ProductID WHERE a.ShipDate IS NULL GO Multi Statement Table Valued CREATE FUNCTION MyNS.GetLastShipped(@CustomerID INT) RETURNS @CustomerOrder TABLE (SaleOrderID INT NOT NULL, CustomerID INT NOT NULL, OrderDate

How to take formula inputs to a UDF like conditional formatting

与世无争的帅哥 submitted on 2019-12-14 03:23:02
Question: Imagine you want to check whether the leftmost letter of each word in a range is "a", then join the words for which that condition is true. One way is with a helper column that returns the word if it begins with "a" and "" otherwise, plus a total row that CONCAT()s over the helper column. Another way would be to use an array formula: {=CONCAT(IF(LEFT(range) = "a", range, ""))}. That's effectively using a helper column anyway. But what I want is to use the conditional formatting approach: When
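
Outside Excel, the idea of handing a condition to a function is usually expressed by passing a predicate. The sketch below is a Python analogue of the array formula above, not a VBA solution; concat_if and the sample words are hypothetical. Note that this comparison is case-sensitive, unlike Excel's = operator.

```python
from typing import Callable, Iterable


def concat_if(values: Iterable[str], predicate: Callable[[str], bool]) -> str:
    """Join the values for which predicate(value) is true, skipping the rest."""
    return "".join(v for v in values if predicate(v))


# Hypothetical sample data standing in for the worksheet range.
words = ["apple", "banana", "avocado", "cherry"]

# The condition is supplied by the caller, much like a conditional-formatting rule.
result = concat_if(words, lambda w: w[:1] == "a")
print(result)  # appleavocado
```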

Using user-defined functions within “curve” function in R graphics

送分小仙女□ submitted on 2019-12-13 21:14:17
Question: I need to produce normally distributed density plots with different total areas (summing to 1). Using the following function, I can specify the lambda, which gives the relative area: sdnorm <- function(x, mean=0, sd=1, lambda=1){lambda*dnorm(x, mean=mean, sd=sd)} I then want to plot the function using different parameters. Using ggplot2, this code works: require(ggplot2) qplot(x, geom="blank") + stat_function(fun=sdnorm,args=list(mean=8,sd=2,lambda=0.7)) + stat_function(fun=sdnorm

How to interleave lists [duplicate]

不羁的心 submitted on 2019-12-13 21:03:40
Question: This question already has answers here: How to elegantly interleave two lists of uneven length in Python? (7 answers). Closed 6 years ago. I have two lists that may differ in length, and I want to be able to interleave them. I want to append the extra values from the longer list at the end of my interleaved list. I have this: def interleave(xs,ys): a=xs b=ys minlength=[len(a),len(b)] extralist= list() interleave= list() for i in range((minval(minlength))): pair=a[i],b[i]
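
The behaviour described above (alternate elements, then append whatever is left of the longer list) can be sketched with itertools.zip_longest and a private sentinel. This is one possible approach, not the asker's code; the sentinel name _MISSING is an assumption for illustration.

```python
from itertools import zip_longest

# Unique sentinel so that legitimate values such as None are not dropped.
_MISSING = object()


def interleave(xs, ys):
    """Alternate items from xs and ys; leftovers of the longer list go at the end."""
    result = []
    for x, y in zip_longest(xs, ys, fillvalue=_MISSING):
        if x is not _MISSING:
            result.append(x)
        if y is not _MISSING:
            result.append(y)
    return result


print(interleave([1, 2, 3], ["a", "b"]))  # [1, 'a', 2, 'b', 3]
```

Because the shorter list is padded with the sentinel rather than truncated, no separate pass is needed to collect the extra values.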

Excel VBA user-defined function malfunctions when another Excel file is open

倖福魔咒の submitted on 2019-12-13 20:58:47
Question: My own Excel VBA function malfunctions and I don't know why. I want to apply one polynomial or another (whose coefficients are calculated in one sheet according to some rules) depending on the value of the input parameter of the function CONVERTemf(E). In the Excel sheet I have cells named "coef0_1", "coef1_1", "coef2_1", "Emin_1", "Emax_1" [for the first polynomial] and "coef0_2", "coef1_2", "coef2_2", "Emin_2", "Emax_2" [for the second one]. If "E" is between "Emin_1" and "Emax_1"
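
The range-dependent polynomial described above can be sketched in Python as follows. This is not the asker's VBA. It assumes a quadratic of the form coef0 + coef1*E + coef2*E^2, inclusive range bounds, and that the first matching range wins; Segment, convert_emf, and all numeric values are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Segment:
    """One polynomial and the input range it applies to."""
    e_min: float
    e_max: float
    coefs: Tuple[float, float, float]  # (coef0, coef1, coef2)


def convert_emf(e: float, segments: Sequence[Segment]) -> float:
    """Evaluate the polynomial of the first segment whose range contains e."""
    for seg in segments:
        if seg.e_min <= e <= seg.e_max:  # inclusive bounds are an assumption
            c0, c1, c2 = seg.coefs
            return c0 + c1 * e + c2 * e * e
    raise ValueError(f"E={e} is outside every configured range")


# Made-up coefficients, purely for illustration.
segments = [
    Segment(0.0, 10.0, (1.0, 2.0, 0.5)),
    Segment(10.0, 20.0, (0.0, 1.0, 0.0)),
]
print(convert_emf(2.0, segments))   # 7.0
print(convert_emf(15.0, segments))  # 15.0
```

Passing the coefficients in explicitly, rather than looking them up by name inside the function, keeps the result independent of whatever workbook or sheet happens to be active.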

How to return an array of struct or class from UDF into dataframe column value?

对着背影说爱祢 submitted on 2019-12-13 18:50:00
Question: d = [{'ID': '1', 'pID': 1000, 'startTime':'2018.07.02T03:34:20', 'endTime':'2018.07.03T02:40:20'}, {'ID': '1', 'pID': 1000, 'startTime':'2018.07.02T03:45:20', 'endTime':'2018.07.03T02:50:20'}, {'ID': '2', 'pID': 2000, 'startTime':'2018.07.02T03:34:20', 'endTime':'2018.07.03T02:40:20'}, {'ID': '2', 'pID': 2000, 'startTime':'2018.07.02T03:45:20', 'endTime':'2018.07.03T02:50:20'}] df = spark.createDataFrame(d) Dates = namedtuple("Dates", "startTime endTime") def MergeAdjacentUsage(timeSets):

Spark - How to apply a udf over single field in a Seq[Map<String,String>]

巧了我就是萌 submitted on 2019-12-13 18:07:56
Question: I have a DataFrame with two columns of types String and Seq[Map[String, String]]. Something like: Name Contact Alan [(Map(number -> 12345 , type -> home)), (Map(number -> 87878787 , type -> mobile))] Ben [(Map(number -> 94837593 , type -> job)),(Map(number -> 346 , type -> home))] What I need is to apply a UDF over the field number in each Map[String,String] of each element in the array. This UDF will basically convert into 0000 any number whose length is less than 6. Something like this:
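
The per-element transformation described above can be sketched as a plain Python function over a list of dictionaries. The question is about Scala, so this only illustrates the logic; mask_short_numbers and its parameters are hypothetical names, and the sample rows mirror the Alan and Ben rows from the question.

```python
from typing import Dict, List, Optional


def mask_short_numbers(
    contacts: Optional[List[Dict[str, str]]],
    min_length: int = 6,
    placeholder: str = "0000",
) -> Optional[List[Dict[str, str]]]:
    """Replace every 'number' shorter than min_length, leaving other fields untouched."""
    if contacts is None:
        return None
    masked = []
    for contact in contacts:
        updated = dict(contact)  # copy so the input is not mutated
        if "number" in updated and len(updated["number"]) < min_length:
            updated["number"] = placeholder
        masked.append(updated)
    return masked


alan = [{"number": "12345", "type": "home"}, {"number": "87878787", "type": "mobile"}]
ben = [{"number": "94837593", "type": "job"}, {"number": "346", "type": "home"}]

print(mask_short_numbers(alan))
print(mask_short_numbers(ben))
```

In PySpark, a function like this could be wrapped with udf using an ArrayType(MapType(StringType(), StringType())) return type and applied to the Contact column. That step is not shown here.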