user-defined-functions | 易学教程

How can I pass extra parameters to UDFs in Spark SQL?

Question: I want to parse the date columns in a DataFrame, and for each date column the resolution of the date may change (e.g. 2011/01/10 => 2011/01 if the resolution is set to "Month"). I wrote the following code: def convertDataFrame(dataframe: DataFrame, schema: Array[FieldDataType], resolution: Array[DateResolutionType]): DataFrame = { import org.apache.spark.sql.functions._ val convertDateFunc = udf{(x:String, resolution: DateResolutionType) => SparkDateTimeConverter.convertDate(x,
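
One common way to hand a per-column setting to a UDF is to bind it in a closure, so the function applied to the column still takes only the column value. The sketch below shows that pattern in plain Python. `make_date_truncator` and the resolution names "Year", "Month" and "Day" are hypothetical stand-ins for the question's `SparkDateTimeConverter.convertDate` and `DateResolutionType`, whose definitions are not shown in the excerpt.

```python
from typing import Callable

# Hypothetical resolutions, mapped to how many "/"-separated date parts to keep.
_PARTS_TO_KEEP = {"Year": 1, "Month": 2, "Day": 3}


def make_date_truncator(resolution: str) -> Callable[[str], str]:
    """Return a one-argument function with `resolution` already bound."""
    keep = _PARTS_TO_KEEP[resolution]  # raises KeyError for an unknown resolution

    def truncate(date_text: str) -> str:
        # With keep == 2, "2011/01/10" becomes "2011/01".
        return "/".join(date_text.split("/")[:keep])

    return truncate


to_month = make_date_truncator("Month")
print(to_month("2011/01/10"))
```

In Spark, a function built this way could be wrapped as a UDF once per column (for example with `pyspark.sql.functions.udf` in PySpark), so each column gets its own resolution. Passing the setting as a literal column is another option when its type can be represented as a Spark literal.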

Multi-statement Table Valued Function vs Inline Table Valued Function

Question: A few examples to show, just in case: Inline Table Valued CREATE FUNCTION MyNS.GetUnshippedOrders() RETURNS TABLE AS RETURN SELECT a.SaleId, a.CustomerID, b.Qty FROM Sales.Sales a INNER JOIN Sales.SaleDetail b ON a.SaleId = b.SaleId INNER JOIN Production.Product c ON b.ProductID = c.ProductID WHERE a.ShipDate IS NULL GO Multi Statement Table Valued CREATE FUNCTION MyNS.GetLastShipped(@CustomerID INT) RETURNS @CustomerOrder TABLE (SaleOrderID INT NOT NULL, CustomerID INT NOT NULL, OrderDate

How to take formula inputs to a UDF like conditional formatting

Question: Imagine you want to check whether the first letter of each word in a range is "a", then join the words for which that condition is true. One way is with a helper column, returning "" if not true and the word if it begins with "a", and then a total row which CONCAT()s over the helper column. Another way would be to use an array formula: {=CONCAT(IF(LEFT(range) = "a", range, ""))}. That's effectively using a helper column anyway. But what I want is to use the conditional formatting approach: When

Using user-defined functions within “curve” function in R graphics

Question: I need to produce normally distributed density plots with different total areas (summing to 1). Using the following function, I can specify lambda, which gives the relative area: sdnorm <- function(x, mean=0, sd=1, lambda=1){lambda*dnorm(x, mean=mean, sd=sd)} I then want to plot the function using different parameters. Using ggplot2, this code works: require(ggplot2) qplot(x, geom="blank") + stat_function(fun=sdnorm,args=list(mean=8,sd=2,lambda=0.7)) + stat_function(fun=sdnorm
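
A plotting helper that expects a function of x alone can still be given a multi-parameter density if the extra parameters are fixed beforehand. The following is a minimal Python sketch of that idea, not the R solution itself. `lam` replaces `lambda` (a reserved word in Python), `trapezoid_area` is a helper added only to check the areas, and the second component's parameters are hypothetical because the excerpt is cut off before them.

```python
import math
from functools import partial


def sdnorm(x, mean=0.0, sd=1.0, lam=1.0):
    """Normal density scaled by `lam`, so the area under the curve is `lam`."""
    z = (x - mean) / sd
    return lam * math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def trapezoid_area(f, lo, hi, steps=20000):
    """Approximate the integral of f over [lo, hi] with the trapezoidal rule."""
    width = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * width)
    return total * width


# Fix the extra parameters up front; what remains is a function of x alone.
curve_a = partial(sdnorm, mean=8.0, sd=2.0, lam=0.7)
curve_b = partial(sdnorm, mean=3.0, sd=1.0, lam=0.3)  # hypothetical second component

print(trapezoid_area(curve_a, -12.0, 28.0))
print(trapezoid_area(curve_b, -7.0, 13.0))
```

Each partially applied function can then be passed to anything that evaluates `f(x)` over a grid of x values.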

How to interleave lists [duplicate]

Question: I have two lists that may differ in length and I want to be able to interleave them, appending the extra values from the longer list to the end of the interleaved list. I have this: def interleave(xs,ys): a=xs b=ys minlength=[len(a),len(b)] extralist= list() interleave= list() for i in range((minval(minlength))): pair=a[i],b[i]
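
The behaviour described above (alternate elements, then append whatever is left of the longer list) can be sketched with `itertools.zip_longest` from the standard library. This is one possible approach, not the asker's code; the `_MISSING` sentinel is an illustrative helper so that `None` can still appear as a real list element.

```python
from itertools import zip_longest

# Unique sentinel marking "no element here", distinct from any real value.
_MISSING = object()


def interleave(xs, ys):
    """Alternate items from xs and ys; leftovers of the longer list go last."""
    result = []
    for x, y in zip_longest(xs, ys, fillvalue=_MISSING):
        if x is not _MISSING:
            result.append(x)
        if y is not _MISSING:
            result.append(y)
    return result


print(interleave([1, 2, 3, 4, 5], ["a", "b"]))  # [1, 'a', 2, 'b', 3, 4, 5]
```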

Malfunction of my own Excel-VBA function when another Excel file is opened

Question: My own Excel-VBA function malfunctions and I don't know why. I want to apply one polynomial or another (whose coefficients are calculated in one sheet according to some rules) depending on the value of the input parameter of the function CONVERTemf(E). In the Excel sheet I have cells named: "coef0_1", "coef1_1", "coef2_1", "Emin_1", "Emax_1" [for the first polynomial]; "coef0_2", "coef1_2", "coef2_2", "Emin_2", "Emax_2" [for the second one]. If "E" is between "Emin_1" and "Emax_1"
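
The range-based selection described above can be sketched independently of Excel. The Python example below assumes that each polynomial is quadratic (three coefficients, matching the named cells), that the range bounds are inclusive, and that the first matching range wins. The `Segment` type and the numeric values are hypothetical and not taken from the asker's workbook.

```python
from typing import NamedTuple, Sequence, Tuple


class Segment(NamedTuple):
    e_min: float                        # lower bound, e.g. the "Emin_1" cell
    e_max: float                        # upper bound, e.g. the "Emax_1" cell
    coefs: Tuple[float, float, float]   # (coef0, coef1, coef2)


def convert_emf(e: float, segments: Sequence[Segment]) -> float:
    """Evaluate the polynomial of the first segment whose range contains e."""
    for seg in segments:
        if seg.e_min <= e <= seg.e_max:  # inclusive bounds are an assumption
            c0, c1, c2 = seg.coefs
            return c0 + c1 * e + c2 * e * e
    raise ValueError(f"E={e} is outside every configured range")


# Hypothetical coefficients standing in for the values held in the named cells.
segments = [
    Segment(0.0, 10.0, (1.0, 2.0, 3.0)),
    Segment(10.0, 20.0, (0.0, 1.0, 0.0)),
]
print(convert_emf(2.0, segments))   # 1 + 2*2 + 3*4 = 17.0
print(convert_emf(15.0, segments))  # 15.0
```

Passing the coefficients in as arguments, rather than reading them from whichever workbook happens to be active, keeps the function's result independent of what else is open.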

How to return an array of struct or class from UDF into dataframe column value?

Question: d = [{'ID': '1', 'pID': 1000, 'startTime':'2018.07.02T03:34:20', 'endTime':'2018.07.03T02:40:20'}, {'ID': '1', 'pID': 1000, 'startTime':'2018.07.02T03:45:20', 'endTime':'2018.07.03T02:50:20'}, {'ID': '2', 'pID': 2000, 'startTime':'2018.07.02T03:34:20', 'endTime':'2018.07.03T02:40:20'}, {'ID': '2', 'pID': 2000, 'startTime':'2018.07.02T03:45:20', 'endTime':'2018.07.03T02:50:20'}] df = spark.createDataFrame(d) Dates = namedtuple("Dates", "startTime endTime") def MergeAdjacentUsage(timeSets):

Spark - How to apply a udf over single field in a Seq[Map<String,String>]

Question: I have a DataFrame with two columns of types String and Seq[Map[String, String]]. Something like:

Name  Contact
Alan  [(Map(number -> 12345 , type -> home)), (Map(number -> 87878787 , type -> mobile))]
Ben   [(Map(number -> 94837593 , type -> job)),(Map(number -> 346 , type -> home))]

So what I need is to apply a udf over the field number in each Map[String,String] of each element in the array. This udf will basically convert to 0000 any number whose length is less than 6. Something like this:
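
The per-element transformation itself can be sketched in plain Python, with a list of dicts standing in for the Seq[Map[String, String]] column value. `mask_short_numbers` and its parameter names are hypothetical. In Spark this logic would live inside a UDF applied to the Contact column, which is not shown here.

```python
def mask_short_numbers(contacts, min_length=6, replacement="0000"):
    """Return a copy of contacts where each short 'number' value is replaced."""
    return [
        {
            # Only the "number" field is touched; every other key is kept as-is.
            key: (replacement if key == "number" and len(value) < min_length else value)
            for key, value in contact.items()
        }
        for contact in contacts
    ]


alan = [
    {"number": "12345", "type": "home"},
    {"number": "87878787", "type": "mobile"},
]
print(mask_short_numbers(alan))
```

Building new dicts instead of mutating the input mirrors how a UDF returns a new column value rather than editing the row in place.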