user-defined-functions

Generate a DataFrame that follows a mathematical function for each column/row

南笙酒味 submitted on 2021-02-19 08:24:22
Question: Is there a way to create/generate a Pandas DataFrame from scratch, such that each record follows a specific mathematical function? Background: In financial mathematics, very basic financial derivatives (e.g. calls and puts) have closed-form pricing formulas (e.g. Black-Scholes). These pricing formulas can be called stochastic functions (because they involve a random term). I'm trying to create a Monte Carlo simulation of a stock price (and subsequently an option payoff and price based on the
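
A minimal sketch of one common approach, assuming a geometric Brownian motion price model; all parameters below (s0, mu, sigma, the 105 strike) are illustrative placeholders, not taken from the question:

    import numpy as np
    import pandas as pd

    def simulate_gbm(s0=100.0, mu=0.05, sigma=0.2, days=252, n_paths=1000, seed=42):
        # One column per simulated path, one row per trading day
        rng = np.random.default_rng(seed)
        dt = 1.0 / days
        # Daily log-return increments: (mu - sigma^2/2)*dt + sigma*sqrt(dt)*Z
        shocks = (mu - 0.5 * sigma ** 2) * dt \
            + sigma * np.sqrt(dt) * rng.standard_normal((days, n_paths))
        paths = s0 * np.exp(np.cumsum(shocks, axis=0))
        return pd.DataFrame(paths, columns=[f"path_{i}" for i in range(n_paths)])

    df = simulate_gbm()
    # Monte Carlo estimate of a European call payoff at expiry, strike 105
    payoff = (df.iloc[-1] - 105.0).clip(lower=0.0).mean()

Discounting the mean payoff by exp(-rT) would give the option price; that step is omitted here for brevity.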

How to register a Java Spark UDF in spark-shell?

三世轮回 submitted on 2021-02-19 07:35:34
Question: Below is my Java UDF code:

    package com.udf;

    import org.apache.spark.sql.api.java.UDF1;

    public class SparkUDF implements UDF1<String, String> {
        @Override
        public String call(String arg) throws Exception {
            if (validateString(arg))
                return arg;
            return "INVALID";
        }

        public static boolean validateString(String arg) {
            // Short-circuit && avoids a NullPointerException when arg is null
            return arg != null && arg.length() == 11;
        }
    }

I am building the jar with this class as SparkUdf-1.0-SNAPSHOT.jar. I have a table named sample in Hive
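
The excerpt cuts off before the registration step. As a hedged sketch, one way to register a compiled Java UDF without writing Scala is PySpark's registerJavaFunction; the SQL function name validate and the column name col1 are assumptions (only the jar, the class, and the table sample come from the question):

    # Launch with the jar on the classpath, e.g.:
    #   pyspark --jars SparkUdf-1.0-SNAPSHOT.jar
    from pyspark.sql import SparkSession
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    # Register the Java class as a SQL function: name, fully qualified
    # class name, and the UDF's return type
    spark.udf.registerJavaFunction("validate", "com.udf.SparkUDF", StringType())

    spark.sql("SELECT validate(col1) FROM sample").show()

In spark-shell itself, the equivalent is spark.udf.register with an instance of the class and its return DataType.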

Prevent system calls in Node.js when running untrusted user code

不问归期 submitted on 2021-02-19 06:58:03
Question: I am currently considering the issues of running user-supplied code in Node. I have two concerns: The user script must not read or write global state. For that, I assume I can simply spawn off a new process. Are there any other considerations? Do I have to hide the parent process from the child somehow, or is there no way a child can read, write, or otherwise toy with the parent process? The user script must not do anything funky with the system. So, I am thinking of disallowing any system calls. How
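
The question targets Node.js, but the isolation idea is language-agnostic. As an illustrative sketch in Python (not an answer to the Node.js specifics), run the untrusted script in a child process with hard resource limits; true system-call filtering still requires an OS mechanism such as seccomp or a container:

    import resource
    import subprocess

    def run_untrusted(script_path, timeout_s=5):
        def limit_resources():
            # Applied in the child just before exec: cap CPU seconds and
            # address space so the script cannot monopolize the machine
            resource.setrlimit(resource.RLIMIT_CPU, (timeout_s, timeout_s))
            limit = 256 * 1024 ** 2
            resource.setrlimit(resource.RLIMIT_AS, (limit, limit))

        # -I runs Python in isolated mode (no env vars, no user site dir);
        # POSIX-only because of preexec_fn
        return subprocess.run(
            ["python", "-I", script_path],
            preexec_fn=limit_resources,
            capture_output=True,
            timeout=timeout_s,
        )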

PySpark UDF optimization challenge using a dictionary with regexes (Scala?)

ぐ巨炮叔叔 submitted on 2021-02-18 17:09:50
Question: I am trying to optimize the code below (a PySpark UDF). It gives me the desired result (based on my data set), but it's too slow on very large datasets (approx. 180M). The results (accuracy) are better than those of the available Python modules (e.g. geotext, hdx-python-country), so I'm not looking for another module. DataFrame:

    df = spark.createDataFrame([
        ["3030 Whispering Pines Circle, Prosper Texas, US", "John"],
        ["Kalverstraat Amsterdam", "Mary"],
        ["Kalverstraat Amsterdam, Netherlands", "Lex"]
    ]).toDF(
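
A hedged sketch of one standard optimization for this pattern: compile the regexes once, broadcast the dictionary so each executor deserializes it a single time, and keep the UDF body minimal. The dictionary contents and column names below are illustrative assumptions, not the asker's data:

    import re
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.getOrCreate()

    country_patterns = {
        "US": re.compile(r"\b(US|Texas)\b", re.IGNORECASE),
        "NL": re.compile(r"\b(Netherlands|Amsterdam)\b", re.IGNORECASE),
    }
    # Broadcast once instead of shipping the dict with every task closure
    bc_patterns = spark.sparkContext.broadcast(country_patterns)

    @udf(returnType=StringType())
    def match_country(address):
        if address is None:
            return None
        for code, pattern in bc_patterns.value.items():
            if pattern.search(address):
                return code
        return None

    df = spark.createDataFrame(
        [["Kalverstraat Amsterdam, Netherlands", "Lex"]], ["address", "name"]
    )
    df.withColumn("country", match_country("address")).show()

At this scale, a pandas UDF (vectorized over Arrow batches) is usually the next step beyond a plain Python UDF.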

R - Defining a function which recognises arguments not as objects, but as being part of the call

会有一股神秘感。 submitted on 2021-02-11 16:46:16
Question: I'm trying to define a function which returns a graphical object in R. The idea is that I can then call this function with different arguments multiple times using a for loop or lapply, then plot the list of grobs with gridExtra::grid.arrange. However, I have not gotten that far yet: I'm having trouble with R recognising the arguments as being part of the call. I've made some code to show you my problem. I have tried quoting and unquoting the arguments, using unquote() in the
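
The excerpt is R-specific (lazy argument evaluation), but Python has a closely analogous pitfall when building plot-producing callables in a loop: late-binding closures capture the variable, not its value. A hedged illustration, with matplotlib standing in for the grobs in the question:

    import functools
    import matplotlib.pyplot as plt

    def make_plot(color):
        fig, ax = plt.subplots()
        ax.plot([0, 1, 2], [0, 1, 4], color=color)
        return fig

    # Buggy: every lambda captures the loop *variable*, so all three
    # would draw in "blue" when finally called
    broken = [lambda: make_plot(color) for color in ["red", "green", "blue"]]

    # Fixed: functools.partial binds the current value at definition time
    factories = [functools.partial(make_plot, c) for c in ["red", "green", "blue"]]
    figures = [factory() for factory in factories]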

Spark 3 Typed User Defined Aggregate Function over Window

*爱你&永不变心* submitted on 2021-02-11 15:12:56
Question: I am trying to use a custom user-defined aggregator over a window. When I use an untyped aggregator, the query works. However, I am unable to use a typed UDAF as a window function: I get an error stating "The query operator `Project` contains one or more unsupported expression types Aggregate, Window or Generate". The following basic program showcases the problem. I think it could work using UserDefinedAggregateFunction rather than Aggregator, but the former is deprecated. import scala
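
The program in the question is Scala; as a cross-language illustration (an assumption, not the asker's code), Spark 3 does accept a pandas grouped-aggregate UDF over a window in PySpark, which is one way to get a custom aggregation into a window expression:

    import pandas as pd
    from pyspark.sql import SparkSession, Window
    from pyspark.sql.functions import pandas_udf

    spark = SparkSession.builder.getOrCreate()

    # Series -> scalar type hints mark this as a grouped-aggregate pandas UDF
    @pandas_udf("double")
    def mean_udf(v: pd.Series) -> float:
        return v.mean()

    df = spark.createDataFrame(
        [("a", 1.0), ("a", 2.0), ("b", 3.0)], ["group", "value"]
    )
    w = (
        Window.partitionBy("group")
        .rowsBetween(Window.unboundedPreceding, Window.unboundedFollowing)
    )
    df.withColumn("group_mean", mean_udf("value").over(w)).show()

In Scala, the commonly cited route is wrapping the typed Aggregator with functions.udaf(...) to obtain an untyped UDF that window expressions accept.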

Working with a StructType column in PySpark UDF

老子叫甜甜 submitted on 2021-02-11 15:02:32
Question: I have the following schema for one of the columns that I'm processing:

    |-- time_to_resolution_remainingTime: struct (nullable = true)
    |    |-- _links: struct (nullable = true)
    |    |    |-- self: string (nullable = true)
    |    |-- completedCycles: array (nullable = true)
    |    |    |-- element: struct (containsNull = true)
    |    |    |    |-- breached: boolean (nullable = true)
    |    |    |    |-- elapsedTime: struct (nullable = true)
    |    |    |    |    |-- friendly: string (nullable = true)
    |    |    |    |    |-- millis: long (nullable = true)
    |    |    |    |--
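
A hedged sketch of handling this in a UDF: a StructType column arrives in Python as a Row, so nested fields are readable by attribute. The field names follow the schema above; the summing logic and the df variable are illustrative assumptions:

    from pyspark.sql.functions import udf
    from pyspark.sql.types import LongType

    @udf(returnType=LongType())
    def total_elapsed_millis(remaining_time):
        if remaining_time is None or remaining_time.completedCycles is None:
            return None
        # Each array element is itself a Row carrying an elapsedTime struct
        return sum(
            cycle.elapsedTime.millis
            for cycle in remaining_time.completedCycles
            if cycle.elapsedTime is not None
        )

    result = df.withColumn(
        "elapsed_millis",
        total_elapsed_millis("time_to_resolution_remainingTime"),
    )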

Count SQL Server User Created Functions based on Type

安稳与你 submitted on 2021-02-11 13:55:49
Question: I can count the total user-created functions in SQL Server using

    SELECT COUNT(*) FROM information_schema.routines WHERE routine_type = 'FUNCTION'

But this returns all functions, whether scalar-valued, inline, or table-valued. Is there a way to obtain a count specific to the type of function, e.g. count inline functions only? Answer 1: The distinction you are after is specific to SQL Server and probably not covered by the information_schema standard. You need to
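
The truncated answer is pointing at SQL Server's own catalog views rather than information_schema. A hedged sketch of that approach from Python (the DSN is a placeholder; the type codes are SQL Server's: FN scalar, IF inline table-valued, TF multi-statement table-valued):

    import pyodbc

    conn = pyodbc.connect("DSN=my_sql_server")  # placeholder connection

    sql = """
        SELECT type_desc, COUNT(*) AS n
        FROM sys.objects
        WHERE type IN ('FN', 'IF', 'TF')
        GROUP BY type_desc
    """
    for type_desc, n in conn.execute(sql):
        print(type_desc, n)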

Excel VBA User-Defined Function to query an Access Database

萝らか妹 submitted on 2021-02-11 12:48:57
Question: I have an Access 365 database that has invoice numbers, due dates, and amounts due. I'm trying to create an Excel UDF whereby I input the due date and invoice number, and the function queries the database and returns the amount due. The formula result is #VALUE! and there's no compiler error, though there appears to be an error when it attempts to open the recordset (I set up an error message box for this action). Perhaps there's an issue with my SQL? I'd appreciate any assistance with this
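
The question's code is VBA; as an illustrative sketch of the same lookup in Python (driver string, file path, and table/column names are all assumptions), a parameterized query sidesteps the date-formatting problems that often break hand-built SQL against Access:

    import datetime
    import pyodbc

    conn = pyodbc.connect(
        r"Driver={Microsoft Access Driver (*.mdb, *.accdb)};"
        r"DBQ=C:\data\invoices.accdb;"  # placeholder path
    )
    cursor = conn.cursor()
    # Parameter markers let the driver handle date and string typing
    cursor.execute(
        "SELECT AmountDue FROM Invoices WHERE InvoiceNumber = ? AND DueDate = ?",
        ("INV-1001", datetime.date(2021, 2, 28)),  # placeholder values
    )
    row = cursor.fetchone()
    amount_due = row[0] if row else None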