Java Spark Multiply all rows for column

这一生的挚爱 submitted on 2019-12-23 04:53:11

Question


Given this input data:

|BASE_CAP_RET|BASE_INC_RET|BASE_TOT_RET|acct_cd|eff_date           |id      |
+------------+------------+------------+-------+-------------------+--------+
|0.1         |0.2         |0.1         |acc1   |2004-01-01T00:00:00|10018069|
|0.2         |0.2         |0.1         |acc1   |2004-01-01T00:00:00|10018069|
|0.3         |0.2         |0.1         |acc1   |2004-01-02T00:00:00|10018069|

How do I multiply the values across all rows for each of the columns BASE_CAP_RET, BASE_INC_RET and BASE_TOT_RET, so that the result looks like this?

|BASE_CAP_RET|BASE_INC_RET|BASE_TOT_RET|acct_cd|eff_date           |id      |
+------------+------------+------------+-------+-------------------+--------+
|0.6         |0.8         |0.1         |acc1   |2004-01-01T00:00:00|10018069|

I am able to do this in Scala using:

scala> var sec_returns = returns.withColumn("returns",explode($"returns")).select("returns.*")

scala> import org.apache.spark.sql.functions._

scala> val prod = udf((vals:Seq[Double]) => vals.reduce(_*_))

scala> var ret_columns = sec_returns.columns.filter(col => col.contains("RET")).map(p => prod(collect_list(p)))

scala> sec_returns.groupBy(col("acct_cd"),col("sec_id")).agg(ret_columns.head, ret_columns.tail: _*).show()

How do I do it in Java?
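
The Scala snippet above boils down to two steps: group the rows by key, then reduce each RET column's values by multiplication (`vals.reduce(_*_)`). As a minimal sketch of just that logic, here is a plain-Java version that does not use Spark at all. The class name `ColumnProducts` and the method `productByGroup` are hypothetical names chosen for illustration, and the input is assumed to be already extracted into a list of keys and a parallel list of numeric column values.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spark: it only illustrates the
// group-then-multiply logic of the Scala snippet.
final class ColumnProducts {

    // keys.get(i) is the grouping key of row i (for example acct_cd);
    // values.get(i) holds that row's numeric columns in a fixed order.
    // Returns, per key, the column-wise product over all rows of that key.
    static Map<String, double[]> productByGroup(List<String> keys, List<double[]> values) {
        Map<String, double[]> result = new LinkedHashMap<>();
        for (int i = 0; i < keys.size(); i++) {
            double[] row = values.get(i);
            double[] acc = result.get(keys.get(i));
            if (acc == null) {
                // First row of the group: start from a copy so the input is not mutated.
                result.put(keys.get(i), row.clone());
            } else {
                // Later rows: multiply each column into the running product.
                for (int c = 0; c < row.length; c++) {
                    acc[c] *= row[c];
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Rows from the question: BASE_CAP_RET, BASE_INC_RET, BASE_TOT_RET, keyed by acct_cd.
        List<String> keys = Arrays.asList("acc1", "acc1", "acc1");
        List<double[]> values = Arrays.asList(
            new double[] {0.1, 0.2, 0.1},
            new double[] {0.2, 0.2, 0.1},
            new double[] {0.3, 0.2, 0.1});
        productByGroup(keys, values)
            .forEach((k, v) -> System.out.println(k + " -> " + Arrays.toString(v)));
    }
}
```

For the three sample rows the column products come out at roughly 0.006, 0.008 and 0.001 (subject to floating-point rounding), which is not the 0.6, 0.8 and 0.1 shown in the expected output above. In Spark's Java API the same reduction would presumably sit inside a UDF applied to `collect_list`, as in the Scala version; that wiring is not shown here.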

Source: https://stackoverflow.com/questions/58632733/java-spark-multiply-all-rows-for-column
