What type should the dense vector be when using a UDF in PySpark? [duplicate]
Question

This question already has an answer here: How to convert ArrayType to DenseVector in PySpark DataFrame? (1 answer)
Closed last year.

I want to convert a list column to a vector column in PySpark and then use that column to train a machine learning model. However, my Spark version is 1.6.0, which does not have VectorUDT(). What type should my UDF return?

from pyspark.sql import SQLContext
from pyspark import SparkContext, SparkConf
from pyspark.sql.functions import *
from pyspark.mllib
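One possible approach, sketched under assumptions rather than taken from the question: in Spark 1.x the vector classes, including VectorUDT, are exposed from pyspark.mllib.linalg (the pyspark.ml.linalg module only arrived in 2.0), so a UDF can declare VectorUDT() from there as its return type. The helper names to_float_list and make_list_to_vector_udf and the column names in the usage note are hypothetical, chosen for illustration only.

```python
def to_float_list(values):
    """Coerce a Python list (e.g. one ArrayType cell) to a list of floats."""
    return [float(v) for v in values]


def make_list_to_vector_udf():
    """Build a UDF that turns a list column into a DenseVector column.

    Assumption: Spark 1.x, where Vectors and VectorUDT are imported from
    pyspark.mllib.linalg. On Spark 2.0+ the equivalent import would be
    pyspark.ml.linalg. The imports are kept local so that this module can
    be loaded even where PySpark is not installed.
    """
    from pyspark.sql.functions import udf
    from pyspark.mllib.linalg import Vectors, VectorUDT

    # The second argument is the UDF's return type: a vector user-defined type.
    return udf(lambda values: Vectors.dense(to_float_list(values)), VectorUDT())
```

Assuming a DataFrame df with an array column named "values" (a hypothetical name), usage would look like df.withColumn("features", make_list_to_vector_udf()(df["values"])). Null cells are not handled in this sketch and would need a guard before calling Vectors.dense.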