I need to get the data type, pattern match on it, and convert it to some required format. However, referencing org.apache.spark.ml.linalg.VectorUDT fails with an error saying that VectorUDT is private. I specifically need org.apache.spark.ml.linalg.VectorUDT, not org.apache.spark.mllib.linalg.VectorUDT. Can someone suggest how to go about this?
For org.apache.spark.ml.linalg types, specify the schema using org.apache.spark.ml.linalg.SQLDataTypes, which provides singleton instances of the private UDT types:
- MatrixType for matrices (org.apache.spark.ml.linalg.Matrix).

    scala> org.apache.spark.ml.linalg.SQLDataTypes.MatrixType.getClass
    res0: Class[_ <: org.apache.spark.sql.types.DataType] = class org.apache.spark.ml.linalg.MatrixUDT

- VectorType for vectors (org.apache.spark.ml.linalg.Vector).

    scala> org.apache.spark.ml.linalg.SQLDataTypes.VectorType.getClass
    res1: Class[_ <: org.apache.spark.sql.types.DataType] = class org.apache.spark.ml.linalg.VectorUDT
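For the case match described in the question, one possible approach is to match against these singletons as stable identifiers rather than naming the private class. The following is a minimal sketch, not a definitive implementation: the helper name `describe`, the returned labels, and the example schema are hypothetical, and it assumes Spark 2.x or later with spark-mllib on the classpath.

```scala
import org.apache.spark.ml.linalg.SQLDataTypes.{MatrixType, VectorType}
import org.apache.spark.sql.types.{DataType, DoubleType, StructField, StructType}

// Hypothetical helper: maps a Spark SQL data type to a label.
// VectorType and MatrixType are stable identifiers, so each case
// is an equality check against the public singleton instance.
def describe(dt: DataType): String = dt match {
  case VectorType => "ml vector"
  case MatrixType => "ml matrix"
  case DoubleType => "double"
  case other      => s"unhandled: ${other.typeName}"
}

// Hypothetical schema, used only to show the match applied per field.
val schema = StructType(Seq(
  StructField("features", VectorType),
  StructField("label", DoubleType)
))

val described = schema.fields.map(f => f.name -> describe(f.dataType))
```

The same match can be applied to `df.schema.fields` of an existing DataFrame. Because the match compares against the singletons from SQLDataTypes, the code never needs to reference org.apache.spark.ml.linalg.VectorUDT directly.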
Source: https://stackoverflow.com/questions/45809316/vectorudt-usage