How can I convert a column that was read as a string into a column of arrays? That is, convert from the schema below:
    scala> test.printSchema
    root
     |-- a: long (
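In Python (PySpark) it would be:

    from pyspark.sql.functions import col, split

    # split() takes a regex (here: a comma plus optional whitespace) and
    # already returns array<string>. A bare "array" is not a valid cast target;
    # for another element type, append an explicit cast, e.g. .cast("array<int>").
    test = test.withColumn("b", split(col("b"), r",\s*"))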
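To see what the `,\s*` pattern does to a single value, here is a minimal plain-Python sketch that mimics the per-row effect of the split. It uses `re.split` rather than Spark, so it only approximates Spark's behavior. The helper `split_to_array` and the sample rows are hypothetical, made up for illustration, and the `int` element type is an assumption about the data.

```python
import re

def split_to_array(value, pattern=r",\s*", cast=None):
    """Hypothetical helper: split one string cell into a list.

    Returns None for a None cell (Spark's split also yields null for null input).
    If cast is given, it is applied to every element.
    """
    if value is None:
        return None
    parts = re.split(pattern, value)
    return [cast(p) for p in parts] if cast else parts

# Made-up rows shaped like the (a, b) columns in the question.
rows = [(1, "2, 3,4"), (2, "5"), (3, None)]

# Assumes the elements are integers; int() raises on non-numeric text.
converted = [(a, split_to_array(b, cast=int)) for a, b in rows]
print(converted)
```

Because the pattern consumes the whitespace after each comma, the elements need no separate trimming step before being cast.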