I\'m trying to create a schema for my new DataFrame and have tried various combinations of brackets and keywords but have been unable to figure out how to make this work. My cu
You will need an additional StructField for ArrayType property. This one should work:
from pyspark.sql.types import *
schema = StructType([
StructField("User", IntegerType()),
StructField("My_array", ArrayType(
StructType([
StructField("user", StringType()),
StructField("product", StringType()),
StructField("rating", DoubleType())
])
)
])
For more information check this link: http://nadbordrozd.github.io/blog/2016/05/22/one-weird-trick-that-will-fix-your-pyspark-schemas/