I am trying to run a Multinomial Logistic Regression model
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('prepare_data').getOrC
You need to make sure there are no missing values in your data -- that's why you get the NullPointerException. Also, make sure that all your input features to the VectorAssembler are numeric.
BTW, when you create the encoder, you might consider setting its inputCol to the StringIndexer's output column via getOutputCol(), so the two stages stay in sync.