Error: AttributeError: 'DataFrame' object has no attribute '_jdf'

余生分开走 2021-02-20 02:08

I want to perform k-fold cross-validation with PySpark to tune the hyperparameters, and I'm using pyspark.ml. I am getting an AttributeError:

AttributeError: 'DataFrame' object has no attribute '_jdf'

2 Answers
  •  太阳男子
    2021-02-20 02:38

    If the error is raised during metric evaluation, you probably did what I did:

    1. Transformed the test set with the Spark model correctly, then converted the result to a pandas DataFrame to peek at it.
    # Spark model: transform the test set, then convert to a pandas DataFrame
    predictions = model.transform(test)
    predDF = predictions.toPandas()
    predDF.head()
    
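    2. Then tried to evaluate the pandas copy:
    from pyspark.ml.evaluation import MulticlassClassificationEvaluator

    eval_acc = MulticlassClassificationEvaluator(
        labelCol='Label_index',
        predictionCol='prediction',
        metricName='accuracy'
    )

    # Evaluate performance
    acc = eval_acc.evaluate(predDF)  # Error: predDF is a pandas DataFrame
    print(f"accuracy: {acc}")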
    

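    I forgot that predDF is a pandas DataFrame. The evaluator needs predictions, because that is the Spark DataFrame: as far as I can tell, evaluate() reaches for the internal _jdf handle, which only a Spark DataFrame has.

    acc = eval_acc.evaluate(predictions)  # Works: predictions is a Spark DataFrame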
    print(f"accuracy: {acc}")
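
If this mix-up keeps happening, a small guard can turn the cryptic '_jdf' message into a readable one. The sketch below is only an illustration: is_spark_dataframe and require_spark_dataframe are hypothetical helper names, not part of PySpark, and the check assumes Spark DataFrame classes live in a module under pyspark.sql while pandas ones do not.

```python
def is_spark_dataframe(obj):
    """Return True if obj's class is named DataFrame and lives under pyspark.sql.

    Assumption: this inspects the class's module path instead of importing
    pyspark, so it also runs where Spark is not installed.
    """
    cls = type(obj)
    return cls.__name__ == "DataFrame" and cls.__module__.startswith("pyspark.sql")


def require_spark_dataframe(obj, name="dataset"):
    """Return obj unchanged, or raise a TypeError that names the actual type."""
    if not is_spark_dataframe(obj):
        cls = type(obj)
        raise TypeError(
            f"{name} must be a Spark DataFrame, got "
            f"{cls.__module__}.{cls.__qualname__}; pass the output of "
            "model.transform(...) rather than the result of .toPandas()"
        )
    return obj
```

With the names from the answer above, the call would become eval_acc.evaluate(require_spark_dataframe(predictions, "predictions")); passing predDF instead would fail immediately with a message naming the pandas type.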
    
