How to replace null values with a specific value in a DataFrame using Spark in Java?

抹茶落季 2020-12-05 14:00

I am trying to improve the accuracy of a logistic regression algorithm implemented in Spark using Java. To do this, I'm trying to replace null or invalid values present in a column.

4 answers
  •  忘掉有多难
    2020-12-05 14:35

    You can use the .na.fill function (it is defined in org.apache.spark.sql.DataFrameNaFunctions).

    The overload you need is, in its Scala signature: def fill(value: String, cols: Seq[String]): DataFrame. From Java, the matching overload takes a String[] for the column names.

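    You choose which columns to fill and the value that replaces the missing entries. A String fill value only affects string columns (other column types are ignored); the numeric overloads replace both null and NaN.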

    In your case it would be something like this (Scala):

    val df2 = df.na.fill("a", Seq("Name"))
                .na.fill("a2", Seq("Place"))
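
    Since the question asks about Java, here is a minimal sketch of the same two fills using the Java API. The class name FillNullsExample, the helper sampleData, the local[1] master, and the sample rows are assumptions made for illustration; only the column names "Name" and "Place" and the fill values come from the snippet above. It assumes Spark SQL 2.x or later is on the classpath.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

// Hypothetical example class; not part of the original answer.
public class FillNullsExample {

    // Replace nulls in the string columns "Name" and "Place"
    // with a separate default value for each column.
    static Dataset<Row> fillNulls(Dataset<Row> df) {
        return df.na().fill("a", new String[] {"Name"})
                 .na().fill("a2", new String[] {"Place"});
    }

    // Build a tiny two-row DataFrame with one null in each column (made-up data).
    static Dataset<Row> sampleData(SparkSession spark) {
        StructType schema = new StructType()
            .add("Name", DataTypes.StringType, true)
            .add("Place", DataTypes.StringType, true);
        List<Row> rows = Arrays.asList(
            RowFactory.create("Alice", null),
            RowFactory.create(null, "Paris"));
        return spark.createDataFrame(rows, schema);
    }

    public static void main(String[] args) {
        // Local single-threaded session, assumed here only so the sketch runs standalone.
        SparkSession spark = SparkSession.builder()
            .appName("FillNullsExample")
            .master("local[1]")
            .getOrCreate();

        fillNulls(sampleData(spark)).show();

        spark.stop();
    }
}
```

    DataFrameNaFunctions also has a fill overload that takes a java.util.Map of column name to replacement value, which can be tidier when many columns each need their own default.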
    
