I am trying to improve the accuracy of a logistic regression model implemented in Spark using Java. To do this, I'm trying to replace the null or invalid values present in a column.
To replace the null values with a given string, I've used the fill function that Spark exposes in Java through na(). It accepts the replacement value and a sequence of column names. Here is how I have implemented it:
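// Collect the single column to fix into a typed Java list (cols is assumed to be a String[]).
List<String> colList = new ArrayList<>();
colList.add(cols[i]);
// Convert the Java list into the Scala Seq<String> that fill(String, Seq<String>) expects.
Seq<String> colSeq = scala.collection.JavaConverters.asScalaIteratorConverter(colList.iterator()).asScala().toSeq();
// Replace nulls in the listed string columns with `word`.
data = data.na().fill(word, colSeq);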
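To be explicit about the behaviour I expect from this call, here is a minimal plain-Java sketch that involves no Spark at all. The names `NullFillSketch` and `fillNulls` are hypothetical and only for illustration, and a `List<String>` stands in for one string column. It replaces null entries only, which is also why I am unsure how to handle the "invalid" values: something like an empty string passes through unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spark: imitates the effect I expect from
// na().fill(word, cols) on a single string column, using a plain Java list.
class NullFillSketch {

    // Returns a copy of `column` in which every null entry is replaced by `word`.
    // Non-null entries, including empty strings, are copied unchanged.
    static List<String> fillNulls(List<String> column, String word) {
        List<String> filled = new ArrayList<>(column.size());
        for (String value : column) {
            filled.add(value == null ? word : value);
        }
        return filled;
    }

    public static void main(String[] args) {
        // One null and one empty string: only the null is replaced.
        List<String> column = Arrays.asList("a", null, "", "b");
        System.out.println(fillNulls(column, "unknown"));
    }
}
```

With the sample list above, the null becomes "unknown" while the empty string stays as it is, so any other value I consider invalid would need to be turned into null (or replaced separately) before the fill step.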