I am trying to filter an RDD as shown below:
spark_df = sc.createDataFrame(pandas_df)
spark_df.filter(lambda r: str(
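Convert the DataFrame into an RDD first. DataFrame.filter expects a column expression or a SQL string, not a Python lambda, whereas RDD.filter accepts a lambda. Also keep the RDD that filter returns, since filter does not modify spark_df in place:
spark_df = sc.createDataFrame(pandas_df)
filtered_rdd = spark_df.rdd.filter(lambda r: str(r['target']).startswith('good'))
filtered_rdd.take(5)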
I think it may work!
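In case it helps, here is a minimal sketch of the two ways to do this filter, assuming spark_df is a PySpark DataFrame with a string column named 'target' as in the question. The helper names (is_good, filter_good_rows_rdd, filter_good_rows_df) are made up for illustration and are not part of PySpark. The second variant stays in the DataFrame API and uses Column.startswith, so no conversion to an RDD is needed.

```python
def is_good(row):
    """Predicate: True when the row's 'target' value starts with 'good'.

    Works on anything indexable by field name, e.g. a pyspark Row or a dict.
    """
    return str(row['target']).startswith('good')


def filter_good_rows_rdd(spark_df):
    """RDD route: drop to the underlying RDD and filter with a Python predicate.

    Returns a new RDD of Row objects; spark_df itself is left unchanged.
    """
    return spark_df.rdd.filter(is_good)


def filter_good_rows_df(spark_df):
    """DataFrame route: filter with a column expression instead of a lambda.

    Returns a new DataFrame; spark_df itself is left unchanged.
    """
    return spark_df.filter(spark_df['target'].startswith('good'))
```

Usage would look like filter_good_rows_rdd(spark_df).take(5) or filter_good_rows_df(spark_df).show(5). The DataFrame route is usually the better default because the condition is evaluated inside Spark rather than row by row in Python, but the RDD route is handy when the predicate is arbitrary Python code.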