I have a PySpark DataFrame with a column containing strings. I want to split this column into words.
Code:
>>> sentenceData = sqlContext
Use the split function:
from pyspark.sql.functions import split
df.withColumn("desc", split("desc", r"\s+"))