This seems like it should be obvious, but in reviewing the docs and examples, I'm not sure I can find a way to take a structured stream and transform it using PySpark.
Another way, for a specific column (column_name), is to apply a UDF:
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType
def to_upper(string):
    return string.upper()
# Wrap the Python function as a UDF that returns a string
to_upper_udf = udf(to_upper, StringType())
# Add the transformed column, then drop the original
records = raw_records.withColumn(
    "new_column_name", to_upper_udf(raw_records["column_name"])
).drop("column_name")
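
One thing to watch for: Spark hands null cells to a Python UDF as None, so calling .upper() directly would raise on a null value. Below is a minimal null-safe sketch of the same idea. The helper name with_upper_column is hypothetical (it is not part of the PySpark API), and it assumes the source and target column names differ, since the source column is dropped at the end.

```python
def to_upper(value):
    """Upper-case a string; pass None (a null cell) through unchanged."""
    return value.upper() if value is not None else None


def with_upper_column(df, source_col, target_col):
    """Hypothetical helper: add target_col as the upper-cased source_col, then drop source_col."""
    # Imported lazily so to_upper can be tried out without a Spark installation.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    to_upper_udf = udf(to_upper, StringType())
    # Assumes source_col != target_col; otherwise drop() would remove the new column too.
    return df.withColumn(target_col, to_upper_udf(df[source_col])).drop(source_col)
```

Usage would look like records = with_upper_column(raw_records, "column_name", "new_column_name"). If upper-casing is all you need, the built-in pyspark.sql.functions.upper does the same thing without the overhead of a Python UDF.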