How to transform structured streams with PySpark?

离开以前 2020-12-10 18:46

This seems like it should be obvious, but after reviewing the docs and examples, I'm not sure I can find a way to take a structured stream and transform it using PySpark.

2 answers
  •  旧时难觅i
    2020-12-10 19:19

    Another way, when you only need to transform a specific column (column_name), is to apply a UDF with withColumn:

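    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType
    
    def to_upper(string):
        # Pass nulls through; calling .upper() on None would fail the task.
        return string.upper() if string is not None else None
    
    # Wrap the Python function as a UDF that returns a string column.
    to_upper_udf = udf(to_upper, StringType())
    
    # raw_records is assumed to be the streaming DataFrame; add the
    # transformed column and drop the original one.
    records = raw_records.withColumn(
        "new_column_name", to_upper_udf(raw_records["column_name"])
    ).drop("column_name")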
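
For context, here is a minimal sketch of how the snippet above might sit inside a complete streaming job. The rate source, the console sink, the app name, and the helper names add_upper_column and run_demo are assumptions made for illustration, not part of the original answer. The PySpark imports are deferred into the functions so that to_upper can be checked without a Spark installation.

```python
def to_upper(value):
    """Upper-case a string, passing None (SQL NULL) through unchanged."""
    return value.upper() if value is not None else None


def add_upper_column(df, source_col, target_col):
    """Return df with source_col replaced by an upper-cased target_col.

    Hypothetical helper: the same call works on batch and streaming
    DataFrames, because withColumn and drop are supported on both.
    """
    # Imported lazily so to_upper can be unit-tested without Spark.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    to_upper_udf = udf(to_upper, StringType())
    return df.withColumn(target_col, to_upper_udf(df[source_col])).drop(source_col)


def run_demo():
    """Hypothetical end-to-end run: rate source -> transform -> console sink."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("upper-demo").getOrCreate()

    # The built-in "rate" source emits (timestamp, value) rows; build a
    # string column from value so there is something to upper-case.
    raw_records = (
        spark.readStream.format("rate")
        .option("rowsPerSecond", 1)
        .load()
        .selectExpr("concat('row-', CAST(value AS STRING)) AS column_name")
    )

    records = add_upper_column(raw_records, "column_name", "new_column_name")

    # Print each micro-batch to stdout; append mode suits a stream
    # without aggregations.
    query = records.writeStream.format("console").outputMode("append").start()
    query.awaitTermination(10)  # wait up to about 10 seconds
    query.stop()
    spark.stop()
```

Calling run_demo() on a machine with PySpark installed should print a few micro-batches to the console. For a simple case like upper-casing, the built-in pyspark.sql.functions.upper is probably a better fit than a Python UDF, since it avoids passing each row through a Python worker.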
    
    
