Question
How to create a new column in PySpark and fill this column with the date of today?
This is what I tried:
import datetime
now = datetime.datetime.now()
df = df.withColumn("date", str(now)[:10])
I get this error:
AssertionError: col should be Column
Answer 1:
How to create a new column in PySpark and fill this column with the date of today?
There is already a function for that:
from pyspark.sql.functions import current_date
df.withColumn("date", current_date().cast("string"))
AssertionError: col should be Column
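Use a literal. withColumn expects a Column as its second argument, but str(now)[:10] is a plain Python string, so wrap it with lit: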
from pyspark.sql.functions import lit
df.withColumn("date", lit(str(now)[:10]))
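For reference, here is a minimal self-contained sketch that puts both approaches side by side. The helper names today_iso, add_today_columns and demo, the column names date_spark and date_python, and the tiny two-row DataFrame are assumptions made up for illustration; they are not part of the question. The two columns are not quite equivalent: current_date() is evaluated by Spark when the query runs, while lit(...) freezes whatever string the Python driver computed when the expression was built.

```python
import datetime


def today_iso():
    """Return today's date as a 'YYYY-MM-DD' string, computed on the driver in Python."""
    return datetime.date.today().isoformat()


def add_today_columns(df):
    """Return df with today's date added twice: once via Spark, once via a literal.

    Hypothetical helper for illustration; df is assumed to be a pyspark.sql.DataFrame.
    """
    # Imported inside the function so today_iso() stays usable without PySpark installed.
    from pyspark.sql.functions import current_date, lit

    return (
        # Evaluated by Spark at query time, then cast from DateType to string.
        df.withColumn("date_spark", current_date().cast("string"))
        # A Python string must be wrapped in lit() to become a Column.
        .withColumn("date_python", lit(today_iso()))
    )


def demo():
    """Build a small local DataFrame and show both date columns (requires PySpark)."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").appName("today-column").getOrCreate()
    df = spark.createDataFrame([(1,), (2,)], ["id"])
    add_today_columns(df).show()
    spark.stop()
```

Calling demo() in an environment with PySpark installed should print the two rows with both date columns filled in. If a real date type is preferred over a string, dropping the .cast("string") keeps the Spark column as DateType.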
Source: https://stackoverflow.com/questions/47903905/assertionerror-col-should-be-column