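1. Replacing null values with another value
When the DataFrame is built, a missing value is written as the literal null (a real null, not a string).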
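// In spark-shell, spark.implicits._ (needed for toDF) is already imported.
val df = Seq("a", null, "c", "b").toDF("col1")
df.show()
// Fill the nulls in col1 with the string "qqq".
val df4 = df.na.fill("qqq", Array("col1"))
df4.show()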
df: org.apache.spark.sql.DataFrame = [col1: string]
+----+
|col1|
+----+
|   a|
|null|
|   c|
|   b|
+----+
df4: org.apache.spark.sql.DataFrame = [col1: string]
+----+
|col1|
+----+
|   a|
| qqq|
|   c|
|   b|
+----+
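When several columns need different fill values, na.fill also accepts a Map from column name to value. The following is a minimal sketch, assuming a local SparkSession; the DataFrame people and its columns name and age are hypothetical and do not come from the example above.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("na-fill-sketch").getOrCreate()
import spark.implicits._

// Hypothetical data: Option models a nullable value, None becomes null.
val people = Seq[(Option[String], Option[Int])](
  (Some("a"), Some(1)),
  (None, Some(2)),
  (Some("c"), None)
).toDF("name", "age")

// Fill each column with a value of its own type in a single call.
val filled = people.na.fill(Map("name" -> "unknown", "age" -> 0))
filled.show()
```

Columns that are not listed in the Map are left untouched, so nulls in them survive the call.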
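2. Converting other values to null
Here the replacement value is the string "null". Note that regexp_replace writes that four-character string, not a real null, and na.fill does not touch such a string. In the example below, df contains no "NullNone" value, so the null that show() prints is the original null from df, which is why na.fill still replaces it afterwards.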
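import org.apache.spark.sql.functions.{col, regexp_replace}
// Replace every occurrence of "NullNone" in col1 with the string "null".
val df2 = df.withColumn("col1", regexp_replace(col("col1"), "NullNone", "null"))
df2.show()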
df2: org.apache.spark.sql.DataFrame = [col1: string]
+----+
|col1|
+----+
|   a|
|null|
|   c|
|   b|
+----+
// na.fill only replaces real nulls in the listed columns.
val df3 = df2.na.fill("qqq", Array("col1"))
df3.show()
df3: org.apache.spark.sql.DataFrame = [col1: string]
+----+
|col1|
+----+
|   a|
| qqq|
|   c|
|   b|
+----+
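To turn a placeholder into a real null rather than the string "null", one option is when(...).otherwise(...) combined with lit(null). The sketch below assumes a local SparkSession and a hypothetical DataFrame raw in which "NullNone" marks a missing value; it is an illustration, not the original author's code.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit, when}

// Assumption: a local session; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("to-null-sketch").getOrCreate()
import spark.implicits._

// Hypothetical input: "NullNone" is a placeholder for a missing value.
val raw = Seq("a", "NullNone", "c", "b").toDF("col1")

// Map the placeholder to a real null; every other value is kept unchanged.
val withNull = raw.withColumn(
  "col1",
  when(col("col1") === "NullNone", lit(null).cast("string")).otherwise(col("col1"))
)

// isNull is true only for a real null, never for the string "null".
withNull.filter(col("col1").isNull).show()

// Because the value is a real null, na.fill can replace it.
val refilled = withNull.na.fill("qqq", Array("col1"))
refilled.show()
```

The cast to string keeps the column type explicit; the isNull filter is a quick way to check whether a value that show() prints as null is a real null or only a string.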
Source: CSDN
Author: 楓尘林间
Link: https://blog.csdn.net/bowenlaw/article/details/104431704