Filter Spark/Scala DataFrame if column is present in set

耶瑟儿~ 2021-01-14 02:55

I'm using Spark 1.4.0. This is what I have so far:

data.filter($"myColumn".in(lit("A"), lit("B"), lit("C"), ...))

The function lit

2 answers
  •  佛祖请我去吃肉
    2021-01-14 03:12

    This PR has been merged into Spark 2.4, so you can now use isInCollection:

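    // Assumes a spark-shell session, or `import spark.implicits._` in scope,
    // for toDF and the $"..." column syntax.
    val profileDF = Seq(
      Some(1), Some(2), Some(3), Some(4),
      Some(5), Some(6), Some(7), None
    ).toDF("profileID")

    // Set[Any] allows mixed element types; note that the string "3"
    // matches profileID 3 in the output below.
    val validUsers: Set[Any] = Set(6, 7.toShort, 8L, "3")

    // isInCollection accepts a Scala collection directly (Spark 2.4+).
    val result = profileDF.withColumn("isValid", $"profileID".isInCollection(validUsers))

    result.show(10)
    // +---------+-------+
    // |profileID|isValid|
    // +---------+-------+
    // |        1|  false|
    // |        2|  false|
    // |        3|   true|
    // |        4|  false|
    // |        5|  false|
    // |        6|   true|
    // |        7|   true|
    // |     null|   null|
    // +---------+-------+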
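
    On Spark versions before 2.4, where isInCollection is not available, a similar filter can probably be written with Column.isin (added in Spark 1.5) by expanding the set into varargs. The following is only a sketch: the helper name filterBySet and the sample data are made up for illustration, and it uses SparkSession, so it assumes Spark 2.0 or later.

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.sql.functions.col

    // Reuses the active session in spark-shell, or starts a local one otherwise.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("isin-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical helper: keep only rows whose `column` value is in `allowed`.
    // Column.isin takes varargs (Any*), so the set is expanded with `: _*`.
    def filterBySet(df: DataFrame, column: String, allowed: Set[String]): DataFrame =
      df.filter(col(column).isin(allowed.toSeq: _*))

    // Made-up sample data using the column name from the question.
    val data = Seq("A", "B", "D").toDF("myColumn")
    val kept = filterBySet(data, "myColumn", Set("A", "B", "C"))
    kept.show()
    ```

    On Spark 1.4 itself, as in the question, the in method takes Column arguments, so the equivalent would presumably be to wrap each element with lit before expanding the sequence, e.g. in(allowed.toSeq.map(lit(_)): _*).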
    
