How to handle null entries in SparkR

滥情空心 2020-12-20 17:05

I have a SparkSQL DataFrame.

Some entries in this data are empty, but they don't behave like NULL or NA. How can I remove them? Any ideas?

In R I can easi

2 Answers
  •  粉色の甜心
    2020-12-20 17:38

    It is not the nicest workaround, but if you cast the column to string, the missing values are stored as "NaN", and you can then filter them out. A short example:

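    # Build a small SparkR DataFrame with one missing value in column b
    testFrame   <- createDataFrame(sqlContext, data.frame(a=c(1,2,3),b=c(1,NA,3)))
    # Cast b to string so the missing value shows up as "NaN"
    testFrame$c <- cast(testFrame$b,"string")
    
    # Keep only rows whose string value is not "NaN", then drop the helper column
    resultFrame <- collect(filter(testFrame, testFrame$c!="NaN"))
    resultFrame$c <- NULL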
    

    This omits the entire row where the element in column b is missing.
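
    As a possible alternative to the string cast, here is a minimal sketch that assumes a SparkR 2.x installation, where the session is started with sparkR.session() and createDataFrame no longer takes sqlContext. That setup is an assumption and differs from the example above. It uses dropna, which removes rows in which the listed columns are null or NaN:

    ```r
    library(SparkR)

    # Assumption: SparkR 2.x with a local Spark installation available
    sparkR.session()

    # Same toy data as above: column b has one missing value
    testFrame <- createDataFrame(data.frame(a = c(1, 2, 3), b = c(1, NA, 3)))

    # Drop every row where column b is null or NaN, then bring the result into R
    resultFrame <- collect(dropna(testFrame, cols = "b"))
    ```

    This avoids the helper column, but whether it behaves the same on older SparkR versions has not been checked here.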
