How to use countDistinct in Scala with Spark?
Question

I tried to use the countDistinct function, which should be available in Spark 1.5 according to the Databricks blog. However, I got the following exception:

Exception in thread "main" org.apache.spark.sql.AnalysisException: undefined function countDistinct;

On the Spark developers' mailing list, the suggestion is to combine the count and distinct functions to get the same result that countDistinct would produce:

count(distinct <columnName>) // instead of countDistinct(<columnName>)

Because I build