I am trying to make a histogram with a column from a dataframe which looks like
DataFrame[C0: int, C1: int, ...]
If I were to make a histog
Let's say your values in C1 are between 1-1000 and you want to get a histogram of 10 bins. You can do something like: df.withColumn("bins", df.C1/100).groupBy("bins").count() If your binning is more complex you can make a UDF for it (and at worse, you might need to analyze the column first, e.g. by using describe or through some other method).