I am trying to train an SVM on a very large dataset, which I am unable to do with sklearn: training takes far too long to finish. So I decided to use PySpark. Here are my spark c