I am reducing the dimensionality of a Spark DataFrame with a PCA model in PySpark (using the spark.ml library) as follows:
In Spark 2.2+ you can now easily get the explained variance as:
from pyspark.ml.feature import VectorAssembler

# feature_cols (a placeholder for your list of numeric column names) and
# raw_df (a placeholder for your source DataFrame) must be defined beforehand
assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
df = assembler.transform(raw_df).select("features")
from pyspark.ml.feature import PCA

# Project onto the first k=10 principal components
pca = PCA(k=10, inputCol="features", outputCol="pcaFeatures")
model = pca.fit(df)

# explainedVariance is a DenseVector of per-component variance ratios;
# summing it gives the fraction of total variance retained by the 10 components
sum(model.explainedVariance)
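For intuition, the ratios that `model.explainedVariance` reports are the eigenvalues of the centered data's covariance matrix, each divided by their total. Here is a small NumPy sketch of that same computation on toy data (NumPy is used purely for illustration and is not part of the Spark pipeline; the data is random):

```python
import numpy as np

# Toy data: 100 samples, 3 features (random, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))

# Center the data, then eigendecompose its covariance matrix
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]  # descending order

# Per-component explained-variance ratios, analogous to model.explainedVariance
ratios = eigvals / eigvals.sum()

# Keeping all components explains all of the variance
print(round(ratios.sum(), 6))  # → 1.0
```

Summing the first k entries of these ratios tells you how much variance a k-component projection keeps, which is exactly what `sum(model.explainedVariance)` reports for the fitted Spark model.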