When choosing the number of principal components (k), we choose k to be the smallest value so that for example, 99% of variance, is retained.
However, in the Pytho
Although this question is older than 2 years i want to provide an update on this. I wanted to do the same and it looks like sklearn now provides this feature out of the box.
As stated in the docs
if 0 < n_components < 1 and svd_solver == ‘full’, select the number of components such that the amount of variance that needs to be explained is greater than the percentage specified by n_components
So the code required is now
my_model = PCA(n_components=0.99, svd_solver='full')
my_model.fit_transform(my_matrix)