PCA Implementation in Java [closed]

人走茶凉 submitted on 2020-01-22 09:53:48

Question


I need an implementation of PCA in Java. I am looking for something that is well documented, practical, and easy to use. Any recommendations?


Answer 1:


There are now a number of Principal Component Analysis implementations for Java.

  1. Apache Spark: https://spark.apache.org/docs/2.1.0/mllib-dimensionality-reduction.html#principal-component-analysis-pca

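    import java.util.Arrays;
    import java.util.List;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.mllib.feature.PCA;
    import org.apache.spark.mllib.feature.PCAModel;
    import org.apache.spark.mllib.linalg.Matrix;
    import org.apache.spark.mllib.linalg.Vector;
    import org.apache.spark.mllib.linalg.Vectors;
    import org.apache.spark.rdd.RDD;

    SparkConf conf = new SparkConf().setAppName("PCAExample").setMaster("local");
    try (JavaSparkContext sc = new JavaSparkContext(conf)) {
        // Create points as Spark MLlib vectors (not java.util.Vector)
        List<Vector> vectors = Arrays.asList(
                Vectors.dense(-1.0, -1.0),
                Vectors.dense(-1.0, 1.0),
                Vectors.dense(1.0, 1.0));

        // Distribute the points and unwrap the underlying Scala RDD
        JavaRDD<Vector> distData = sc.parallelize(vectors);
        RDD<Vector> vectorRDD = distData.rdd();

        // Fit PCA, keeping the top 2 principal components
        PCA pca = new PCA(2);
        PCAModel pcaModel = pca.fit(vectorRDD);
        // Each column of this matrix is one principal component
        Matrix matrix = pcaModel.pc();
    }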
    
  2. ND4J: http://nd4j.org/doc/org/nd4j/linalg/dimensionalityreduction/PCA.html

    //Create points as NDArray instances
    List<INDArray> ndArrays = Arrays.asList(
            new NDArray(new float [] {-1.0F, -1.0F}),
            new NDArray(new float [] {-1.0F, 1.0F}),
            new NDArray(new float [] {1.0F, 1.0F}));
    
    //Create matrix of points (rows are observations; columns are features)
    INDArray matrix = new NDArray(ndArrays, new int [] {3,2});
    
    //Execute PCA - again to 2 dimensions
    INDArray factors = PCA.pca_factor(matrix, 2, false);
    
  3. Apache Commons Math (single threaded; no framework)

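    import org.apache.commons.math3.linear.EigenDecomposition;
    import org.apache.commons.math3.linear.MatrixUtils;
    import org.apache.commons.math3.linear.RealMatrix;
    import org.apache.commons.math3.stat.correlation.Covariance;

    // Create points in a double array (rows are observations; columns are features)
    double[][] pointsArray = new double[][] {
        new double[] { -1.0, -1.0 },
        new double[] { -1.0, 1.0 },
        new double[] { 1.0, 1.0 } };

    // Create a real matrix from the points
    RealMatrix realMatrix = MatrixUtils.createRealMatrix(pointsArray);

    // Build the covariance matrix of the points, then find its eigenvectors
    // see https://stats.stackexchange.com/questions/2691/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues
    Covariance covariance = new Covariance(realMatrix);
    RealMatrix covarianceMatrix = covariance.getCovarianceMatrix();
    EigenDecomposition ed = new EigenDecomposition(covarianceMatrix);

    // Columns of V are the eigenvectors (principal axes); the eigenvalues are their variances
    RealMatrix eigenvectors = ed.getV();
    double[] eigenvalues = ed.getRealEigenvalues();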
    

Note that Singular Value Decomposition (SVD), which can also be used to find principal components, has equivalent implementations.
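To make the covariance and eigenvector steps from the Commons Math example concrete, here is a minimal plain-Java sketch for the same three 2-D points. It is not taken from any of the libraries above: the class `Pca2dSketch` and its method names are made up for illustration, it uses the sample covariance (dividing by n - 1), and it relies on the closed-form eigendecomposition of a symmetric 2x2 matrix, so it only handles two features.

```java
import java.util.Arrays;

/** Plain-Java PCA sketch for 2-D data; class and method names are illustrative only. */
final class Pca2dSketch {

    /** Sample covariance matrix (divides by n - 1); rows are observations, 2 columns. */
    static double[][] covariance(double[][] points) {
        int n = points.length;
        double meanX = 0.0, meanY = 0.0;
        for (double[] p : points) {
            meanX += p[0] / n;
            meanY += p[1] / n;
        }
        double sxx = 0.0, sxy = 0.0, syy = 0.0;
        for (double[] p : points) {
            double dx = p[0] - meanX;
            double dy = p[1] - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        double d = n - 1;
        return new double[][] { { sxx / d, sxy / d }, { sxy / d, syy / d } };
    }

    /** Eigenvalues of a symmetric 2x2 matrix, largest first (closed form). */
    static double[] eigenvalues(double[][] m) {
        double mid = (m[0][0] + m[1][1]) / 2.0;
        double halfDiff = (m[0][0] - m[1][1]) / 2.0;
        double radius = Math.sqrt(halfDiff * halfDiff + m[0][1] * m[0][1]);
        return new double[] { mid + radius, mid - radius };
    }

    /** Unit eigenvector of a symmetric 2x2 matrix for the given eigenvalue. */
    static double[] eigenvector(double[][] m, double lambda) {
        double b = m[0][1];
        if (b == 0.0) {
            // Diagonal matrix: the eigenvectors are the coordinate axes.
            // (If both eigenvalues are equal, any direction qualifies.)
            return Math.abs(lambda - m[0][0]) <= Math.abs(lambda - m[1][1])
                    ? new double[] { 1.0, 0.0 } : new double[] { 0.0, 1.0 };
        }
        // (b, lambda - a) solves (M - lambda I) v = 0 when b != 0.
        double x = b;
        double y = lambda - m[0][0];
        double norm = Math.hypot(x, y);
        return new double[] { x / norm, y / norm };
    }

    public static void main(String[] args) {
        double[][] points = { { -1.0, -1.0 }, { -1.0, 1.0 }, { 1.0, 1.0 } };
        double[][] cov = covariance(points);
        double[] values = eigenvalues(cov);
        double[] first = eigenvector(cov, values[0]);
        System.out.println("covariance  = " + Arrays.deepToString(cov));
        System.out.println("eigenvalues = " + Arrays.toString(values));
        System.out.println("first PC    = " + Arrays.toString(first));
        System.out.println("explained   = " + values[0] / (values[0] + values[1]));
    }
}
```

For these points the covariance matrix works out to [[4/3, 2/3], [2/3, 4/3]], so the eigenvalues are 2 and 2/3 and the first principal axis lies along (1, 1); the first component therefore explains 75% of the variance. For more than two features, one of the libraries above is the better choice.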




Answer 2:


Here is one: PCA Class.

This class contains the methods necessary for a basic Principal Component Analysis with a varimax rotation. Options are available for an analysis using either the covariance or the correlation matrix. A parallel analysis, using Monte Carlo simulations, is performed. Extraction criteria based on eigenvalues greater than unity, greater than a Monte Carlo eigenvalue percentile or greater than the Monte Carlo eigenvalue means are available.
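The "eigenvalue greater than unity" rule and the comparison against Monte Carlo eigenvalues can be sketched as follows. This only illustrates the idea and is not the code of the PCA class mentioned above: `RetentionSketch`, `countRetained`, `countAboveUnity`, and the numbers in `main` are hypothetical, the eigenvalues are assumed to be sorted in descending order, and retention is assumed to stop at the first component that fails its threshold.

```java
import java.util.Arrays;

/** Sketch of eigenvalue-based extraction criteria; all names are illustrative. */
final class RetentionSketch {

    /**
     * Counts the leading components whose eigenvalue exceeds the matching threshold.
     * Assumes eigenvalues are sorted in descending order and that thresholds
     * has at least as many entries as eigenvalues.
     */
    static int countRetained(double[] eigenvalues, double[] thresholds) {
        int retained = 0;
        while (retained < eigenvalues.length
                && eigenvalues[retained] > thresholds[retained]) {
            retained++;
        }
        return retained;
    }

    /** "Greater than unity" criterion: every threshold is 1.0. */
    static int countAboveUnity(double[] eigenvalues) {
        double[] ones = new double[eigenvalues.length];
        Arrays.fill(ones, 1.0);
        return countRetained(eigenvalues, ones);
    }

    public static void main(String[] args) {
        // Made-up numbers for illustration, not results from a real analysis.
        double[] eigenvalues = { 2.5, 1.2, 0.8, 0.5 };
        double[] monteCarloMeans = { 1.4, 1.25, 0.9, 0.45 };

        System.out.println(countAboveUnity(eigenvalues));                // prints 2
        System.out.println(countRetained(eigenvalues, monteCarloMeans)); // prints 1
    }
}
```

The same `countRetained` helper would cover the percentile criterion by passing the Monte Carlo percentile values as thresholds instead of the means.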




Answer 3:


Check http://weka.sourceforge.net/doc.stable/weka/attributeSelection/PrincipalComponents.html. Weka also has many other algorithms that can be used along with PCA, and it adds more from time to time. So I think that if you are working in Java, you should switch to the Weka API.




Answer 4:


Smile is a full-fledged ML library for Java. You can give its PCA implementation a try. Please see: https://haifengl.github.io/smile/api/java/smile/projection/PCA.html

There is also a PCA tutorial for Smile, but the tutorial uses Scala.




Answer 5:


You can see a few implementations of PCA in the DataMelt project:

https://jwork.org/dmelt/code/index.php?keyword=PCA

(they are rewritten in Jython). They include some graphical examples of dimensionality reduction and show how to use several Java packages, such as JSAT, DatumBox and others.



Source: https://stackoverflow.com/questions/10604507/pca-implementation-in-java
