psych::principal - explanation for the order and naming of rotated (principal) components

Question

Let x be a sample dataset (a 10 x 4 matrix of random normal values).

set.seed(0)
x <- replicate(4, rnorm(10))

A PCA using the principal function from the psych package will yield:

> principal(x, nf=4, rotate="none")
...
                       PC1  PC2  PC3  PC4
SS loadings           1.91 1.09 0.68 0.31
Proportion Var        0.48 0.27 0.17 0.08
Cumulative Var        0.48 0.75 0.92 1.00
Proportion Explained  0.48 0.27 0.17 0.08
Cumulative Proportion 0.48 0.75 0.92 1.00

Rotating the PCA solution using the varimax criterion yields new components, now named RCi to indicate that the PCs have been rotated (hence, they are no longer PCs).

> principal(x, nf=4, rotate="varimax")
...
                       RC4  RC3  RC2  RC1
SS loadings           1.03 1.02 1.00 0.95
Proportion Var        0.26 0.26 0.25 0.24
Cumulative Var        0.26 0.51 0.76 1.00
Proportion Explained  0.26 0.26 0.25 0.24
Cumulative Proportion 0.26 0.51 0.76 1.00

My question: Why is the order now RC4 to RC1, with the numbers decreasing from 4 to 1? The RCs are still ordered according to their share of SS. As the rotation is orthogonal, I do not understand the point. What useful extra information does the order of the RC names convey? Or am I wrong to consider the order arbitrary if the rotation is orthogonal?

Thanks!

Answer 1:

Mark, the logic is to recognize what rotation does. This is more for pedagogical reasons than anything else. I am trying to show the relationship of the original components to the rotated components. To take your example, look at the loadings, not just the variances accounted for.

Unrotated:

    PC1   PC2   PC3   PC4 h2       u2
1 -0.77 -0.40  0.39  0.32  1 -6.7e-16
2  0.71 -0.28  0.63 -0.17  1  6.7e-16
3 -0.10  0.93  0.35  0.09  1  6.7e-16
4  0.90 -0.02 -0.13  0.42  1  2.2e-16

Rotated:

    RC4   RC3   RC2   RC1 h2       u2
1  0.95 -0.10 -0.08 -0.29  1 -6.7e-16
2 -0.10  0.97 -0.06  0.22  1  6.7e-16
3 -0.07 -0.06  0.99 -0.05  1  6.7e-16
4 -0.34  0.27 -0.07  0.90  1  2.2e-16

In particular, look at variables 3 and 4. In the unrotated solution, they define PC2 and PC1, respectively. Now look at the rotated solution. These two variables still mark the same components (now labeled RC2 and RC1 to reflect that they have been rotated), but the variances accounted for have changed: PC4, when rotated to RC4, now soaks up more variance. (This is also true for PC3 and PC4, but not as clearly.)

What I am trying to do is represent what happens as you rotate. PC1 is rotated to a simpler structure, and becomes RC1.
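A minimal sketch of what an orthogonal rotation preserves and what it changes, using the rounded unrotated loadings from the table above. It uses base R's stats::varimax() rather than psych::principal(); that substitution is an assumption made to keep the example free of extra packages, and the variable names are illustrative only.

```r
# Unrotated loadings copied (rounded) from the table above; columns are PC1..PC4.
L <- matrix(c(-0.77, -0.40,  0.39,  0.32,
               0.71, -0.28,  0.63, -0.17,
              -0.10,  0.93,  0.35,  0.09,
               0.90, -0.02, -0.13,  0.42),
            nrow = 4, byrow = TRUE)

# Assumption: stats::varimax() stands in for the rotation step here.
rot <- varimax(L)
R <- unclass(rot$loadings)   # rotated loadings as a plain matrix

# Communalities (row sums of squared loadings) are unchanged by the rotation.
h2_before <- rowSums(L^2)
h2_after  <- rowSums(R^2)

# Per-component sums of squares are redistributed; only their total is fixed.
ss_before <- colSums(L^2)
ss_after  <- colSums(R^2)

print(rbind(h2_before, h2_after))
print(rbind(ss_before, ss_after))
```

The row sums match before and after, while the column sums change. That is why the variance ranking of the components can differ after rotation even though each rotated column still descends from one particular unrotated column.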

Then, because many people like to have their components in order of variance accounted for, I sort by the eigenvalue (sum of squares accounted for).

I believe what other programs do is to rotate and relabel so that the components are always called C1 ... Cn. I just like to see where the components came from.

If you think it is useful, I can (eventually) add this discussion to the documentation for principal as well as fa.

Bill

Answer 2:

Here's a partial answer. I looked through the code behind the principal function, and can see clearly where the reordering happens:

  if (nfactors > 1) {
    ev.rotated <- diag(t(loadings) %*% loadings)
    ev.order <- order(ev.rotated, decreasing = TRUE)
    loadings <- loadings[, ev.order]
  }

So the code above is the reason the order changes, but it's less clear what purpose that serves. I don't have enough experience in rotations to be able to discern the package author's intent.
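To see how that sort produces the RC4 ... RC1 labelling, here is a small sketch that applies the same three lines to the rounded rotated loadings from the first answer. It assumes the RC labels are attached to the columns before the sort, which is consistent with the output shown in the question but was not checked against the full package source. The relabeled object at the end is a hypothetical alternative, not something principal does.

```r
# Rounded rotated loadings from the first answer, arranged in pre-sort order
# so that column i is the rotated version of PCi (assumed labelling).
loadings <- cbind(
  RC1 = c(-0.29,  0.22, -0.05,  0.90),
  RC2 = c(-0.08, -0.06,  0.99, -0.07),
  RC3 = c(-0.10,  0.97, -0.06,  0.27),
  RC4 = c( 0.95, -0.10, -0.07, -0.34)
)

# Same logic as the snippet above: sum of squared loadings per column,
# then reorder the columns from largest to smallest.
ev.rotated <- diag(t(loadings) %*% loadings)
ev.order <- order(ev.rotated, decreasing = TRUE)
sorted <- loadings[, ev.order]

# The labels travel with their columns.
print(colnames(sorted))   # "RC4" "RC3" "RC2" "RC1"

# Hypothetical alternative: relabel by rank after sorting, which hides
# which unrotated component each column came from.
relabeled <- sorted
colnames(relabeled) <- paste0("C", seq_len(ncol(relabeled)))
print(colnames(relabeled))
```

Under that assumption the names record where each column came from, and the column order records how much variance it now accounts for.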

Source: https://stackoverflow.com/questions/16896959/psychprincipal-explanation-for-the-order-and-naming-of-rotated-principal-c

Tags

pca