dimensionality-reduction

Subset variables by significant P value

帅比萌擦擦* · submitted 2021-01-28 01:32:28

Question: I'm trying to subset variables by significant p-values, and I attempted it with the following code, but it selects all variables instead of selecting by condition. Could anyone help me correct the problem? myvars <- names(summary(backward_lm)$coefficients[,4] < 0.05) happiness_reduced <- happiness_nomis[myvars] Thanks! Answer 1: An alternative to Martin's great answer (in the comments section) using the broom package. Unfortunately, you haven't posted any data, so I'm using the mtcars
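The failure mode here is that `names()` is applied to the logical comparison instead of being used to index it, so every coefficient name comes back. The same pattern is easy to reproduce with pandas (the p-values below are hypothetical stand-ins for `summary(model)$coefficients[, 4]`; this is an illustration of the bug, not the original R data):

```python
import pandas as pd

# Hypothetical coefficient p-values, standing in for summary(model)$coefficients[, 4].
pvals = pd.Series({"(Intercept)": 0.001, "wt": 0.002, "cyl": 0.40, "hp": 0.03})

# Analogue of the buggy R code: taking the labels of the *comparison result*
# returns every name, because the boolean vector keeps all of its labels.
all_names = list((pvals < 0.05).index)

# Analogue of the fix names(...)[... < 0.05]: filter by the condition first,
# then take the names of what survives.
sig_names = list(pvals[pvals < 0.05].index)

print(all_names)  # ['(Intercept)', 'wt', 'cyl', 'hp']
print(sig_names)  # ['(Intercept)', 'wt', 'hp']
```

The corresponding one-line R fix would subset before taking names, e.g. `names(p)[p < 0.05]` where `p` holds the p-value column.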

t-SNE generates different results on different machines

馋奶兔 · submitted 2021-01-05 11:56:27

Question: I have around 3000 data points in 100D that I project to 2D with t-SNE. Each data point belongs to one of three classes. However, when I run the script on two separate computers I keep getting inconsistent results. Some inconsistency is expected, since I use a random seed, but one of the computers consistently gets better results (I use a MacBook Pro and a stationary machine on Ubuntu). I use the t-SNE implementation from scikit-learn. The script and data are identical; I've manually copied the
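Pinning `random_state` makes scikit-learn's t-SNE deterministic on a single machine, though bit-exact agreement across different hardware and BLAS builds is not guaranteed. A minimal sketch with synthetic data (the question's 3000×100 dataset is not available, so the sizes here are arbitrary):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 10))  # small synthetic stand-in for the 3000x100 data

# Same seed on the same machine -> identical embeddings across runs.
emb1 = TSNE(n_components=2, perplexity=10, random_state=42).fit_transform(X)
emb2 = TSNE(n_components=2, perplexity=10, random_state=42).fit_transform(X)

print(np.allclose(emb1, emb2))  # True
```

Cross-machine differences beyond the seed usually trace back to different scikit-learn/NumPy versions or floating-point library builds, so pinning package versions is the other half of reproducibility.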

Using t-SNE for dimensionality reduction. Why is the 3D graph not working?

杀马特。学长 韩版系。学妹 · submitted 2020-05-16 04:08:51

Question: I have used the Digits dataset from sklearn and tried to reduce the dimension from 64 to 3 using t-SNE (t-Distributed Stochastic Neighbor Embedding): import numpy as np import pandas as pd import matplotlib.pyplot as plt import seaborn as sns #%matplotlib inline from sklearn.manifold import TSNE from sklearn.datasets import load_digits from mpl_toolkits.mplot3d import Axes3D digits = load_digits() digits_df = pd.DataFrame(digits.data,) digits_df["target"] = pd.Series(digits.target) tsne
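A common reason a 3-D scatter silently fails is that the axes are created without the 3-D projection; importing `Axes3D` alone is not enough in recent matplotlib. One sketch of the intended plot, on a subset of the digits for speed (the variable names are mine):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

digits = load_digits()
X, y = digits.data[:300], digits.target[:300]  # subset so t-SNE runs quickly

emb = TSNE(n_components=3, random_state=0).fit_transform(X)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")  # the 3-D axes must be requested explicitly
sc = ax.scatter(emb[:, 0], emb[:, 1], emb[:, 2], c=y, cmap="tab10", s=10)
fig.colorbar(sc, ax=ax, label="digit")
fig.savefig("tsne_3d.png")
```

Note that the default `barnes_hut` method only supports `n_components` up to 3, so 3-D is the largest embedding this setup allows.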

sklearn tsne with sparse matrix

纵然是瞬间 · submitted 2020-01-17 07:06:25

Question: I'm trying to run t-SNE on a very sparse matrix of precomputed distance values, but I'm having trouble with it. It boils down to this: row = np.array([0, 2, 2, 0, 1, 2]) col = np.array([0, 0, 1, 2, 2, 2]) distances = np.array([.1, .2, .3, .4, .5, .6]) X = csc_matrix((distances, (row, col)), shape=(3, 3)) Y = TSNE(metric='precomputed').fit_transform(X) However, I get this error: TypeError: A sparse matrix was passed, but dense data is required for method="barnes_hut". Use X.toarray() to
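As the error says, `barnes_hut` needs a dense array. A precomputed input also has to be a valid distance matrix (square, symmetric, zero diagonal), and the perplexity must be smaller than the number of samples, which a 3×3 example cannot satisfy with the default perplexity of 30. A sketch with a valid dense precomputed matrix (the sizes are arbitrary):

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.metrics import pairwise_distances

rng = np.random.default_rng(0)
points = rng.normal(size=(10, 5))

# Dense, symmetric distance matrix with a zero diagonal.
D = pairwise_distances(points)

# init='random' is needed with metric='precomputed' in recent scikit-learn,
# since PCA initialization requires the original feature matrix.
emb = TSNE(metric="precomputed", init="random",
           perplexity=5, random_state=0).fit_transform(D)

print(emb.shape)  # (10, 2)
```

If the matrix really must stay sparse until the last moment, `X.toarray()` converts it, but the symmetry and perplexity constraints above still apply.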

LDA ignoring n_components?

雨燕双飞 · submitted 2020-01-10 14:17:41

Question: When I try to work with LDA from scikit-learn, it keeps giving me only one component, even though I am asking for more: >>> from sklearn.lda import LDA >>> x = np.random.randn(5,5) >>> y = [True, False, True, False, True] >>> for i in range(1,6): ... lda = LDA(n_components=i) ... model = lda.fit(x,y) ... model.transform(x) This gives /Users/orthogonal/virtualenvs/osxml/lib/python2.7/site-packages/sklearn/lda.py:161: UserWarning: Variables are collinear warnings.warn("Variables are collinear"
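LDA can produce at most `min(n_classes - 1, n_features)` discriminant axes, so the binary target above caps the output at one component no matter what `n_components` asks for (recent scikit-learn versions raise an error for an oversized request rather than silently capping). With three classes, two components come back as expected; note the modern import path, which replaces the long-deprecated `sklearn.lda`:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))

y = np.repeat([0, 1, 2], 20)  # three classes -> at most 2 components
Xt = LinearDiscriminantAnalysis(n_components=2).fit(X, y).transform(X)
print(Xt.shape)  # (60, 2)

y_bin = np.repeat([0, 1], 30)  # two classes -> at most 1 component
Xt_bin = LinearDiscriminantAnalysis(n_components=1).fit(X, y_bin).transform(X)
print(Xt_bin.shape)  # (60, 1)
```

The cap comes from the math: the between-class scatter matrix has rank at most `n_classes - 1`, so there simply are no additional discriminant directions to return.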

Plot PCA loadings and loading labels in a biplot in sklearn (like R's autoplot)

守給你的承諾、 · submitted 2020-01-09 13:09:29

Question: I saw this tutorial in R with autoplot. They plotted the loadings and loading labels: autoplot(prcomp(df), data = iris, colour = 'Species', loadings = TRUE, loadings.colour = 'blue', loadings.label = TRUE, loadings.label.size = 3) https://cran.r-project.org/web/packages/ggfortify/vignettes/plot_pca.html I prefer Python 3 with matplotlib, scikit-learn, and pandas for my data analysis. However, I don't know how to add these. How can I plot these vectors with matplotlib? I've been reading
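A rough matplotlib equivalent of autoplot's biplot: scatter the PCA scores and overlay the columns of `pca.components_` as labeled arrows. The arrow scaling factor below is a cosmetic choice of mine, not part of any sklearn API:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris()
pca = PCA(n_components=2)
scores = pca.fit_transform(iris.data)   # sample coordinates in PC space
loadings = pca.components_.T            # one row per original feature

fig, ax = plt.subplots()
ax.scatter(scores[:, 0], scores[:, 1], c=iris.target, s=12)

scale = np.abs(scores).max()  # stretch arrows to the data's scale (cosmetic)
for (dx, dy), name in zip(loadings * scale, iris.feature_names):
    ax.arrow(0, 0, dx, dy, color="blue", head_width=0.05,
             length_includes_head=True)
    ax.annotate(name, (dx * 1.1, dy * 1.1), color="blue", fontsize=8)

ax.set_xlabel("PC1")
ax.set_ylabel("PC2")
fig.savefig("biplot.png")
```

Each arrow points in the direction along which its feature increases in the PC1/PC2 plane, which is exactly what `loadings = TRUE` draws in ggfortify.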
