class sklearn.decomposition.PCA(n_components=None, copy=True, whiten=False, svd_solver='auto', tol=0.0, iterated_power='auto', random_state=None)
[source]
Principal component analysis (PCA)
Linear dimensionality reduction using Singular Value Decomposition of the data to project it to a lower dimensional space.
It uses the LAPACK implementation of the full SVD or a randomized truncated SVD by the method of Halko et al. 2009, depending on the shape of the input data and the number of components to extract.
It can also use the scipy.sparse.linalg ARPACK implementation of the truncated SVD.
Notice that this class does not support sparse input. See TruncatedSVD for an alternative with sparse data.
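For example, a minimal sketch of the sparse alternative (an added illustration, not part of the original examples; the matrix shape and density below are arbitrary):
>>> from scipy.sparse import random as sparse_random
>>> from sklearn.decomposition import TruncatedSVD
>>> X_sparse = sparse_random(100, 50, density=0.01, random_state=42)
>>> svd = TruncatedSVD(n_components=5, random_state=42)
>>> svd.fit_transform(X_sparse).shape
(100, 5)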
Read more in the User Guide.
Parameters:
n_components : int, float, None or string. Number of components to keep. If None, all components are kept; if 'mle' (with svd_solver='full'), Minka's MLE is used to guess the dimension; if 0 < n_components < 1 (with svd_solver='full'), the number of components is chosen so that the fraction of explained variance exceeds n_components.
copy : bool (default True). If False, data passed to fit may be overwritten in place.
whiten : bool (default False). When True, the transformed output is scaled to have uncorrelated, unit-variance components; this removes some information but can help downstream estimators.
svd_solver : {'auto', 'full', 'arpack', 'randomized'} (default 'auto'). 'full' runs the exact LAPACK SVD, 'arpack' a truncated SVD via scipy.sparse.linalg.svds, 'randomized' the randomized SVD of Halko et al.; 'auto' picks the solver based on the input shape and n_components.
tol : float >= 0 (default 0.0). Tolerance for singular values when svd_solver == 'arpack'.
iterated_power : int >= 0 or 'auto' (default 'auto'). Number of power iterations when svd_solver == 'randomized'.
random_state : int, RandomState instance or None (default None). Used when svd_solver == 'arpack' or 'randomized'.
Attributes:
components_ : array, shape (n_components, n_features). Principal axes in feature space, sorted by explained variance.
explained_variance_ : array, shape (n_components,). Variance explained by each of the selected components.
explained_variance_ratio_ : array, shape (n_components,). Fraction of the total variance explained by each of the selected components.
singular_values_ : array, shape (n_components,). Singular values corresponding to each of the selected components.
mean_ : array, shape (n_features,). Per-feature empirical mean, estimated from the training set.
n_components_ : int. The estimated number of components.
noise_variance_ : float. The estimated noise variance following the probabilistic PCA model of Tipping and Bishop.
See also
For n_components == 'mle', this class uses the method of Minka, T. P. "Automatic choice of dimensionality for PCA". In NIPS, pp. 598-604.
Implements the probabilistic PCA model from Tipping, M. E., and Bishop, C. M. (1999). "Probabilistic principal component analysis". Journal of the Royal Statistical Society: Series B (Statistical Methodology), 61(3), 611-622, via the score and score_samples methods. See http://www.miketipping.com/papers/met-mppca.pdf
For svd_solver == 'arpack', refer to scipy.sparse.linalg.svds.
For svd_solver == 'randomized', see: Halko, N., Martinsson, P. G., and Tropp, J. A. (2011). "Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions". SIAM Review, 53(2), 217-288, and also Martinsson, P. G., Rokhlin, V., and Tygert, M. (2011). "A randomized algorithm for the decomposition of matrices". Applied and Computational Harmonic Analysis, 30(1), 47-68.
>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> pca = PCA(n_components=2)
>>> pca.fit(X)
PCA(copy=True, iterated_power='auto', n_components=2, random_state=None,
  svd_solver='auto', tol=0.0, whiten=False)
>>> print(pca.explained_variance_ratio_)
[0.9924... 0.0075...]
>>> print(pca.singular_values_)
[6.30061... 0.54980...]
>>> pca = PCA(n_components=2, svd_solver='full')
>>> pca.fit(X)
PCA(copy=True, iterated_power='auto', n_components=2, random_state=None,
  svd_solver='full', tol=0.0, whiten=False)
>>> print(pca.explained_variance_ratio_)
[0.9924... 0.00755...]
>>> print(pca.singular_values_)
[6.30061... 0.54980...]
>>> pca = PCA(n_components=1, svd_solver='arpack')
>>> pca.fit(X)
PCA(copy=True, iterated_power='auto', n_components=1, random_state=None,
  svd_solver='arpack', tol=0.0, whiten=False)
>>> print(pca.explained_variance_ratio_)
[0.99244...]
>>> print(pca.singular_values_)
[6.30061...]
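An added sketch, not part of the original examples, showing the randomized solver on the same X (a random_state is set so the run is reproducible):
>>> pca = PCA(n_components=1, svd_solver='randomized', random_state=0)
>>> _ = pca.fit(X)
>>> print(pca.explained_variance_ratio_)
[0.99244...]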
fit(X[, y]) | Fit the model with X.
fit_transform(X[, y]) | Fit the model with X and apply the dimensionality reduction on X.
get_covariance() | Compute data covariance with the generative model.
get_params([deep]) | Get parameters for this estimator.
get_precision() | Compute data precision matrix with the generative model.
inverse_transform(X) | Transform data back to its original space.
score(X[, y]) | Return the average log-likelihood of all samples.
score_samples(X) | Return the log-likelihood of each sample.
set_params(**params) | Set the parameters of this estimator.
transform(X) | Apply dimensionality reduction to X.
__init__(n_components=None, copy=True, whiten=False, svd_solver='auto', tol=0.0, iterated_power='auto', random_state=None)
[source]
fit(X, y=None)
[source]
Fit the model with X.
Parameters:
X : array-like, shape (n_samples, n_features). Training data, where n_samples is the number of samples and n_features is the number of features.
y : ignored.
Returns:
self : object. Returns the instance itself.
fit_transform(X, y=None)
[source]
Fit the model with X and apply the dimensionality reduction on X.
Parameters:
X : array-like, shape (n_samples, n_features). Training data, where n_samples is the number of samples and n_features is the number of features.
y : ignored.
Returns:
X_new : array-like, shape (n_samples, n_components). The transformed data.
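As a quick illustration (an added sketch, not from the original page), fit_transform gives the same projection as fitting and then transforming the same data:
>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> np.allclose(PCA(n_components=2).fit_transform(X),
...             PCA(n_components=2).fit(X).transform(X))
True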
get_covariance()
[source]
Compute data covariance with the generative model.
cov = components_.T * S**2 * components_ + sigma2 * eye(n_features)
where S**2 contains the explained variances, and sigma2 contains the noise variances.
Returns:
cov : array, shape (n_features, n_features). Estimated covariance of the data.
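An added sanity check of the formula on the toy data used above: with every component retained, sigma2 is zero and the result reduces to the sample covariance of the data.
>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> X = np.array([[-1., -1.], [-2., -1.], [-3., -2.], [1., 1.], [2., 1.], [3., 2.]])
>>> pca = PCA(n_components=2).fit(X)
>>> np.allclose(pca.get_covariance(), np.cov(X, rowvar=False))
True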
get_params(deep=True)
[source]
Get parameters for this estimator.
Parameters:
deep : boolean, optional. If True, return the parameters for this estimator and contained subobjects that are estimators.
Returns:
params : mapping of string to any. Parameter names mapped to their values.
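For instance (an added illustration, assuming the constructor signature shown at the top of this page), the returned keys are exactly the keyword arguments of __init__:
>>> from sklearn.decomposition import PCA
>>> sorted(PCA().get_params().keys())
['copy', 'iterated_power', 'n_components', 'random_state', 'svd_solver', 'tol', 'whiten']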
get_precision()
[source]
Compute data precision matrix with the generative model.
Equals the inverse of the covariance but computed with the matrix inversion lemma for efficiency.
Returns:
precision : array, shape (n_features, n_features). Estimated precision of the data.
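An added sketch checking that claim numerically (the precision matrix agrees with a direct inverse of get_covariance on the toy data):
>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> X = np.array([[-1., -1.], [-2., -1.], [-3., -2.], [1., 1.], [2., 1.], [3., 2.]])
>>> pca = PCA(n_components=1).fit(X)
>>> np.allclose(pca.get_precision(), np.linalg.inv(pca.get_covariance()))
True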
inverse_transform(X)
[source]
Transform data back to its original space.
In other words, return an input X_original whose transform would be X.
Parameters:
X : array-like, shape (n_samples, n_components). Data in the transformed (component) space.
Returns:
X_original : array-like, shape (n_samples, n_features). Data mapped back to the original feature space.
If whitening is enabled, inverse_transform will compute the exact inverse operation, which includes reversing whitening.
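An added round-trip sketch: with whitening enabled and all components kept, transforming and then inverse-transforming recovers the original data up to floating-point error.
>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> X = np.array([[-1., -1.], [-2., -1.], [-3., -2.], [1., 1.], [2., 1.], [3., 2.]])
>>> pca = PCA(n_components=2, whiten=True).fit(X)
>>> np.allclose(X, pca.inverse_transform(pca.transform(X)))
True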
score(X, y=None)
[source]
Return the average log-likelihood of all samples.
See "Pattern Recognition and Machine Learning" by C. Bishop, 12.2.1 p. 574 or http://www.miketipping.com/papers/met-mppca.pdf
Parameters:
X : array, shape (n_samples, n_features). The data.
y : ignored.
Returns:
ll : float. Average log-likelihood of the samples under the current model.
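An added sketch showing the relationship to score_samples: the score is the mean of the per-sample log-likelihoods.
>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> X = np.array([[-1., -1.], [-2., -1.], [-3., -2.], [1., 1.], [2., 1.], [3., 2.]])
>>> pca = PCA(n_components=1).fit(X)
>>> np.isclose(pca.score(X), pca.score_samples(X).mean())
True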
score_samples(X)
[source]
Return the log-likelihood of each sample.
See "Pattern Recognition and Machine Learning" by C. Bishop, 12.2.1 p. 574 or http://www.miketipping.com/papers/met-mppca.pdf
Parameters:
X : array, shape (n_samples, n_features). The data.
Returns:
ll : array, shape (n_samples,). Log-likelihood of each sample under the current model.
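An added sketch: score_samples returns one log-likelihood per row of the input.
>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> X = np.array([[-1., -1.], [-2., -1.], [-3., -2.], [1., 1.], [2., 1.], [3., 2.]])
>>> pca = PCA(n_components=1).fit(X)
>>> pca.score_samples(X).shape
(6,)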
set_params(**params)
[source]
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
Returns:
self
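An added sketch of the nested form, using a hypothetical Pipeline with a PCA step named 'pca' (the step names and the LogisticRegression estimator are illustrative choices, not part of this page):
>>> from sklearn.pipeline import Pipeline
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.decomposition import PCA
>>> pipe = Pipeline([('pca', PCA()), ('clf', LogisticRegression())])
>>> _ = pipe.set_params(pca__n_components=2)   # <component>__<parameter>
>>> pipe.get_params()['pca__n_components']
2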
transform(X)
[source]
Apply dimensionality reduction to X.
X is projected on the first principal components previously extracted from a training set.
Parameters:
X : array-like, shape (n_samples, n_features). New data, where n_samples is the number of samples and n_features is the number of features.
Returns:
X_new : array-like, shape (n_samples, n_components). The projection of X onto the principal components.
>>> import numpy as np
>>> from sklearn.decomposition import IncrementalPCA
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> ipca = IncrementalPCA(n_components=2, batch_size=3)
>>> ipca.fit(X)
IncrementalPCA(batch_size=3, copy=True, n_components=2, whiten=False)
>>> ipca.transform(X)