sklearn.utils.extmath.randomized_svd
sklearn.utils.extmath.randomized_svd(M, n_components, n_oversamples=10, n_iter='auto', power_iteration_normalizer='auto', transpose='auto', flip_sign=True, random_state=0)
Computes a truncated randomized SVD.
Parameters:
M : ndarray or sparse matrix
Matrix to decompose.
n_components : int
Number of singular values and vectors to extract.
n_oversamples : int (default is 10)
Additional number of random vectors to sample the range of M so as to ensure proper conditioning. The total number of random vectors used to find the range of M is n_components + n_oversamples. A smaller number can improve speed but can degrade the approximation of the singular vectors and singular values.
n_iter : int or 'auto' (default is 'auto')
Number of power iterations. It can be used to deal with very noisy problems. When 'auto', it is set to 4, unless n_components is small (< .1 * min(M.shape)), in which case n_iter is set to 7. This improves precision with few components.
power_iteration_normalizer : 'auto' (default), 'QR', 'LU', 'none'
How the power iterations are normalized: 'QR' applies a step-by-step QR factorization (the slowest but most accurate), 'none' applies no normalization (the fastest but numerically unstable when n_iter is large, typically 5 or larger), and 'LU' applies an LU factorization (numerically stable but can lose slightly in accuracy). The 'auto' mode applies no normalization if n_iter <= 2 and switches to LU otherwise.
transpose : True, False or 'auto' (default)
Whether the algorithm should be applied to M.T instead of M. The result should be approximately the same. The 'auto' mode triggers the transposition if M.shape[1] > M.shape[0], since this implementation of randomized SVD tends to be a little faster in that case.
flip_sign : boolean (True by default)
The output of a singular value decomposition is only unique up to a permutation of the signs of the singular vectors. If flip_sign is set to True, the sign ambiguity is resolved by making the largest loadings for each component in the left singular vectors positive.
random_state : int, RandomState instance or None, optional (default=0)
The seed of the pseudo random number generator to use when drawing the random vectors. If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
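The parameters above can be exercised with a short call. The following is a minimal sketch, assuming NumPy and scikit-learn are installed; the synthetic rank-5 matrix and the variable names U, s and Vt are illustrative choices, not part of the documentation above.

```python
import numpy as np
from sklearn.utils.extmath import randomized_svd

# Synthetic test data (an assumption for illustration): a 200 x 50 matrix
# that has exactly rank 5, so 5 components can reproduce it.
rng = np.random.RandomState(0)
M = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 50))

# Extract the 5 leading singular triplets. The keyword arguments repeat
# the defaults from the signature to show where each option goes.
U, s, Vt = randomized_svd(
    M,
    n_components=5,
    n_oversamples=10,
    n_iter='auto',
    random_state=0,
)

# U is (200, 5), s is (5,), Vt is (5, 50).
print(U.shape, s.shape, Vt.shape)

# Relative reconstruction error of the truncated factorization.
rel_err = np.linalg.norm(M - (U * s) @ Vt) / np.linalg.norm(M)
print(rel_err)
```

Because the example matrix has exact rank 5, the reconstruction error should be close to machine precision; on real data with a slowly decaying spectrum the error depends on n_components, n_oversamples and n_iter.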
Notes
This algorithm finds a (usually very good) approximate truncated singular value decomposition using randomization to speed up the computations. It is particularly fast on large matrices from which you wish to extract only a small number of components. To obtain a further speed-up, n_iter can be set <= 2 (at the cost of some precision).
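The trade-off between n_iter and precision can be illustrated with a NumPy-only sketch of the general randomized scheme described by Halko et al. This is not scikit-learn's implementation: the function randomized_svd_sketch, its seed argument and the synthetic spectrum are hypothetical, and the sketch normalizes every power iteration with QR rather than following the 'auto' rules.

```python
import numpy as np

def randomized_svd_sketch(M, n_components, n_oversamples=10, n_iter=4, seed=0):
    """Illustrative randomized truncated SVD (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n_random = n_components + n_oversamples

    # Sample the range of M with Gaussian random vectors.
    Q = rng.standard_normal((M.shape[1], n_random))
    Q, _ = np.linalg.qr(M @ Q)

    # Power iterations sharpen the basis; QR keeps them numerically stable.
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(M.T @ Q)
        Q, _ = np.linalg.qr(M @ Q)

    # Project M onto the small basis and take an exact SVD there.
    B = Q.T @ M
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :n_components], s[:n_components], Vt[:n_components]

def relative_error(M, U, s, Vt):
    return np.linalg.norm(M - (U * s) @ Vt) / np.linalg.norm(M)

# Synthetic 300 x 200 matrix (an assumption for illustration): ten singular
# values equal to 10 on top of a flat "noise floor" of singular values 1.
rng = np.random.default_rng(42)
U0, _ = np.linalg.qr(rng.standard_normal((300, 200)))
V0, _ = np.linalg.qr(rng.standard_normal((200, 200)))
spectrum = np.ones(200)
spectrum[:10] = 10.0
M = (U0 * spectrum) @ V0.T

err_fast = relative_error(M, *randomized_svd_sketch(M, 10, n_iter=0))
err_accurate = relative_error(M, *randomized_svd_sketch(M, 10, n_iter=4))

# Best possible rank-10 error for this spectrum (Eckart-Young).
err_optimal = np.sqrt(np.sum(spectrum[10:] ** 2) / np.sum(spectrum ** 2))
print(err_fast, err_accurate, err_optimal)
```

On a noisy matrix like this one, skipping the power iterations is cheaper but leaves the error visibly above the optimal rank-10 error, while a few iterations bring it close to optimal.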
References
- Finding structure with randomness: Stochastic algorithms for constructing approximate matrix decompositions. Halko, et al., 2009. http://arxiv.org/abs/arXiv:0909.4061
- A randomized algorithm for the decomposition of matrices. Per-Gunnar Martinsson, Vladimir Rokhlin and Mark Tygert.
- An implementation of a randomized algorithm for principal component analysis. A. Szlam et al., 2014.