Compute incremental mean and variance along an axis on a CSR or CSC matrix.
last_mean, last_var are the statistics computed at the last step by this function. Both must be initialized to 0-arrays of the proper size, i.e. the number of features in X. last_n is the number of samples encountered until now.
Input data.
Axis along which the axis should be computed.
Array of means to update with the new data X. Should be of shape (n_features,) if axis=0 or (n_samples,) if axis=1.
Array of variances to update with the new data X. Should be of shape (n_features,) if axis=0 or (n_samples,) if axis=1.
Sum of the weights seen so far, excluding the current weights If not float, it should be of shape (n_features,) if axis=0 or (n_samples,) if axis=1. If float it corresponds to having same weights for all samples (or features).
If axis is set to 0 shape is (n_samples,) or if axis is set to 1 shape is (n_features,). If it is set to None, then samples are equally weighted.
Added in version 0.24.
Updated feature-wise means if axis = 0 or sample-wise means if axis = 1.
Updated feature-wise variances if axis = 0 or sample-wise variances if axis = 1.
Updated number of seen samples per feature if axis=0 or number of seen features per sample if axis=1.
If weights is not None, n is a sum of the weights of the seen samples or features instead of the actual number of seen samples or features.
NaNs are ignored in the algorithm.
>>> from sklearn.utils import sparsefuncs
>>> from scipy import sparse
>>> import numpy as np
>>> indptr = np.array([0, 3, 4, 4, 4])
>>> indices = np.array([0, 1, 2, 2])
>>> data = np.array([8, 1, 2, 5])
>>> scale = np.array([2, 3, 2])
>>> csr = sparse.csr_matrix((data, indices, indptr))
>>> csr.todense()
matrix([[8, 1, 2],
[0, 0, 5],
[0, 0, 0],
[0, 0, 0]])
>>> sparsefuncs.incr_mean_variance_axis(
... csr, axis=0, last_mean=np.zeros(3), last_var=np.zeros(3), last_n=2
... )
(array([1.3..., 0.1..., 1.1...]), array([8.8..., 0.1..., 3.4...]),
array([6., 6., 6.]))
© 2007–2025 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/1.6/modules/generated/sklearn.utils.sparsefuncs.incr_mean_variance_axis.html