Generate isotropic Gaussian blobs for clustering.
Read more in the User Guide.
Parameters

n_samples : int or array-like, default=100
    If int, the total number of points, divided equally among clusters. If array-like, each element gives the number of samples for the corresponding cluster.
    Changed in version 0.20: an array-like can now be passed to the n_samples parameter.

n_features : int, default=2
    The number of features for each sample.

centers : int or array-like of shape (n_centers, n_features), default=None
    The number of centers to generate, or the fixed center locations. If n_samples is an int and centers is None, 3 centers are generated. If n_samples is array-like, centers must be either None or an array of length equal to the length of n_samples.

cluster_std : float or array-like of float, default=1.0
    The standard deviation of the clusters.

center_box : tuple of float (min, max), default=(-10.0, 10.0)
    The bounding box for each cluster center when centers are generated at random.

shuffle : bool, default=True
    Whether to shuffle the samples.

random_state : int, RandomState instance or None, default=None
    Determines random number generation for dataset creation. Pass an int for reproducible output across multiple function calls. See Glossary.

return_centers : bool, default=False
    If True, also return the centers of each cluster.
    Added in version 0.23.
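Returns

X : ndarray of shape (n_samples, n_features)
    The generated samples.

y : ndarray of shape (n_samples,)
    The integer labels for cluster membership of each sample.

centers : ndarray of shape (n_clusters, n_features)
    The centers of each cluster. Only returned if return_centers=True.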
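The interplay between an array-like n_samples, fixed center locations, and return_centers can be easier to see in a short sketch. The center coordinates, per-cluster counts, and standard deviations below are arbitrary illustrative values, not taken from the reference above.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Hypothetical fixed center locations: 3 clusters in 2 dimensions.
fixed_centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])

X, y, centers = make_blobs(
    n_samples=[20, 30, 50],       # one sample count per cluster
    centers=fixed_centers,        # length matches len(n_samples)
    cluster_std=[0.5, 1.0, 2.0],  # one standard deviation per cluster
    random_state=0,
    return_centers=True,          # adds a third return value
)

print(X.shape)                                 # (100, 2)
print(y.shape)                                 # (100,)
print(np.bincount(y))                          # [20 30 50]
print(np.array_equal(centers, fixed_centers))  # True
```

When centers is given as an array, the number of features follows from its second dimension, so X has two columns here.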
See also
make_classification: A more intricate variant.
Examples

>>> from sklearn.datasets import make_blobs
>>> X, y = make_blobs(n_samples=10, centers=3, n_features=2,
...                   random_state=0)
>>> print(X.shape)
(10, 2)
>>> y
array([0, 0, 1, 0, 2, 2, 2, 1, 1, 0])
>>> X, y = make_blobs(n_samples=[3, 3, 4], centers=None, n_features=2,
...                   random_state=0)
>>> print(X.shape)
(10, 2)
>>> y
array([0, 1, 2, 0, 2, 2, 2, 1, 1, 0])
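As a further sketch, the following checks how random_state, shuffle, and center_box affect the output. The seed, sample count, and box limits are arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs

# The same integer seed should give identical data on repeated calls.
X1, y1 = make_blobs(n_samples=12, centers=3, random_state=42)
X2, y2 = make_blobs(n_samples=12, centers=3, random_state=42)
same = np.array_equal(X1, X2) and np.array_equal(y1, y2)
print(same)

# With shuffle=False, samples are expected to come out grouped by cluster,
# so the labels form a non-decreasing sequence.
_, y_ordered = make_blobs(n_samples=12, centers=3, shuffle=False,
                          random_state=42)
print(y_ordered)

# Randomly generated centers are drawn inside center_box.
_, _, c = make_blobs(n_samples=12, centers=3, center_box=(-1.0, 1.0),
                     random_state=42, return_centers=True)
inside = bool(np.all((c >= -1.0) & (c <= 1.0)))
print(inside)
```

Disabling shuffling can be handy when a contiguous block of rows per cluster is wanted, for instance when slicing X by cluster without consulting y.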
© 2007–2025 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/1.6/modules/generated/sklearn.datasets.make_blobs.html