Generate data for binary classification used in Hastie et al. 2009, Example 10.2.
The ten features are standard independent Gaussian and the target y is defined by:
y[i] = 1 if np.sum(X[i] ** 2) > 9.34 else -1
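The rule above can be sketched directly in NumPy. This is an illustrative reimplementation, not the library's source: the helper name `hastie_like`, its `seed` argument, and the use of `np.random.default_rng` are assumptions made for the example. The threshold 9.34 is approximately the median of a chi-squared distribution with 10 degrees of freedom, so the two classes should come out roughly balanced.

```python
import numpy as np


def hastie_like(n_samples, seed=0):
    """Hypothetical helper mirroring the documented rule (not sklearn's code)."""
    rng = np.random.default_rng(seed)
    # Ten independent standard Gaussian features per sample.
    X = rng.standard_normal(size=(n_samples, 10))
    # Label is +1 when the squared norm exceeds 9.34, otherwise -1.
    y = np.where((X ** 2).sum(axis=1) > 9.34, 1.0, -1.0)
    return X, y


X, y = hastie_like(n_samples=2000, seed=0)
# Fraction of samples labeled +1; expected to be close to one half.
positive_fraction = float((y == 1.0).mean())
```

Because the generator here differs from the one scikit-learn uses, the individual samples will not match the output of `make_hastie_10_2`, even for the same seed value.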
Read more in the User Guide.
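Parameters
n_samples: The number of samples.
random_state: Determines random number generation for dataset creation. Pass an int for reproducible output across multiple function calls. See Glossary.
Returns
X: The input samples, an array of shape (n_samples, 10).
y: The output values, an array of shape (n_samples,).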
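As a small check of the reproducibility note above, the following sketch calls `make_hastie_10_2` twice with the same integer `random_state` and compares the results. It also verifies that the returned labels agree with the documented rule. The variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_hastie_10_2

# Same integer seed on both calls, so the draws should be identical.
X_a, y_a = make_hastie_10_2(n_samples=100, random_state=0)
X_b, y_b = make_hastie_10_2(n_samples=100, random_state=0)
same_seed_equal = np.array_equal(X_a, X_b) and np.array_equal(y_a, y_b)

# Recompute the labels from the features using the documented threshold.
recomputed = np.where((X_a ** 2).sum(axis=1) > 9.34, 1.0, -1.0)
rule_holds = np.array_equal(y_a, recomputed)
```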
See also
make_gaussian_quantiles: A generalization of this dataset approach.
References
T. Hastie, R. Tibshirani and J. Friedman, “Elements of Statistical Learning Ed. 2”, Springer, 2009.
Examples
>>> from sklearn.datasets import make_hastie_10_2
>>> X, y = make_hastie_10_2(n_samples=24000, random_state=42)
>>> X.shape
(24000, 10)
>>> y.shape
(24000,)
>>> list(y[:5])
[np.float64(-1.0), np.float64(1.0), np.float64(-1.0), np.float64(1.0), np.float64(-1.0)]
© 2007–2025 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/1.6/modules/generated/sklearn.datasets.make_hastie_10_2.html