Generate the “Friedman #1” regression problem.
This dataset is described in Friedman [1] and Breiman [2].
Inputs X are independent features uniformly distributed on the interval [0, 1]. The output y is created according to the formula:
y(X) = 10 * sin(pi * X[:, 0] * X[:, 1]) + 20 * (X[:, 2] - 0.5) ** 2 + 10 * X[:, 3] + 5 * X[:, 4] + noise * N(0, 1).
Out of the n_features features, only 5 are actually used to compute y. The remaining features are independent of y.
The number of features has to be >= 5.
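The formula and the role of the unused features can be sketched directly in NumPy. This is a minimal illustration, not scikit-learn's implementation: the names `friedman1_target`, `friedman1_sketch`, and `seed` are hypothetical, and it draws from `np.random.default_rng`, so its values will not match those returned by `make_friedman1`.

```python
import numpy as np


def friedman1_target(X):
    """Noise-free Friedman #1 target; only columns 0-4 of X are read."""
    return (
        10 * np.sin(np.pi * X[:, 0] * X[:, 1])
        + 20 * (X[:, 2] - 0.5) ** 2
        + 10 * X[:, 3]
        + 5 * X[:, 4]
    )


def friedman1_sketch(n_samples=100, n_features=10, noise=0.0, seed=None):
    """Hypothetical stand-in for illustration; not scikit-learn's code."""
    if n_features < 5:
        raise ValueError("n_features must be at least 5")
    rng = np.random.default_rng(seed)
    # Features are independent and uniform on [0, 1].
    X = rng.uniform(0.0, 1.0, size=(n_samples, n_features))
    # Add Gaussian noise with standard deviation `noise` to the target.
    y = friedman1_target(X) + noise * rng.standard_normal(n_samples)
    return X, y


X, y = friedman1_sketch(seed=0)

# Overwrite the unused columns (index 5 onward) with fresh random values.
X_shuffled = X.copy()
X_shuffled[:, 5:] = np.random.default_rng(1).uniform(size=X[:, 5:].shape)

# The noise-free target is unchanged because it never reads those columns.
unchanged = np.array_equal(friedman1_target(X), friedman1_target(X_shuffled))
print(X.shape, y.shape, unchanged)
```

With noise set to 0, each term of the formula is non-negative and the terms are bounded by 10, 5, 10, and 5 respectively, so y stays within [0, 30].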
Read more in the User Guide.
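Parameters:
n_samples : int, default=100
    The number of samples.
n_features : int, default=10
    The number of features. Should be at least 5.
noise : float, default=0.0
    The standard deviation of the Gaussian noise applied to the output.
random_state : int, RandomState instance or None, default=None
    Determines random number generation for dataset noise. Pass an int for reproducible output across multiple function calls. See Glossary.
Returns:
X : ndarray of shape (n_samples, n_features)
    The input samples.
y : ndarray of shape (n_samples,)
    The output values.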
[1] J. Friedman, “Multivariate adaptive regression splines”, The Annals of Statistics 19 (1), pages 1-67, 1991.
[2] L. Breiman, “Bagging predictors”, Machine Learning 24, pages 123-140, 1996.
>>> from sklearn.datasets import make_friedman1
>>> X, y = make_friedman1(random_state=42)
>>> X.shape
(100, 10)
>>> y.shape
(100,)
>>> list(y[:3])
[np.float64(16.8...), np.float64(5.8...), np.float64(9.4...)]
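As a further sketch, the effect of the noise and random-state arguments can be checked directly. The variable names below are illustrative only, and the sample and feature counts are arbitrary choices, not recommendations.

```python
import numpy as np
from sklearn.datasets import make_friedman1

# The same integer seed yields identical data on repeated calls.
X_a, y_a = make_friedman1(n_samples=200, n_features=8, noise=1.0, random_state=0)
X_b, y_b = make_friedman1(n_samples=200, n_features=8, noise=1.0, random_state=0)
same_data = np.array_equal(X_a, X_b) and np.array_equal(y_a, y_b)

# With noise=0.0 the output follows the documented formula exactly,
# up to floating-point rounding.
X_c, y_c = make_friedman1(n_samples=200, n_features=8, noise=0.0, random_state=0)
y_formula = (
    10 * np.sin(np.pi * X_c[:, 0] * X_c[:, 1])
    + 20 * (X_c[:, 2] - 0.5) ** 2
    + 10 * X_c[:, 3]
    + 5 * X_c[:, 4]
)
matches_formula = np.allclose(y_c, y_formula)

print(same_data, matches_formula)
```

Both checks are expected to hold. Leaving the seed as None instead would give different data on each call.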
© 2007–2025 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/1.6/modules/generated/sklearn.datasets.make_friedman1.html