Generate the “Friedman #2” regression problem.
This dataset is described in Friedman [1] and Breiman [2].
Inputs X are 4 independent features uniformly distributed on the intervals:
0 <= X[:, 0] <= 100, 40 * pi <= X[:, 1] <= 560 * pi, 0 <= X[:, 2] <= 1, 1 <= X[:, 3] <= 11.
The output y is created according to the formula:
y(X) = (X[:, 0] ** 2 + (X[:, 1] * X[:, 2] - 1 / (X[:, 1] * X[:, 3])) ** 2) ** 0.5 + noise * N(0, 1).
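The formula above can be sketched directly in NumPy. This is an illustrative re-derivation, not scikit-learn's implementation: the helper name `friedman2_sketch` and its `seed` argument are hypothetical, and because it draws random numbers differently it will not reproduce the exact values returned by `make_friedman2`.

```python
import numpy as np


def friedman2_sketch(n_samples=100, noise=0.0, seed=None):
    """Hypothetical helper: sample X and y following the Friedman #2 formula."""
    rng = np.random.default_rng(seed)
    # Draw each feature uniformly from its documented interval.
    X = np.column_stack([
        rng.uniform(0, 100, n_samples),
        rng.uniform(40 * np.pi, 560 * np.pi, n_samples),
        rng.uniform(0, 1, n_samples),
        rng.uniform(1, 11, n_samples),
    ])
    # Noise-free target, term by term as in the formula.
    inner = X[:, 1] * X[:, 2] - 1 / (X[:, 1] * X[:, 3])
    y = np.sqrt(X[:, 0] ** 2 + inner ** 2)
    # Additive Gaussian noise with standard deviation `noise`.
    y = y + noise * rng.standard_normal(n_samples)
    return X, y


X, y = friedman2_sketch(n_samples=5, seed=0)
```

With `noise=0.0` the target is a deterministic function of the four features, so every value of `y` is non-negative.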
Read more in the User Guide.
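Parameters:
n_samples : int, default=100
    The number of samples.
noise : float, default=0.0
    The standard deviation of the Gaussian noise applied to the output.
random_state : int, RandomState instance or None, default=None
    Determines random number generation for the dataset, covering both the sampled features and the noise. Pass an int for reproducible output across multiple function calls. See Glossary.

Returns:
X : ndarray of shape (n_samples, 4)
    The input samples.
y : ndarray of shape (n_samples,)
    The output values.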
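As a quick check of how these arguments behave, the sketch below calls `make_friedman2` twice with the same integer `random_state` and compares the results, then recomputes the noise-free target from the returned features. The variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_friedman2

# Two calls with the same integer seed should return identical data.
X_a, y_a = make_friedman2(n_samples=200, noise=0.0, random_state=0)
X_b, y_b = make_friedman2(n_samples=200, noise=0.0, random_state=0)
same_seed_matches = np.array_equal(X_a, X_b) and np.array_equal(y_a, y_b)

# With noise=0.0, y should equal the documented formula applied to X.
y_formula = np.sqrt(
    X_a[:, 0] ** 2
    + (X_a[:, 1] * X_a[:, 2] - 1 / (X_a[:, 1] * X_a[:, 3])) ** 2
)
matches_formula = np.allclose(y_a, y_formula)

print(same_seed_matches, matches_formula)
```

Setting `noise` to a positive value adds zero-mean Gaussian noise with that standard deviation to `y`, so the second comparison would then hold only approximately.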
References:
[1] J. Friedman, “Multivariate adaptive regression splines”, The Annals of Statistics 19 (1), pages 1-67, 1991.
[2] L. Breiman, “Bagging predictors”, Machine Learning 24, pages 123-140, 1996.
Examples:
>>> from sklearn.datasets import make_friedman2
>>> X, y = make_friedman2(random_state=42)
>>> X.shape
(100, 4)
>>> y.shape
(100,)
>>> list(y[:3])
[np.float64(1229.4...), np.float64(27.0...), np.float64(65.6...)]
© 2007–2025 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/1.6/modules/generated/sklearn.datasets.make_friedman2.html