Validation curve.
Determine training and test scores for varying parameter values.
Compute scores for an estimator with different values of a specified parameter. This is similar to grid search with one parameter. However, this will also compute training scores and is merely a utility for plotting the results.
Read more in the User Guide.
An object of that type which is cloned for each validation. It must also implement “predict” unless scoring is a callable that doesn’t rely on “predict” to compute a score.
Training vector, where n_samples is the number of samples and n_features is the number of features.
Target relative to X for classification or regression; None for unsupervised learning.
Name of the parameter that will be varied.
The values of the parameter that will be evaluated.
Group labels for the samples used while splitting the dataset into train/test set. Only used in conjunction with a “Group” cv instance (e.g., GroupKFold).
Changed in version 1.6: groups can only be passed if metadata routing is not enabled via sklearn.set_config(enable_metadata_routing=True). When routing is enabled, pass groups alongside other metadata via the params argument instead. E.g.: validation_curve(..., params={'groups': groups}).
Determines the cross-validation splitting strategy. Possible inputs for cv are:
- None, to use the default 5-fold cross-validation,
- int, to specify the number of folds in a (Stratified)KFold,
- a CV splitter,
- an iterable yielding (train, test) splits as arrays of indices.

For int/None inputs, if the estimator is a classifier and y is either binary or multiclass, StratifiedKFold is used. In all other cases, KFold is used. These splitters are instantiated with shuffle=False, so the splits will be the same across calls.
Refer to the User Guide for the various cross-validation strategies that can be used here.
Changed in version 0.22: cv default value if None changed from 3-fold to 5-fold.
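The accepted forms of cv can be sketched as follows; this is an illustrative comparison on synthetic data, not part of the official example:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, validation_curve

X, y = make_classification(n_samples=100, random_state=0)
est = LogisticRegression()
param_range = [0.01, 1.0, 100.0]

# cv as an int: 3 folds (StratifiedKFold here, since y is binary)
tr_a, te_a = validation_curve(est, X, y, param_name="C",
                              param_range=param_range, cv=3)

# cv as a splitter object
tr_b, te_b = validation_curve(est, X, y, param_name="C",
                              param_range=param_range, cv=KFold(n_splits=3))

# cv as an iterable of precomputed (train, test) index arrays
splits = list(KFold(n_splits=3).split(X))
tr_c, te_c = validation_curve(est, X, y, param_name="C",
                              param_range=param_range, cv=splits)
```

Each call returns train and test score arrays of shape `(len(param_range), n_splits)`. The last two calls use identical splits (KFold with `shuffle=False` is deterministic), so their scores match.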
A str (see The scoring parameter: defining model evaluation rules) or a scorer callable object / function with signature scorer(estimator, X, y).
Number of jobs to run in parallel. Training the estimator and computing the score are parallelized over the combinations of each parameter value and each cross-validation split. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details.
Number of pre-dispatched jobs for parallel execution (default is all). This option can reduce the allocated memory. The str can be an expression such as '2*n_jobs'.
Controls the verbosity: the higher, the more messages.
Value to assign to the score if an error occurs in estimator fitting. If set to ‘raise’, the error is raised. If a numeric value is given, FitFailedWarning is raised.
Added in version 0.20.
Parameters to pass to the fit method of the estimator.
Deprecated since version 1.6: This parameter is deprecated and will be removed in version 1.8. Use params instead.
Parameters to pass to the estimator, scorer and cross-validation object.
- enable_metadata_routing=False (default): Parameters directly passed to the fit method of the estimator.
- enable_metadata_routing=True: Parameters safely routed to the fit method of the estimator, to the scorer and to the cross-validation object. See Metadata Routing User Guide for more details.

Added in version 1.6.
Scores on training sets.
Scores on test set.
See Effect of model regularization on training and test error.
>>> import numpy as np
>>> from sklearn.datasets import make_classification
>>> from sklearn.model_selection import validation_curve
>>> from sklearn.linear_model import LogisticRegression
>>> X, y = make_classification(n_samples=1_000, random_state=0)
>>> logistic_regression = LogisticRegression()
>>> param_name, param_range = "C", np.logspace(-8, 3, 10)
>>> train_scores, test_scores = validation_curve(
... logistic_regression, X, y, param_name=param_name, param_range=param_range
... )
>>> print(f"The average train accuracy is {train_scores.mean():.2f}")
The average train accuracy is 0.81
>>> print(f"The average test accuracy is {test_scores.mean():.2f}")
The average test accuracy is 0.81
© 2007–2025 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/1.6/modules/generated/sklearn.model_selection.validation_curve.html