Linear Discriminant Analysis.
A classifier with a linear decision boundary, generated by fitting class conditional densities to the data and using Bayes’ rule.
The model fits a Gaussian density to each class, assuming that all classes share the same covariance matrix.
The fitted model can also be used to reduce the dimensionality of the input by projecting it to the most discriminative directions, using the transform method.
Added in version 0.17.
For a comparison between LinearDiscriminantAnalysis and QuadraticDiscriminantAnalysis, see Linear and Quadratic Discriminant Analysis with covariance ellipsoid.
Read more in the User Guide.
Changed in version 1.2: solver="svd" now has experimental Array API support. See the Array API User Guide for more details.
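As a minimal illustrative sketch of the dimensionality-reduction use described above (the iris data and n_components=2 are arbitrary choices, not a prescribed setup):

>>> from sklearn.datasets import load_iris
>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> X, y = load_iris(return_X_y=True)
>>> lda = LinearDiscriminantAnalysis(n_components=2)
>>> X_new = lda.fit(X, y).transform(X)
>>> X_new.shape  # at most min(n_classes - 1, n_features) = 2 components
(150, 2)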
Shrinkage parameter: None for no shrinkage (the default), ‘auto’ for automatic shrinkage using the Ledoit-Wolf lemma, or a float between 0 and 1 for a fixed shrinkage constant. This should be left to None if covariance_estimator is used. Note that shrinkage works only with the ‘lsqr’ and ‘eigen’ solvers.
For a usage example, see Normal, Ledoit-Wolf and OAS Linear Discriminant Analysis for classification.
The class prior probabilities. By default, the class proportions are inferred from the training data.
Number of components (<= min(n_classes - 1, n_features)) for dimensionality reduction. If None, will be set to min(n_classes - 1, n_features). This parameter only affects the transform method.
For a usage example, see Comparison of LDA and PCA 2D projection of Iris dataset.
If store_covariance is True, explicitly compute the weighted within-class covariance matrix when solver is ‘svd’. The matrix is always computed and stored for the other solvers.
Added in version 0.17.
Absolute threshold for a singular value of X to be considered significant, used to estimate the rank of X. Dimensions whose singular values are non-significant are discarded. Only used if solver is ‘svd’.
Added in version 0.17.
If not None, covariance_estimator is used to estimate the covariance matrices instead of relying on the empirical covariance estimator (with potential shrinkage). The object should have a fit method and a covariance_ attribute, like the estimators in sklearn.covariance. If None, the shrinkage parameter drives the estimate.
This should be left to None if shrinkage is used. Note that covariance_estimator works only with ‘lsqr’ and ‘eigen’ solvers.
Added in version 0.24.
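A minimal sketch of the two mutually exclusive regularization routes with the ‘lsqr’ solver, using the OAS estimator from sklearn.covariance; the toy data is an arbitrary illustrative choice:

>>> import numpy as np
>>> from sklearn.covariance import OAS
>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> # Either set the shrinkage parameter...
>>> clf_shrink = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto").fit(X, y)
>>> # ...or plug in a covariance estimator (shrinkage must then stay None).
>>> clf_oas = LinearDiscriminantAnalysis(
...     solver="lsqr", covariance_estimator=OAS()).fit(X, y)
>>> print(clf_shrink.predict([[-0.8, -1]]), clf_oas.predict([[-0.8, -1]]))
[1] [1]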
Weight vector(s).
Intercept term.
Weighted within-class covariance matrix. It corresponds to sum_k prior_k * C_k where C_k is the covariance matrix of the samples in class k. The C_k are estimated using the (potentially shrunk) biased estimator of covariance. If solver is ‘svd’, only exists when store_covariance is True.
Percentage of variance explained by each of the selected components. If n_components is not set then all components are stored and the sum of explained variances is equal to 1.0. Only available when eigen or svd solver is used.
Class-wise means.
Class priors (sum to 1).
Scaling of the features in the space spanned by the class centroids. Only available for ‘svd’ and ‘eigen’ solvers.
Overall mean. Only present if solver is ‘svd’.
Unique class labels.
Number of features seen during fit.
Added in version 0.24.
Names of features seen during fit, as an ndarray of shape (n_features_in_,). Defined only when X has feature names that are all strings.
Added in version 1.0.
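The covariance_ attribute above corresponds to sum_k prior_k * C_k with biased per-class covariance estimates; a minimal check of this relationship on arbitrary toy data with the ‘lsqr’ solver (illustrative only):

>>> import numpy as np
>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> X = np.array([[-1., -1.], [-2., -1.], [-3., -2.], [1., 1.], [2., 1.], [3., 2.]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> lda = LinearDiscriminantAnalysis(solver="lsqr").fit(X, y)
>>> within = sum(np.mean(y == k) * np.cov(X[y == k], rowvar=False, bias=True)
...              for k in lda.classes_)
>>> bool(np.allclose(lda.covariance_, within))
True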
See also
QuadraticDiscriminantAnalysis : Quadratic Discriminant Analysis.
Examples

>>> import numpy as np
>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> clf = LinearDiscriminantAnalysis()
>>> clf.fit(X, y)
LinearDiscriminantAnalysis()
>>> print(clf.predict([[-0.8, -1]]))
[1]
Apply decision function to an array of samples.
The decision function is equal (up to a constant factor) to the log-posterior of the model, i.e. log p(y = k | x). In a binary classification setting this instead corresponds to the difference log p(y = 1 | x) - log p(y = 0 | x). See Mathematical formulation of the LDA and QDA classifiers.
Array of samples (test vectors).
Decision function values related to each class, per sample. In the two-class case, the shape is (n_samples,), giving the log likelihood ratio of the positive class.
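A minimal sketch of the binary case, where the sign of the decision value determines the predicted class (toy data chosen arbitrarily for illustration):

>>> import numpy as np
>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> clf = LinearDiscriminantAnalysis().fit(X, y)
>>> scores = clf.decision_function(X)
>>> scores.shape  # one value per sample in the two-class case
(6,)
>>> np.array_equal(clf.predict(X),
...                np.where(scores > 0, clf.classes_[1], clf.classes_[0]))
True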
Fit the Linear Discriminant Analysis model.
Changed in version 0.19: store_covariance and tol have been moved to the main constructor.
Training data.
Target values.
Fitted estimator.
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Input samples.
Target values (None for unsupervised transformations).
Additional fit parameters.
Transformed array.
Get output feature names for transformation.
The feature names out will be prefixed by the lowercased class name. For example, if the transformer outputs 3 features, then the feature names out are: ["class_name0", "class_name1", "class_name2"].
Only used to validate feature names with the names seen in fit.
Transformed feature names.
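For illustration, assuming a transformer fitted on the iris data with two output components, the names are prefixed by the lowercased class name:

>>> from sklearn.datasets import load_iris
>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> X, y = load_iris(return_X_y=True)
>>> lda = LinearDiscriminantAnalysis(n_components=2).fit(X, y)
>>> list(lda.get_feature_names_out())
['lineardiscriminantanalysis0', 'lineardiscriminantanalysis1']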
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
A MetadataRequest encapsulating routing information.
Get parameters for this estimator.
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Parameter names mapped to their values.
Predict class labels for samples in X.
The data matrix for which we want to get the predictions.
Vector containing the class labels for each sample.
Estimate log probability.
Input data.
Estimated log probabilities.
Estimate probability.
Input data.
Estimated probabilities.
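A minimal sketch showing that the estimated probabilities sum to one per sample and that predict_log_proba agrees with the log of predict_proba (toy data chosen arbitrarily):

>>> import numpy as np
>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> clf = LinearDiscriminantAnalysis().fit(X, y)
>>> proba = clf.predict_proba([[-0.8, -1]])
>>> proba.shape  # one row per sample, one column per class
(1, 2)
>>> bool(np.allclose(proba.sum(axis=1), 1.0))
True
>>> bool(np.allclose(clf.predict_log_proba([[-0.8, -1]]), np.log(proba)))
True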
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires the entire label set for each sample to be predicted correctly.
Test samples.
True labels for X.
Sample weights.
Mean accuracy of self.predict(X) w.r.t. y.
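A minimal sketch on a small, separable training set, where the mean training accuracy is 1.0 (illustrative only):

>>> import numpy as np
>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> float(LinearDiscriminantAnalysis().fit(X, y).score(X, y))
1.0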
Set output container.
See Introducing the set_output API for an example on how to use the API.
Configure output of transform and fit_transform.
"default": Default output format of a transformer"pandas": DataFrame output"polars": Polars outputNone: Transform configuration is unchangedAdded in version 1.4: "polars" option was added.
Estimator instance.
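A minimal sketch of pandas output, assuming pandas is installed (the iris data and n_components=2 are illustrative choices):

>>> from sklearn.datasets import load_iris
>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> X, y = load_iris(return_X_y=True)
>>> lda = LinearDiscriminantAnalysis(n_components=2).set_output(transform="pandas")
>>> frame = lda.fit_transform(X, y)
>>> type(frame).__name__
'DataFrame'
>>> list(frame.columns)
['lineardiscriminantanalysis0', 'lineardiscriminantanalysis1']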
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Estimator parameters.
Estimator instance.
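A minimal sketch of nested parameter updates inside a Pipeline (the step names "scaler" and "lda" are arbitrary illustrative choices):

>>> from sklearn.pipeline import Pipeline
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> pipe = Pipeline([("scaler", StandardScaler()),
...                  ("lda", LinearDiscriminantAnalysis())])
>>> _ = pipe.set_params(lda__solver="lsqr", lda__shrinkage="auto")
>>> pipe.get_params()["lda__shrinkage"]
'auto'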
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
Metadata routing for sample_weight parameter in score.
The updated object.
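A minimal sketch, assuming metadata routing has been enabled via sklearn.config_context; a meta-estimator (for example a cross-validation utility) would then route sample_weight to score:

>>> import sklearn
>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> with sklearn.config_context(enable_metadata_routing=True):
...     clf = LinearDiscriminantAnalysis().set_score_request(sample_weight=True)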
Project data to maximize class separation.
Input data.
Transformed data. In the case of the ‘svd’ solver, the shape is (n_samples, min(rank, n_components)).