Meta-transformer for selecting features based on importance weights.
Added in version 0.17.
Read more in the User Guide.
The base estimator from which the transformer is built. This can be either a fitted estimator (if prefit is set to True) or a non-fitted one. The estimator should have a feature_importances_ or coef_ attribute after fitting. Otherwise, the importance_getter parameter should be used.
The threshold value to use for feature selection. Features whose absolute importance value is greater than or equal to the threshold are kept while the others are discarded. If “median” (resp. “mean”), then the threshold value is the median (resp. the mean) of the feature importances. A scaling factor (e.g., “1.25*mean”) may also be used. If None and if the estimator has a parameter penalty set to l1, either explicitly or implicitly (e.g., Lasso), the threshold used is 1e-5. Otherwise, “mean” is used by default.
Whether a prefit model is expected to be passed into the constructor directly or not. If True, estimator must be a fitted estimator. If False, estimator is fitted and updated by calling fit and partial_fit, respectively.
Order of the norm used to filter the vectors of coefficients below threshold in the case where the coef_ attribute of the estimator is of dimension 2.
The maximum number of features to select.
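If an integer, then it specifies the maximum number of features to allow.
If a callable, then the maximum number of features is computed as max_features(X).
If None, then all features are kept.
To only select based on max_features, set threshold=-np.inf.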
Added in version 0.20.
Changed in version 1.1: max_features accepts a callable.
If ‘auto’, uses the feature importance either through a coef_ attribute or feature_importances_ attribute of estimator.
Also accepts a string that specifies an attribute name/path for extracting feature importance (implemented with attrgetter). For example, give regressor_.coef_ in case of TransformedTargetRegressor or named_steps.clf.feature_importances_ in case of Pipeline with its last step named clf.
If callable, overrides the default feature importance getter. The callable is passed with the fitted estimator and it should return importance for each feature.
Added in version 0.24.
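The two non-default forms of importance_getter described above can be sketched as follows. This is a minimal illustration, not the library's reference usage: the pipeline, its step names ("scale", "clf"), and the helper abs_coef are assumptions made for the example, and the toy data is the one used in the Examples section.

```python
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X = [[0.87, -1.34, 0.31],
     [-2.79, -0.02, -0.85],
     [-1.34, -0.48, -2.55],
     [1.92, 1.48, 0.65]]
y = [0, 1, 0, 1]

# Hypothetical pipeline; the step names "scale" and "clf" are illustrative.
pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])

# String form: an attribute path that is resolved on the fitted estimator.
by_path = SelectFromModel(pipe, importance_getter="named_steps.clf.coef_").fit(X, y)


# Callable form: receives the fitted estimator and returns one value per feature.
def abs_coef(fitted_pipe):
    return np.abs(fitted_pipe.named_steps["clf"].coef_).ravel()


by_callable = SelectFromModel(pipe, importance_getter=abs_coef).fit(X, y)

print(by_path.get_support())
print(by_callable.get_support())
```

Both selectors read the same coefficients here, so they should select the same features; the callable form is mainly useful when the importance is not stored under a single attribute path.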
The base estimator from which the transformer is built. This attribute exists only when fit has been called.
If prefit=True, it is a deep copy of estimator.
If prefit=False, it is a clone of estimator and fit on the data passed to fit or partial_fit.
n_features_in_ : int
Number of features seen during fit.
Maximum number of features calculated during fit. Only defined if the max_features is not None.
If max_features is an int, then max_features_ = max_features.
If max_features is a callable, then max_features_ = max_features(X).
Added in version 1.1.
feature_names_in_ : ndarray of shape (n_features_in_,)
Names of features seen during fit. Defined only when X has feature names that are all strings.
Added in version 1.0.
threshold_ : float
Threshold value used for feature selection.
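As a rough check of how the string forms of the threshold parameter map onto the fitted threshold_ attribute, the sketch below compares threshold_ with the median and the scaled mean of the absolute coefficients. It assumes a LogisticRegression base estimator on the toy data from the Examples section, where the importances are the absolute values of coef_.

```python
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression

X = [[0.87, -1.34, 0.31],
     [-2.79, -0.02, -0.85],
     [-1.34, -0.48, -2.55],
     [1.92, 1.48, 0.65]]
y = [0, 1, 0, 1]

median_sel = SelectFromModel(LogisticRegression(), threshold="median").fit(X, y)
scaled_sel = SelectFromModel(LogisticRegression(), threshold="1.25*mean").fit(X, y)

# For a binary linear model, the importances are the absolute coefficients.
median_importances = np.abs(median_sel.estimator_.coef_).ravel()
scaled_importances = np.abs(scaled_sel.estimator_.coef_).ravel()

# Each pair printed below should agree.
print(median_sel.threshold_, np.median(median_importances))
print(scaled_sel.threshold_, 1.25 * np.mean(scaled_importances))
```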
See also
RFE: Recursive feature elimination based on importance weights.
RFECV: Recursive feature elimination with built-in cross-validated selection of the best number of features.
SequentialFeatureSelector: Sequential cross-validation based feature selection. Does not rely on importance weights.
Allows NaN/Inf in the input if the underlying estimator does as well.
>>> from sklearn.feature_selection import SelectFromModel
>>> from sklearn.linear_model import LogisticRegression
>>> X = [[ 0.87, -1.34,  0.31 ],
...      [-2.79, -0.02, -0.85 ],
...      [-1.34, -0.48, -2.55 ],
...      [ 1.92,  1.48,  0.65 ]]
>>> y = [0, 1, 0, 1]
>>> selector = SelectFromModel(estimator=LogisticRegression()).fit(X, y)
>>> selector.estimator_.coef_
array([[-0.3252..., 0.8345..., 0.4976...]])
>>> selector.threshold_
np.float64(0.55249...)
>>> selector.get_support()
array([False, True, False])
>>> selector.transform(X)
array([[-1.34],
       [-0.02],
       [-0.48],
       [ 1.48]])
Using a callable to create a selector that can use no more than half of the input features.
>>> def half_callable(X):
...     return round(len(X[0]) / 2)
>>> half_selector = SelectFromModel(estimator=LogisticRegression(),
...                                 max_features=half_callable)
>>> _ = half_selector.fit(X, y)
>>> half_selector.max_features_
2
Fit the SelectFromModel meta-transformer.
The training input samples.
The target values (integers that correspond to classes in classification, real numbers in regression).
If enable_metadata_routing=False (default): Parameters directly passed to the fit method of the sub-estimator. They are ignored if prefit=True.
If enable_metadata_routing=True: Parameters safely routed to the fit method of the sub-estimator. They are ignored if prefit=True.
Changed in version 1.4: See Metadata Routing User Guide for more details.
Fitted estimator.
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Input samples.
Target values (None for unsupervised transformations).
Additional fit parameters.
Transformed array.
Mask feature names according to selected features.
Input features.
If input_features is None, then feature_names_in_ is used as the input feature names. If feature_names_in_ is not defined, then the following input feature names are generated: ["x0", "x1", ..., "x(n_features_in_ - 1)"].
If input_features is an array-like, then input_features must match feature_names_in_ if feature_names_in_ is defined.
Transformed feature names.
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
Added in version 1.4.
A MetadataRouter encapsulating routing information.
Get parameters for this estimator.
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Parameter names mapped to their values.
Get a mask, or integer index, of the features selected.
If True, the return value will be an array of integers, rather than a boolean mask.
An index that selects the retained features from a feature vector. If indices is False, this is a boolean array of shape [# input features], in which an element is True iff its corresponding feature is selected for retention. If indices is True, this is an integer array of shape [# output features] whose values are indices into the input feature vector.
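The two return forms of get_support carry the same information. A minimal sketch on the toy data, showing that the integer indices are the positions of the True entries in the mask:

```python
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression

X = [[0.87, -1.34, 0.31],
     [-2.79, -0.02, -0.85],
     [-1.34, -0.48, -2.55],
     [1.92, 1.48, 0.65]]
y = [0, 1, 0, 1]

selector = SelectFromModel(LogisticRegression()).fit(X, y)

mask = selector.get_support()             # boolean, one entry per input feature
idx = selector.get_support(indices=True)  # integer positions of the kept features

print(mask)
print(idx)
```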
Reverse the transformation operation.
The input samples.
X with columns of zeros inserted where features would have been removed by transform.
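To make the zero-filling concrete, the following sketch applies transform and then inverse_transform to the toy data and compares the result with the original array column by column.

```python
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression

X = np.array([[0.87, -1.34, 0.31],
              [-2.79, -0.02, -0.85],
              [-1.34, -0.48, -2.55],
              [1.92, 1.48, 0.65]])
y = [0, 1, 0, 1]

selector = SelectFromModel(LogisticRegression()).fit(X, y)

X_reduced = selector.transform(X)
X_back = selector.inverse_transform(X_reduced)

# Kept columns are restored in place; removed columns come back as zeros.
print(X_back)
```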
Number of features seen during fit.
Fit the SelectFromModel meta-transformer only once.
The training input samples.
The target values (integers that correspond to classes in classification, real numbers in regression).
If enable_metadata_routing=False (default): Parameters directly passed to the partial_fit method of the sub-estimator.
If enable_metadata_routing=True: Parameters passed to the partial_fit method of the sub-estimator. They are ignored if prefit=True.
Changed in version 1.4: **partial_fit_params are routed to the sub-estimator, if enable_metadata_routing=True is set via set_config, which allows for aliasing.
See Metadata Routing User Guide for more details.
Fitted estimator.
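partial_fit is only available when the base estimator itself implements partial_fit. The sketch below assumes SGDClassifier as such an estimator and feeds the toy data in two batches; with metadata routing disabled (the default), the classes argument is passed straight through to the sub-estimator.

```python
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import SGDClassifier

X = np.array([[0.87, -1.34, 0.31],
              [-2.79, -0.02, -0.85],
              [-1.34, -0.48, -2.55],
              [1.92, 1.48, 0.65]])
y = np.array([0, 1, 0, 1])

selector = SelectFromModel(SGDClassifier(random_state=0))

# First batch: SGDClassifier.partial_fit needs the full set of classes up front.
selector.partial_fit(X[:2], y[:2], classes=np.array([0, 1]))
# Later batches update the same underlying model.
selector.partial_fit(X[2:], y[2:])

print(selector.get_support())
```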
Set output container.
See Introducing the set_output API for an example on how to use the API.
Configure output of transform and fit_transform.
"default": Default output format of a transformer
"pandas": DataFrame output
"polars": Polars output
None: Transform configuration is unchanged
Added in version 1.4: "polars" option was added.
Estimator instance.
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Estimator parameters.
Estimator instance.
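The <component>__<parameter> form applies to SelectFromModel as well, since it wraps an estimator. A small sketch, assuming a LogisticRegression base estimator, that updates one parameter of the selector and one of the wrapped model, then reads them back with get_params:

```python
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression

selector = SelectFromModel(LogisticRegression())

# "threshold" belongs to the selector; "estimator__C" reaches into the wrapped model.
selector.set_params(threshold="median", estimator__C=0.1)

params = selector.get_params(deep=True)
print(params["threshold"], params["estimator__C"])
```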
Threshold value used for feature selection.
Reduce X to the selected features.
The input samples.
The input samples with only the selected features.
© 2007–2025 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/1.6/modules/generated/sklearn.feature_selection.SelectFromModel.html