Create a callable to select columns to be used with ColumnTransformer.
make_column_selector can select columns based on data type or on column names matched with a regex. When using multiple selection criteria, all criteria must match for a column to be selected.
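The following is a small sketch of how the criteria combine, not part of the original reference. The DataFrame and its column names (`price_usd`, `price_is_promo`, `stock_count`) are made up for illustration. A column is kept only if it satisfies both the name pattern and the dtype filter.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_selector

# Hypothetical data: two columns start with "price", two are numeric,
# but only one column satisfies both conditions.
X = pd.DataFrame({
    "price_usd": [1.5, 2.0, 3.25],          # float, name matches
    "price_is_promo": [True, False, True],  # bool, name matches
    "stock_count": [10, 0, 7],              # int, name does not match
})

selector = make_column_selector(pattern="^price", dtype_include=np.number)

# Calling the selector on a DataFrame returns the selected column names.
selected = selector(X)
print(selected)  # ['price_usd']
```

`price_is_promo` is dropped because a boolean column is not covered by `np.number`, and `stock_count` is dropped because its name does not match the pattern.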
For an example of how to use make_column_selector within a ColumnTransformer to select columns based on data type (i.e. dtype), refer to Column Transformer with Mixed Types.
Names of columns containing this regex pattern will be included. If None, columns will not be selected based on pattern.
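As a hedged illustration of "containing", the sketch below assumes the pattern is searched for anywhere in the column name (regex search rather than a full match), so an anchor such as `^` is needed to restrict the match to a prefix. The column names are invented for the example.

```python
import pandas as pd
from sklearn.compose import make_column_selector

# Hypothetical columns; only the names matter here.
X = pd.DataFrame({
    "rating_avg": [4.5, 3.0],
    "user_rating": [5, 3],
    "city_id": [1, 2],
})

# Unanchored pattern: any column whose name contains "rating".
anywhere = make_column_selector(pattern="rating")(X)

# Anchored pattern: only names that start with "rating".
prefix_only = make_column_selector(pattern="^rating")(X)

print(anywhere)     # ['rating_avg', 'user_rating']
print(prefix_only)  # ['rating_avg']
```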
A selection of dtypes to include. For more details, see pandas.DataFrame.select_dtypes.
A selection of dtypes to exclude. For more details, see pandas.DataFrame.select_dtypes.
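A minimal sketch of the two dtype parameters used as complements of each other. The data and column names are assumptions made for illustration; the dtype arguments are passed through to `pandas.DataFrame.select_dtypes`, as the descriptions above indicate.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_selector

# Hypothetical data with an integer, a float, and a boolean column.
X = pd.DataFrame({
    "age": [25, 32, 47],
    "income": [30000.0, 52000.5, 61000.0],
    "is_member": [True, False, True],
})

# Keep only numeric columns (integers and floats; booleans are not np.number).
numeric = make_column_selector(dtype_include=np.number)(X)

# Keep everything except numeric columns.
non_numeric = make_column_selector(dtype_exclude=np.number)(X)

print(numeric)      # ['age', 'income']
print(non_numeric)  # ['is_member']
```

Used with the same dtype, the two selectors split the columns into disjoint groups, which can be convenient when each group is routed to a different transformer.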
Callable for column selection to be used by a ColumnTransformer.
See also
ColumnTransformer: Class that allows combining the outputs of multiple transformer objects used on column subsets of the data into a single feature space.
>>> from sklearn.preprocessing import StandardScaler, OneHotEncoder
>>> from sklearn.compose import make_column_transformer
>>> from sklearn.compose import make_column_selector
>>> import numpy as np
>>> import pandas as pd
>>> X = pd.DataFrame({'city': ['London', 'London', 'Paris', 'Sallisaw'],
...                   'rating': [5, 3, 4, 5]})
>>> ct = make_column_transformer(
...       (StandardScaler(),
...        make_column_selector(dtype_include=np.number)),  # rating
...       (OneHotEncoder(),
...        make_column_selector(dtype_include=object)))  # city
>>> ct.fit_transform(X)
array([[ 0.90453403,  1.        ,  0.        ,  0.        ],
       [-1.50755672,  1.        ,  0.        ,  0.        ],
       [-0.30151134,  0.        ,  1.        ,  0.        ],
       [ 0.90453403,  0.        ,  0.        ,  1.        ]])
Callable for column selection to be used by a ColumnTransformer.
DataFrame to select columns from.
© 2007–2025 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/1.6/modules/generated/sklearn.compose.make_column_selector.html