class sklearn.model_selection.GroupKFold(n_splits='warn')
K-fold iterator variant with non-overlapping groups.
The same group will not appear in two different folds (the number of distinct groups has to be at least equal to the number of folds).
The folds are approximately balanced in the sense that the number of distinct groups is approximately the same in each fold.
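Parameters:
n_splits : int, default=3
    Number of folds. Must be at least 2.
    Changed in version 0.20: the default value of n_splits will change from 3 to 5 in version 0.22.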
See also
LeaveOneGroupOut
Examples
>>> import numpy as np
>>> from sklearn.model_selection import GroupKFold
>>> X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
>>> y = np.array([1, 2, 3, 4])
>>> groups = np.array([0, 0, 2, 2])
>>> group_kfold = GroupKFold(n_splits=2)
>>> group_kfold.get_n_splits(X, y, groups)
2
>>> print(group_kfold)
GroupKFold(n_splits=2)
>>> for train_index, test_index in group_kfold.split(X, y, groups):
...     print("TRAIN:", train_index, "TEST:", test_index)
...     X_train, X_test = X[train_index], X[test_index]
...     y_train, y_test = y[train_index], y[test_index]
...     print(X_train, X_test, y_train, y_test)
...
TRAIN: [0 1] TEST: [2 3]
[[1 2]
 [3 4]] [[5 6]
 [7 8]] [1 2] [3 4]
TRAIN: [2 3] TEST: [0 1]
[[5 6]
 [7 8]] [[1 2]
 [3 4]] [3 4] [1 2]
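The two properties described at the top of this page (no group shared between the training and test side of a split, and roughly balanced folds) can be checked directly. The following is a minimal sketch, not part of the original documentation; the data and the variable names `test_groups_per_fold` and `seen_test_indices` are made up for illustration.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Illustrative data (an assumption): 12 samples in 6 groups of 2 samples each.
X = np.arange(24).reshape(12, 2)
y = np.arange(12)
groups = np.repeat(np.arange(6), 2)

group_kfold = GroupKFold(n_splits=3)

test_groups_per_fold = []
seen_test_indices = []
for train_index, test_index in group_kfold.split(X, y, groups):
    train_groups = set(groups[train_index])
    test_groups = set(groups[test_index])
    # No group is shared between the training and test side of a split.
    assert train_groups.isdisjoint(test_groups)
    test_groups_per_fold.append(len(test_groups))
    seen_test_indices.extend(test_index.tolist())

# Every sample lands in exactly one test fold.
assert sorted(seen_test_indices) == list(range(12))
print(test_groups_per_fold)
```

Because the groups here are all the same size, each test fold should receive the same number of distinct groups. With groups of unequal size the counts may differ between folds.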
Methods
get_n_splits([X, y, groups]) | Returns the number of splitting iterations in the cross-validator. |
split(X[, y, groups]) | Generate indices to split data into training and test sets. |
__init__(n_splits='warn')
get_n_splits(X=None, y=None, groups=None)
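Returns the number of splitting iterations in the cross-validator.
Parameters:
X : object
    Always ignored, exists for compatibility.
y : object
    Always ignored, exists for compatibility.
groups : object
    Always ignored, exists for compatibility.
Returns:
n_splits : int
    The number of splitting iterations in the cross-validator.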
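Since the signature gives every argument of get_n_splits a default of None, the method can be called with no data at all. A short sketch under that assumption (the variable names are illustrative, not from the original page):

```python
from sklearn.model_selection import GroupKFold

group_kfold = GroupKFold(n_splits=4)

# Called without any data: the result comes from the constructor argument.
n_without_args = group_kfold.get_n_splits()

# Called with data: the arguments are accepted but do not change the result,
# even though only two groups are supplied here.
n_with_args = group_kfold.get_n_splits(X=[[0], [1]], y=[0, 1], groups=[0, 1])

print(n_without_args, n_with_args)
```

Note that get_n_splits does not check whether the data could actually be split that many ways; that check happens in split.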
split(X, y=None, groups=None)
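Generate indices to split data into training and test sets.
Parameters:
X : array-like, shape (n_samples, n_features)
    Training data, where n_samples is the number of samples and n_features is the number of features.
y : array-like, shape (n_samples,), optional
    The target variable for supervised learning problems.
groups : array-like, shape (n_samples,)
    Group labels for the samples used while splitting the dataset into train/test sets.
Yields:
train : ndarray
    The training set indices for that split.
test : ndarray
    The testing set indices for that split.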
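The class description requires at least as many distinct groups as folds, and split needs group labels to work with. The sketch below shows how both conditions might surface as a ValueError in practice. It reuses the data from the example on this page, and the `errors` list is only an illustrative helper.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
y = np.array([1, 2, 3, 4])
groups = np.array([0, 0, 2, 2])  # only two distinct groups

errors = []

# split() returns a generator, so the checks run once iteration starts;
# list() forces that iteration.
try:
    list(GroupKFold(n_splits=3).split(X, y, groups))  # 3 folds, 2 groups
except ValueError as exc:
    errors.append(str(exc))

try:
    list(GroupKFold(n_splits=2).split(X, y))  # groups omitted
except ValueError as exc:
    errors.append(str(exc))

for message in errors:
    print(message)
```

The exact wording of the messages may vary between scikit-learn versions, so the sketch prints them rather than matching on their text.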
© 2007–2018 The scikit-learn developers
Licensed under the 3-clause BSD License.
http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GroupKFold.html