class sklearn.model_selection.GroupKFold(n_splits='warn')
K-fold iterator variant with non-overlapping groups.
The same group will not appear in two different folds (the number of distinct groups has to be at least equal to the number of folds).
The folds are approximately balanced in the sense that the number of distinct groups is approximately the same in each fold.
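The two properties above can be checked directly. The following is a minimal sketch on hypothetical toy data (the arrays `X`, `y` and `groups` are made up for illustration); it verifies that no group is shared between the training and test sides of a split, and that requesting more folds than there are distinct groups raises a `ValueError`.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Hypothetical toy data: 10 samples belonging to 5 groups of unequal size.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)
groups = np.array([0, 0, 0, 1, 1, 2, 2, 3, 4, 4])

gkf = GroupKFold(n_splits=3)
fold_groups = []
for train_idx, test_idx in gkf.split(X, y, groups):
    train_groups = set(groups[train_idx])
    test_groups = set(groups[test_idx])
    # No group appears on both sides of the same split.
    assert train_groups.isdisjoint(test_groups)
    fold_groups.append(test_groups)

# Across all folds, every group is used as test data exactly once.
all_test_groups = sorted(g for fold in fold_groups for g in fold)

# Asking for more folds (6) than distinct groups (5) is rejected.
try:
    list(GroupKFold(n_splits=6).split(X, y, groups))
    raised = False
except ValueError:
    raised = True
```

Which group lands in which fold is an implementation detail, so the sketch only checks the invariants rather than specific indices.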
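Parameters:
n_splits : int, default=3
    Number of folds. Must be at least 2.
    Changed in version 0.20: n_splits default value will change from 3 to 5 in v0.22.
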
See also
LeaveOneGroupOut
Examples
>>> import numpy as np
>>> from sklearn.model_selection import GroupKFold
>>> X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
>>> y = np.array([1, 2, 3, 4])
>>> groups = np.array([0, 0, 2, 2])
>>> group_kfold = GroupKFold(n_splits=2)
>>> group_kfold.get_n_splits(X, y, groups)
2
>>> print(group_kfold)
GroupKFold(n_splits=2)
>>> for train_index, test_index in group_kfold.split(X, y, groups):
...     print("TRAIN:", train_index, "TEST:", test_index)
...     X_train, X_test = X[train_index], X[test_index]
...     y_train, y_test = y[train_index], y[test_index]
...     print(X_train, X_test, y_train, y_test)
...
TRAIN: [0 1] TEST: [2 3]
[[1 2]
 [3 4]] [[5 6]
 [7 8]] [1 2] [3 4]
TRAIN: [2 3] TEST: [0 1]
[[5 6]
 [7 8]] [[1 2]
 [3 4]] [3 4] [1 2]
Methods
get_n_splits([X, y, groups])  Returns the number of splitting iterations in the cross-validator.
split(X[, y, groups])  Generate indices to split data into training and test set.
__init__(n_splits='warn')
get_n_splits(X=None, y=None, groups=None)
Returns the number of splitting iterations in the cross-validator.
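Parameters:
X : object
    Always ignored, exists for compatibility.
y : object
    Always ignored, exists for compatibility.
groups : object
    Always ignored, exists for compatibility.
Returns:
n_splits : int
    The number of splitting iterations in the cross-validator.
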
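Because every argument of get_n_splits defaults to None, the method can be called without any data. A minimal sketch (the variable names are illustrative only):

```python
from sklearn.model_selection import GroupKFold

cv = GroupKFold(n_splits=4)

# The result depends only on the n_splits value given to the constructor,
# so no data needs to be passed.
n_iterations = cv.get_n_splits()
```

This is convenient when the number of iterations is needed up front, for example to preallocate an array of per-fold scores.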
split(X, y=None, groups=None)
Generate indices to split data into training and test set.
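Parameters:
X : array-like, shape (n_samples, n_features)
    Training data, where n_samples is the number of samples and n_features is the number of features.
y : array-like, shape (n_samples,), optional
    The target variable for supervised learning problems.
groups : array-like, with shape (n_samples,), optional
    Group labels for the samples used while splitting the dataset into train/test set.
Yields:
train : ndarray
    The training set indices for that split.
test : ndarray
    The testing set indices for that split.
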
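In practice split is often not called by hand. Instead, the splitter is passed as `cv` to a helper such as `cross_val_score`, together with the group labels. The sketch below uses hypothetical random data and a `DummyRegressor` purely as a placeholder estimator; both are assumptions made for illustration.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.model_selection import GroupKFold, cross_val_score

# Hypothetical data: 12 samples from 4 groups (e.g. 4 subjects, 3 measurements each).
rng = np.random.RandomState(0)
X = rng.rand(12, 2)
y = rng.rand(12)
groups = np.repeat([0, 1, 2, 3], 3)

cv = GroupKFold(n_splits=4)

# Manual iteration: each item is a (train indices, test indices) pair.
test_group_counts = []
for train_idx, test_idx in cv.split(X, y, groups):
    # With 4 equally sized groups and 4 folds, each test fold holds one whole group.
    test_group_counts.append(len(np.unique(groups[test_idx])))

# The same splitter used through cross_val_score; groups is passed alongside X and y.
scores = cross_val_score(DummyRegressor(), X, y, groups=groups, cv=cv)
```

Passing `groups` explicitly matters here: without group labels GroupKFold has nothing to split on.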
© 2007–2018 The scikit-learn developers
Licensed under the 3-clause BSD License.
http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GroupKFold.html