class sklearn.neighbors.NearestCentroid(metric='euclidean', shrink_threshold=None)
Nearest centroid classifier.
Each class is represented by its centroid, with test samples classified to the class with the nearest centroid.
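The rule described above can be sketched with plain NumPy. This is a minimal illustration assuming Euclidean distance and mean centroids, not the library's own implementation; the helper names fit_centroids and predict_nearest are hypothetical.

```python
import numpy as np


def fit_centroids(X, y):
    """Return (classes, centroids): one mean vector per distinct label."""
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict_nearest(X_new, classes, centroids):
    """Assign each row of X_new to the class whose centroid is closest."""
    # Pairwise Euclidean distances, shape (n_samples, n_classes).
    dists = np.linalg.norm(X_new[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]


# Toy data (illustrative): two well-separated classes in 2-D.
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]], dtype=float)
y = np.array([1, 1, 1, 2, 2, 2])

classes, centroids = fit_centroids(X, y)
pred = predict_nearest(np.array([[-0.8, -1.0], [2.5, 1.0]]), classes, centroids)
print(pred)  # [1 2]
```

Keeping only one vector per class is what makes the classifier cheap at prediction time: each new sample is compared against n_classes points rather than the whole training set.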
Read more in the User Guide.
Parameters: 


Attributes: 

See also
sklearn.neighbors.KNeighborsClassifier
When used for text classification with tfidf vectors, this classifier is also known as the Rocchio classifier.
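A Rocchio-style text classifier can be approximated by chaining a tf-idf vectorizer with NearestCentroid. The sketch below is illustrative only: the four documents, the two labels, and the use of English stop-word filtering are assumptions made for the example, not part of the original documentation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestCentroid
from sklearn.pipeline import make_pipeline

# Tiny made-up corpus: two documents per class.
docs = [
    "the cat sat on the mat",
    "cats and dogs are pets",
    "stock prices fell sharply",
    "the market and stock rallied",
]
labels = ["animals", "animals", "finance", "finance"]

# tf-idf turns each document into a sparse vector; NearestCentroid then
# stores the mean tf-idf vector of each class.
model = make_pipeline(TfidfVectorizer(stop_words="english"), NearestCentroid())
model.fit(docs, labels)

pred = model.predict(["my cat chased the dog", "stock market crash"])
print(pred)
```

Words that never appeared in training (here "chased", "dog", "crash") are simply ignored by the vectorizer, so each test document is classified by the vocabulary it shares with the training corpus.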
Tibshirani, R., Hastie, T., Narasimhan, B., & Chu, G. (2002). Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proceedings of the National Academy of Sciences of the United States of America, 99(10), 6567-6572. The National Academy of Sciences.
>>> from sklearn.neighbors import NearestCentroid
>>> import numpy as np
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> clf = NearestCentroid()
>>> clf.fit(X, y)
NearestCentroid(metric='euclidean', shrink_threshold=None)
>>> print(clf.predict([[-0.8, -1]]))
[1]
fit(X, y)                       Fit the NearestCentroid model according to the given training data.
get_params([deep])              Get parameters for this estimator.
predict(X)                      Perform classification on an array of test vectors X.
score(X, y[, sample_weight])    Returns the mean accuracy on the given test data and labels.
set_params(**params)            Set the parameters of this estimator.
__init__(metric='euclidean', shrink_threshold=None)
fit(X, y)
Fit the NearestCentroid model according to the given training data.
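As a hedged illustration of what fitting produces, the snippet below fits the classifier on toy data and compares the stored centroids_ with per-class means computed by hand. It assumes the default Euclidean metric and no shrinkage; the classes_ attribute used here is not listed in this section and is relied on as the usual scikit-learn convention for classifiers.

```python
import numpy as np
from sklearn.neighbors import NearestCentroid

# Toy data (illustrative): three samples per class.
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
y = np.array([1, 1, 1, 2, 2, 2])

clf = NearestCentroid().fit(X, y)  # fit returns the estimator itself

# One centroid row per class, ordered like clf.classes_.
print(clf.classes_)
print(clf.centroids_)

# With metric='euclidean' and shrink_threshold=None, each centroid should
# match the plain mean of the samples in that class.
manual = np.array([X[y == c].mean(axis=0) for c in clf.classes_])
print(np.allclose(clf.centroids_, manual))
```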
Parameters: 


get_params(deep=True)
Get parameters for this estimator.
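A short sketch of how the returned parameters can be used: get_params yields a dictionary keyed by constructor argument name, which can be passed back to the constructor to build an equivalent, unfitted estimator. The value 0.2 is an arbitrary choice for the example, and newer releases may report additional keys beyond the two shown in the signature above.

```python
from sklearn.neighbors import NearestCentroid

clf = NearestCentroid(shrink_threshold=0.2)

# Dictionary mapping constructor argument names to their current values.
params = clf.get_params()
print(params["metric"], params["shrink_threshold"])

# The dictionary round-trips through the constructor.
rebuilt = NearestCentroid(**params)
print(rebuilt.get_params() == params)
```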
Parameters: 


Returns: 

predict(X)
Perform classification on an array of test vectors X.
The predicted class C for each sample in X is returned.
Parameters: 


Returns: 

If the metric constructor parameter is “precomputed”, X is assumed to be the distance matrix between the data to be predicted and self.centroids_.
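To make the shape of such a distance matrix concrete, here is a NumPy-only sketch: row i holds the distances from sample i to every centroid, and the prediction is the class of the smallest entry in each row. The centroid values and test points are made up for the example, and this is not the library's internal code. Metric support differs between scikit-learn releases, so check that your version accepts "precomputed" before relying on it.

```python
import numpy as np

# Assumed centroids for two classes (illustrative values).
classes = np.array([1, 2])
centroids = np.array([[-2.0, -4.0 / 3.0], [2.0, 4.0 / 3.0]])

X_new = np.array([[-0.8, -1.0], [2.5, 1.0]])

# Distance matrix of shape (n_samples, n_classes):
# D[i, j] is the Euclidean distance from sample i to centroid j.
D = np.linalg.norm(X_new[:, None, :] - centroids[None, :, :], axis=2)

# Each sample takes the class whose centroid is closest.
pred = classes[D.argmin(axis=1)]
print(D.shape, pred)
```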
score(X, y, sample_weight=None)
Returns the mean accuracy on the given test data and labels.
In multilabel classification, this is the subset accuracy, a harsh metric because a sample counts as correct only if its entire label set is predicted correctly.
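The following sketch shows, under toy data chosen for the example, that the score agrees with the fraction of correct predictions, and that sample_weight turns it into a weighted average of the per-sample hits. The last test label is deliberately set to disagree with the nearest centroid.

```python
import numpy as np
from sklearn.neighbors import NearestCentroid

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
y = np.array([1, 1, 1, 2, 2, 2])
clf = NearestCentroid().fit(X, y)

X_test = np.array([[-0.8, -1.0], [2.5, 1.0], [-1.5, -1.0], [-0.5, -0.5]])
y_test = np.array([1, 2, 1, 2])  # last label is intentionally "wrong"

# Unweighted: fraction of samples whose predicted label equals the true one.
acc = clf.score(X_test, y_test)
hits = clf.predict(X_test) == y_test
print(acc, hits.mean())

# Weighted: the misclassified sample carries half of the total weight.
w = np.array([1.0, 1.0, 1.0, 3.0])
acc_w = clf.score(X_test, y_test, sample_weight=w)
print(acc_w, np.average(hits, weights=w))
```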
Parameters: 


Returns: 

set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
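A brief sketch of both cases, a plain estimator and a nested one. The step names "scale" and "clf" and the parameter values are arbitrary choices for the example; StandardScaler is included only to give the pipeline a second component.

```python
from sklearn.neighbors import NearestCentroid
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simple estimator: parameters are addressed by their own names.
clf = NearestCentroid()
clf.set_params(shrink_threshold=0.5)
print(clf.shrink_threshold)

# Nested object: <component>__<parameter> reaches into a named step.
pipe = Pipeline([("scale", StandardScaler()), ("clf", NearestCentroid())])
pipe.set_params(clf__shrink_threshold=0.1, scale__with_mean=False)
print(pipe.named_steps["clf"].shrink_threshold)
print(pipe.named_steps["scale"].with_mean)
```

The same double-underscore names are what parameter-search utilities expect as keys when tuning a step inside a pipeline.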
Returns: 


© 2007–2018 The scikit-learn developers
Licensed under the 3-clause BSD License.
http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestCentroid.html