W3cubDocs

homogeneity_score

sklearn.metrics.homogeneity_score(labels_true, labels_pred)[source]

Homogeneity metric of a cluster labeling given a ground truth.

A clustering result satisfies homogeneity if all of its clusters contain only data points which are members of a single class.

This metric is independent of the absolute values of the labels: a permutation of the class or cluster label values won’t change the score value in any way.

This metric is not symmetric: switching label_true with label_pred will return the completeness_score which will be different in general.

References

[1]

Andrew Rosenberg and Julia Hirschberg, 2007. V-Measure: A conditional entropy-based external cluster evaluation measure

Examples

Perfect labelings are homogeneous:

>>> from sklearn.metrics.cluster import homogeneity_score
>>> homogeneity_score([0, 0, 1, 1], [1, 1, 0, 0])
np.float64(1.0)

Non-perfect labelings that further split classes into more clusters can be perfectly homogeneous:

>>> print("%.6f" % homogeneity_score([0, 0, 1, 1], [0, 0, 1, 2]))
1.000000
>>> print("%.6f" % homogeneity_score([0, 0, 1, 1], [0, 1, 2, 3]))
1.000000

Clusters that include samples from different classes do not make for an homogeneous labeling:

>>> print("%.6f" % homogeneity_score([0, 0, 1, 1], [0, 1, 0, 1]))
0.0...
>>> print("%.6f" % homogeneity_score([0, 0, 1, 1], [0, 0, 0, 0]))
0.0...

Gallery examples

A demo of K-Means clustering on the handwritten digits data

Demo of DBSCAN clustering algorithm

Demo of affinity propagation clustering algorithm

Clustering text documents using k-means

© 2007–2025 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/1.6/modules/generated/sklearn.metrics.homogeneity_score.html