Rand indices

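The Rand index measures the similarity between two clusterings of the same data as the fraction of point pairs on which the two clusterings agree (both place the pair in the same cluster, or both place it in different clusters). Mathematically, it is analogous to classification accuracy, but it remains applicable when no class labels are available.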

References:

Hubert, Lawrence and Arabie, Phipps (1985). Comparing Partitions. Journal of Classification 2 (1): 193–218.

Meila, Marina (2003). Comparing Clusterings by the Variation of Information. Learning Theory and Kernel Machines: 173–187.

This package provides the randindex function that implements several metrics:

randindex(c1, c2)

Compute the tuple of indices (Adjusted Rand index, Rand index, Mirkin’s index, Hubert’s index) between two assignments.

Parameters:	c1 – The assignment vector for the first clustering.
	c2 – The assignment vector for the second clustering.
Returns:	The tuple of indices (Adjusted Rand index, Rand index, Mirkin’s index, Hubert’s index), in that order.
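
To make the four indices concrete, the following is a minimal, package-independent sketch in Python that derives them from pair counts. It is not this package's implementation. The function name pair_counting_indices is hypothetical, and two choices in it are assumptions: Mirkin's and Hubert's indices are normalized by the total number of pairs, and the adjusted Rand index is set to 0 in the degenerate case where its denominator vanishes.

```python
from collections import Counter
from math import comb


def pair_counting_indices(c1, c2):
    """Return (adjusted Rand, Rand, Mirkin, Hubert) for two assignment vectors.

    Hypothetical helper for illustration only; the normalizations used for
    Mirkin's and Hubert's indices are assumptions, not the package's definition.
    """
    if len(c1) != len(c2):
        raise ValueError("assignment vectors must have the same length")
    n = len(c1)
    if n < 2:
        raise ValueError("need at least two points to form a pair")

    total = comb(n, 2)  # number of unordered point pairs

    # Pairs placed in the same cluster by both clusterings, by the first
    # clustering only, and by the second clustering only.
    both = sum(comb(m, 2) for m in Counter(zip(c1, c2)).values())
    same1 = sum(comb(m, 2) for m in Counter(c1).values())
    same2 = sum(comb(m, 2) for m in Counter(c2).values())

    disagree = same1 + same2 - 2 * both  # pairs the clusterings split differently
    agree = total - disagree             # pairs treated the same way by both

    rand = agree / total
    mirkin = disagree / total              # assumed normalization: 1 - rand
    hubert = (agree - disagree) / total    # assumed normalization: 2 * rand - 1

    # Adjusted Rand index (Hubert and Arabie, 1985): chance-corrected agreement.
    expected = same1 * same2 / total
    max_index = (same1 + same2) / 2
    denom = max_index - expected
    adjusted = 0.0 if denom == 0 else (both - expected) / denom  # 0.0 is a convention

    return adjusted, rand, mirkin, hubert


ari, ri, mi, hi = pair_counting_indices([1, 1, 2, 2, 3, 3], [1, 1, 1, 2, 2, 2])
```

Because only co-membership of pairs is counted, the result does not depend on how the clusters are labelled: relabelling either assignment vector leaves all four values unchanged.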

randindex(R, c0): This method takes R, an instance of ClusteringResult, and computes the tuple of indices (see above) between the assignments in R and the assignment vector c0.

randindex(R1, R2): This method takes R1 and R2, both instances of ClusteringResult, and computes the tuple of indices (see above) between their assignments.