Set-Similarity
-
Overlap Coefficient
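Set similarity metric: intersection size / size of the smaller set, i.e. |A ∩ B| / min(|A|, |B|). Symmetric in its arguments; equals 1 whenever one set is a subset of the other, so it emphasizes containment more than Jaccard does.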
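A minimal Python sketch of the formula above. The function name `overlap_coefficient` and the choice to return 0.0 when either set is empty are assumptions made for illustration, not part of the definition.

```python
def overlap_coefficient(a, b):
    """Return |A ∩ B| / min(|A|, |B|) for two iterables treated as sets."""
    a, b = set(a), set(b)
    if not a or not b:
        # Assumption: the ratio is undefined for an empty set; report 0.0.
        return 0.0
    return len(a & b) / min(len(a), len(b))


# A subset scores 1.0 regardless of how much larger the other set is.
print(overlap_coefficient({1, 2}, {1, 2, 3, 4}))  # 1.0
print(overlap_coefficient({1, 2, 3}, {3, 4}))     # 0.5
```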
-
MinHash
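Probabilistic estimate of Jaccard similarity from minimum hash values: under a random hash function, the probability that two sets share the same minimum hash equals their Jaccard similarity. Compact fixed-size signatures enable fast approximate comparison in streaming or large-scale settings.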
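One way to sketch the idea in Python, assuming seeded BLAKE2b digests stand in for independent random hash functions. The names `minhash_signature` and `estimate_jaccard` and the default of 128 hashes are illustrative choices, not a reference implementation.

```python
import hashlib


def _seeded_hash(item, seed):
    # Assumption: prefixing the seed simulates a family of hash functions.
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=128):
    """Return one minimum hash value per seeded hash function."""
    items = set(items)
    if not items:
        raise ValueError("cannot build a signature for an empty set")
    return [min(_seeded_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions on which the two sets agree."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


a = minhash_signature(range(0, 100))
b = minhash_signature(range(50, 150))
# The true Jaccard similarity is 50 / 150; the estimate should land nearby.
print(estimate_jaccard(a, b))
```

More hash functions reduce the variance of the estimate at the cost of a longer signature.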
-
Jaccard Similarity
Set overlap metric: intersection size / union size, i.e. |A ∩ B| / |A ∪ B|. Ranges from 0 (disjoint sets) to 1 (identical sets); ignores element order and duplicates.
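A minimal Python sketch of the formula above. The name `jaccard_similarity` and the convention that two empty sets score 1.0 are assumptions made for illustration.

```python
def jaccard_similarity(a, b):
    """Return |A ∩ B| / |A ∪ B| for two iterables treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumption: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)


# Order and duplicates in the inputs do not affect the result.
print(jaccard_similarity([1, 2, 2, 3], [3, 2, 4]))  # 0.5
```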
-
Dice Coefficient
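Set similarity metric: twice the number of shared elements / sum of the two set sizes, i.e. 2|A ∩ B| / (|A| + |B|). Monotonically related to Jaccard J by D = 2J / (1 + J), so it ranks pairs identically but gives a higher score to any partial overlap.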
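A minimal Python sketch of the formula above. The name `dice_coefficient` and the convention that two empty sets score 1.0 are assumptions made for illustration.

```python
def dice_coefficient(a, b):
    """Return 2|A ∩ B| / (|A| + |B|) for two iterables treated as sets."""
    a, b = set(a), set(b)
    total = len(a) + len(b)
    if total == 0:
        # Assumption: two empty sets are treated as identical.
        return 1.0
    return 2 * len(a & b) / total


a, b = {1, 2, 3}, {2, 3, 4}
d = dice_coefficient(a, b)
j = len(a & b) / len(a | b)  # Jaccard similarity of the same pair, here 0.5
print(d)                     # 2 * 2 / 6, about 0.667
print(2 * j / (1 + j))       # same value via D = 2J / (1 + J)
```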