Liu Yang, Rong Jin, Rahul Sukthankar, Yi Liu
Learning application-specific distance metrics from labeled data is critical for both statistical classification and information retrieval. Most earlier work in this area has focused on finding metrics that simultaneously optimize compactness and separability in a global sense. Specifically, such distance metrics attempt to keep all of the data points in each class close together while ensuring that data points from different classes are separated. However, particularly when classes exhibit multimodal data distributions, these goals conflict and thus cannot be simultaneously satisfied. This paper proposes a Local Distance Metric (LDM) that aims to optimize local compactness and local separability. We present an efficient algorithm that employs eigenvector analysis and bound optimization to learn the LDM from training data in a probabilistic framework. We demonstrate that LDM achieves significant improvements in both classification and retrieval accuracy compared to global distance learning and kernel-based KNN.
Subjects: 12. Machine Learning and Discovery
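The conflict between global compactness and separability described in the abstract can be illustrated with a toy example. The sketch below is not the LDM algorithm from the paper; it only assumes a generic Mahalanobis-style distance d_A(x, y) = sqrt((x - y)^T A (x - y)) and a hand-made data set in which class 0 has two modes with class 1 lying between them. All names (`mahalanobis_distance`, `knn_predict`, `A_collapse`) and all data values are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def mahalanobis_distance(x, y, A):
    """Return sqrt((x - y)^T A (x - y)) for a positive semidefinite matrix A."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    n = len(diff)
    quad = sum(diff[i] * A[i][j] * diff[j] for i in range(n) for j in range(n))
    return math.sqrt(quad)


def knn_predict(query, points, labels, A, k=3):
    """Majority vote among the k training points nearest to query under A."""
    order = sorted(range(len(points)),
                   key=lambda i: mahalanobis_distance(query, points[i], A))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Illustrative toy data: class 0 is bimodal, class 1 sits between its two modes.
points = [(-3.0, 0.0), (-3.2, 0.1), (3.0, 0.0), (3.1, -0.1), (0.0, 0.0), (0.1, 0.2)]
labels = [0, 0, 0, 0, 1, 1]

# Under the identity metric, local neighborhoods are already class-pure.
A_identity = [[1.0, 0.0], [0.0, 1.0]]
local_prediction = knn_predict((-2.9, 0.05), points, labels, A_identity, k=2)

# A metric that ignores the first coordinate makes the two class-0 modes
# coincide (global compactness), but it also erases the gap to class 1.
A_collapse = [[0.0, 0.0], [0.0, 1.0]]
same_class_gap = mahalanobis_distance(points[0], points[2], A_collapse)
cross_class_gap = mahalanobis_distance(points[0], points[4], A_collapse)
```

In this toy setting, forcing the two modes of class 0 together also removes the separation from class 1, whereas nearest-neighbor decisions only need nearby points to share a label. That is the intuition behind optimizing compactness and separability locally rather than globally.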