A Comparative Study on Distance Measuring Approaches for Clustering

Shraddha Pandit, Suchita Gupta
Published Date: December 30, 2011
Volume 2, Issue 1, pp. 29-31

Keywords: clustering, distance measure, clustering algorithms
Shraddha Pandit, Suchita Gupta, "A Comparative Study on Distance Measuring Approaches for Clustering", International Journal of Research in Computer Science, 2 (1): pp. 29-31, December 2011. doi:10.7815/ijorcs.21.2011.011


Clustering plays a vital role in research areas such as data mining, image retrieval, and bio-computing, among many others. The distance measure plays an important role in clustering data points, and choosing the right distance measure for a given dataset is a major challenge. In this paper, we study various distance measures and their effect on different clustering algorithms. We survey existing distance measures for clustering and present a comparison between them based on application domain, efficiency, benefits, and drawbacks. This comparison helps researchers decide quickly which distance measure to use for clustering. We conclude by identifying trends and challenges in research and development on clustering.
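
To illustrate why the choice of distance measure matters, the following is a minimal sketch and not code from the paper. It assumes three widely used measures (Euclidean, Manhattan, and Jaccard); the abstract does not state which measures the paper compares. The function names and the sample point and centroids are hypothetical. The sketch shows the same point being assigned to different centroids depending on the measure used.

```python
import math

def euclidean(a, b):
    # Straight-line (L2) distance between two numeric vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # City-block (L1) distance: sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def jaccard_distance(s, t):
    # 1 minus the Jaccard index of two sets; two empty sets count as identical.
    s, t = set(s), set(t)
    if not s and not t:
        return 0.0
    return 1.0 - len(s & t) / len(s | t)

def nearest_centroid(point, centroids, distance):
    # Index of the centroid closest to `point` under the given distance function.
    return min(range(len(centroids)), key=lambda i: distance(point, centroids[i]))

# Hypothetical data: one point and two candidate cluster centroids.
point = (1, 1)
centroids = [(4, 4), (1, 6)]

by_euclidean = nearest_centroid(point, centroids, euclidean)
by_manhattan = nearest_centroid(point, centroids, manhattan)
print(by_euclidean, by_manhattan)
```

Under these assumptions the Euclidean distance places the point with the first centroid (about 4.24 versus 5), while the Manhattan distance places it with the second (6 versus 5). The cluster assignment can therefore change with the measure alone. Jaccard distance applies to set-valued data such as keyword sets, not to coordinate vectors.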

