Quality of Cluster Index Based on Study of Decision Tree

B. Rajasekhar, B. Sunil Kumar, Rajesh Vibhudi, B.V. Rama Krishna
Published Date: December 30, 2011
Volume 2, Issue 1
Pages: 39-43

Keywords: clustering, classification, decision tree, k-means
B. Rajasekhar, B. Sunil Kumar, Rajesh Vibhudi, B.V. Rama Krishna, "Quality of Cluster Index Based on Study of Decision Tree". International Journal of Research in Computer Science, 2 (1): pp. 39-43, December 2011. doi:10.7815/ijorcs.21.2011.013


The quality of a clustering is an important issue when applying clustering techniques. Most traditional cluster validity indices are geometry-based measures of cluster quality. This work proposes a cluster validity index based on the decision-theoretic rough set model, which takes various loss functions into account. Experiments on real-time retail data show the usefulness of the proposed validity index for evaluating both rough and crisp clustering. The measure is shown to help determine the optimal number of clusters, as well as an important parameter in rough clustering called the threshold. Experiments with a promotional campaign for the retail data illustrate the ability of the proposed measure to incorporate financial considerations when evaluating the quality of a clustering scheme. This ability to deal with monetary values distinguishes the proposed decision-theoretic measure from distance-based measures. The proposed validity index can also be used to evaluate other clustering algorithms, such as fuzzy clustering.
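To make the idea of a loss-based validity index concrete, the following is a minimal sketch and not the authors' formulation. It assumes that membership probabilities are obtained from normalized inverse distances to the cluster centroids, and that the risk of a clustering is the sum of the expected losses of its assignments. The function names, the toy data, and the monetary loss values are all hypothetical.

```python
import math


def membership_probabilities(point, centroids):
    """Assumed model: P(cluster j | point) is proportional to 1 / distance."""
    dists = [math.dist(point, c) for c in centroids]
    # A point lying exactly on a centroid belongs to it with probability 1.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    sims = [1.0 / d for d in dists]
    total = sum(sims)
    return [s / total for s in sims]


def assignment_risk(probs, assigned, loss):
    """Expected loss of assigning a point to cluster `assigned`.

    loss[i][j] is the cost of choosing cluster i when cluster j is correct.
    """
    return sum(loss[assigned][j] * p for j, p in enumerate(probs))


def clustering_risk(points, labels, centroids, loss):
    """Total expected loss of a crisp clustering (lower is better)."""
    return sum(
        assignment_risk(membership_probabilities(x, centroids), k, loss)
        for x, k in zip(points, labels)
    )


# Hypothetical toy data: two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
centroids = [(0.0, 0.5), (10.0, 0.5)]
good_labels = [0, 0, 1, 1]
bad_labels = [1, 1, 0, 0]

# Zero-one loss: every wrong assignment costs 1.
zero_one = [[0.0, 1.0], [1.0, 0.0]]
# Hypothetical monetary loss: wrongly choosing cluster 0 costs 5 units.
monetary = [[0.0, 5.0], [1.0, 0.0]]

risk_good = clustering_risk(points, good_labels, centroids, zero_one)
risk_bad = clustering_risk(points, bad_labels, centroids, zero_one)
risk_money = clustering_risk(points, good_labels, centroids, monetary)
```

Under these assumptions, comparing `clustering_risk` across candidate clusterings (for example, across different numbers of clusters) would favor the scheme with the lowest expected loss, and replacing `zero_one` with a cost matrix such as `monetary` is how financial considerations could enter the evaluation.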

