A PSO-Based Subtractive Data Clustering Algorithm

Mariam El-Tarabily, Rehab Abdel-Kader, Mahmoud Marie, Gamal Abdel-Azeem
Published: March 05, 2013
Volume 3, Issue 2
Pages 1-9

Keywords: data clustering, subtractive clustering, particle swarm optimization, subtractive algorithm, hybrid algorithm
Mariam El-Tarabily, Rehab Abdel-Kader, Mahmoud Marie, Gamal Abdel-Azeem, "A PSO-Based Subtractive Data Clustering Algorithm". International Journal of Research in Computer Science, 3 (2): pp. 1-9, March 2013. doi:10.7815/ijorcs.32.2013.060


There has been a tremendous proliferation in the amount of information available on the largest shared information source, the World Wide Web. Fast, high-quality clustering algorithms play an important role in helping users navigate, summarize, and organize this information effectively. Recent studies have shown that partitional clustering algorithms such as k-means are the most popular algorithms for clustering large datasets. The major problem with partitional clustering algorithms is that they are sensitive to the selection of the initial partition and are prone to premature convergence to local optima. Subtractive clustering is a fast, one-pass algorithm for estimating the number of clusters and the cluster centers for a given set of data. These estimates can be used to initialize iterative optimization-based clustering methods and model identification methods. In this paper, we present a hybrid of subtractive clustering and Particle Swarm Optimization (PSO), the Subtractive + (PSO) clustering algorithm, which performs fast clustering. For comparison, we applied the Subtractive + (PSO) clustering algorithm, PSO, and the Subtractive clustering algorithm to three different datasets. The results show that the Subtractive + (PSO) clustering algorithm generates the most compact clusters of the three algorithms.
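The idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the parameter values (r_a, eps, w, c1, c2), the simplified stopping rule, and the fitness measure (mean distance from each point to its nearest center) are all assumptions chosen for illustration. The sketch first estimates centers with a subtractive pass, then seeds one PSO particle with that estimate and lets the swarm refine it.

```python
import math
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def subtractive_centers(points, r_a=0.5, eps=0.15):
    """Estimate cluster centers from point potentials (simplified stop rule)."""
    alpha = 4.0 / r_a ** 2
    beta = 4.0 / (1.5 * r_a) ** 2  # assumed penalty radius r_b = 1.5 * r_a
    potential = [sum(math.exp(-alpha * sq_dist(p, q)) for q in points)
                 for p in points]
    first_peak = max(potential)
    centers = []
    while len(centers) < len(points):
        k = max(range(len(points)), key=potential.__getitem__)
        peak = potential[k]
        if peak < eps * first_peak:  # remaining potential too low: stop
            break
        centers.append(list(points[k]))
        # Subtract the new center's influence from every point's potential.
        potential = [p - peak * math.exp(-beta * sq_dist(x, points[k]))
                     for p, x in zip(potential, points)]
    return centers

def fitness(centers, points):
    """Mean distance from each point to its nearest center (lower is better)."""
    return sum(min(math.sqrt(sq_dist(p, c)) for c in centers)
               for p in points) / len(points)

def pso_refine(points, seed_centers, n_particles=10, iters=50,
               w=0.72, c1=1.49, c2=1.49, rng=None):
    """Refine seed_centers with a basic global-best PSO."""
    rng = rng or random.Random(0)
    k, dim = len(seed_centers), len(seed_centers[0])

    def unflat(x):
        return [x[i * dim:(i + 1) * dim] for i in range(k)]

    # Particle 0 is the subtractive estimate; the rest are k random data points.
    swarm = [[v for c in seed_centers for v in c]]
    swarm += [[v for p in rng.sample(points, k) for v in p]
              for _ in range(n_particles - 1)]
    vel = [[0.0] * (k * dim) for _ in swarm]
    pbest = [x[:] for x in swarm]
    pbest_f = [fitness(unflat(x), points) for x in swarm]
    g = min(range(len(swarm)), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(iters):
        for i, x in enumerate(swarm):
            for d in range(k * dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - x[d])
                             + c2 * rng.random() * (gbest[d] - x[d]))
                x[d] += vel[i][d]
            f = fitness(unflat(x), points)
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = x[:], f
                if f < gbest_f:
                    gbest, gbest_f = x[:], f
    return unflat(gbest), gbest_f

# Hypothetical toy data: two tight groups in the unit square.
points = [(0.10, 0.10), (0.12, 0.08), (0.08, 0.12), (0.11, 0.13), (0.09, 0.09),
          (0.90, 0.90), (0.88, 0.92), (0.92, 0.88), (0.91, 0.87), (0.89, 0.91)]
centers = subtractive_centers(points)
refined, refined_fit = pso_refine(points, centers)
print(len(centers), fitness(centers, points), refined_fit)
```

Because the subtractive estimate is included in the initial swarm, the refined fitness in this sketch can never be worse than that of the seed centers. The data are assumed to be scaled to the unit hypercube so that a single radius r_a is meaningful in every dimension.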

  • Ghorpade-Aher, Jayshree, and Vishakha Arun Metre. "PSO based Multidimensional Data Clustering: A Survey." dimensions 87.16 (2014).
  • Ghorpade-Aher, Jayshree, and Vishakha A. Metre. "Clustering Multidimensional Data with PSO based Algorithm." arXiv preprint arXiv:1402.6428 (2014).
  • Ghorpade-Aher, Jayshree, and Vishakha A. Metre. "Data clustering using an advanced PSO variant." India Conference (INDICON), 2014 Annual IEEE. IEEE, 2014.
  • Ghorpade-Aher, Jayshree, and Roshan Bagdiya. "A Review on Clustering Web data using PSO." International Journal of Computer Applications 108.6 (2014).
  • Sethi, Chetna, and Garima Mishra, "A Linear PCA based hybrid K-Means PSO algorithm for clustering large dataset", International Journal of Scientific & Engineering Research, Volume 4, Issue 6, June-2013.
  • Poonam, Ms, Ms Neelam Oberoi, and Ambala Saddopur. "EMPIRICAL EVALUATION OF BBBC AND PSO ALGORITHM FOR DATA CLUSTERING", International Journal of Computing and Corporate Research, Vol 4, Issue 1, January 2014.