Hybrid Clustering Algorithm ‘KCu’ for Combining the Features of K-Means and CURE Algorithm for Efficient Outliers Handling


B. Renuka Devi, S. Pallam Setty

Department of CSE, Vignan’s Nirula Institute of Technology & Science for Women, Guntur 522005, Andhra Pradesh, India

Department of CS & SE, College of Engineering, Andhra University, Andhra Pradesh 530003, India

26 April 2018
2 June 2018



The volume of data is growing steadily. According to the International Data Corporation (IDC), the volume of Big Data is expected to reach up to 40 ZB by the year 2020. Big Data technologies have become prevalent for processing, storing, and managing huge volumes of data. Clustering large datasets has become a challenging problem in Big Data analysis, because conventional clustering algorithms are difficult to apply as the volume of data keeps increasing. In this manuscript we propose a new hybrid clustering algorithm, KCu, which combines the features of the K-Means and CURE clustering algorithms. The proposed algorithm first applies k-means to the data set and then applies CURE to the clusters that k-means produces. We evaluated KCu experimentally and show that it gives accurate results in comparison with k-means and CURE, which we attribute to the CURE stage. CURE can handle outliers and can find non-spherical clusters, two areas in which other clustering algorithms are weak.
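The two-stage idea described in the abstract (k-means first, then CURE on the resulting clusters) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not say how the two stages are connected, so the sketch assumes that k-means over-partitions the data into `k_init` clusters and that a CURE-style agglomerative step then merges them down to `k_final`. The function names, the farthest-first seeding (used only to keep the sketch deterministic), the number of representatives `n_rep`, and the shrink factor `alpha` are all hypothetical choices. CURE's sampling, partitioning, and outlier-elimination phases are omitted.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def kmeans(points, k, iters=50):
    """Lloyd's k-means with farthest-first seeding (an assumption, for determinism)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return [c for c in clusters if c]


def representatives(cluster, n_rep=4, alpha=0.3):
    """CURE-style representatives: scattered points shrunk toward the centroid."""
    c = centroid(cluster)
    chosen = [max(cluster, key=lambda p: math.dist(p, c))]
    while len(chosen) < min(n_rep, len(cluster)):
        chosen.append(max(cluster, key=lambda p: min(math.dist(p, q) for q in chosen)))
    # Shrinking by alpha damps the influence of points far from the centroid.
    return [tuple(x + alpha * (m - x) for x, m in zip(p, c)) for p in chosen]


def cluster_distance(a, b):
    """Minimum distance between the representative sets of two clusters."""
    reps_a, reps_b = representatives(a), representatives(b)
    return min(math.dist(p, q) for p in reps_a for q in reps_b)


def kcu_sketch(points, k_init, k_final):
    """Stage 1: k-means into k_init clusters. Stage 2: CURE-style merging to k_final."""
    clusters = kmeans(points, k_init)
    while len(clusters) > k_final:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters
```

For example, calling `kcu_sketch(points, k_init=4, k_final=2)` on two well-separated groups of 2-D tuples first splits the data into up to four k-means clusters and then merges the closest pairs until two remain. Recomputing the representatives for every pair keeps the sketch short but is quadratic in the number of clusters per merge; a practical implementation would cache them.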


big data, clustering, partitioning, hierarchical, k-means, CURE, hybrid algorithm

1. Introduction
2. Related Work
3. Clustering Techniques
4. Hybrid Clustering Method
5. Results
6. Conclusion
