Recognizing Fake Headlines Using Clustering Algorithms


  • Juthuka Arunadevi, A. Mary Sowjanya



The credibility of news sources hit a new low during the COVID-19 pandemic, so it is necessary to check facts before trusting the news. Clustering is important for analysing data, making predictions, and handling data abnormalities. In this work, the two most prominent clustering algorithms, K-Means and K-Medoids, are tested on a dataset of news headlines, and K-Means outperforms K-Medoids. We then trained supervised classification methods (Logistic Regression, K-Nearest Neighbours, and a Support Vector Classifier) on the same news headlines used for clustering, with the 'Prediction' column as the target, and chose the technique with the highest accuracy. The Support Vector Classifier achieved the highest accuracy in our tests, 94.93 percent. The result is a hybrid model consisting of an unsupervised K-Means clustering algorithm and a supervised Support Vector classification algorithm. The K-Means algorithm organizes the news headlines into clusters by capturing the usage of certain words, and the Support Vector algorithm learns from those clusters to predict the categories to which unseen news headlines belong.
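
The hybrid idea described above (cluster the headlines first, then train a classifier on the resulting cluster labels) can be illustrated with a minimal, dependency-free Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the six toy headlines are invented, the features are plain word counts, K-Means is initialised deterministically from the first k headlines, and a K-Nearest Neighbours vote stands in for the Support Vector Classifier so that the example needs no external library.

```python
from collections import Counter

# Toy headlines invented for illustration; they are NOT from the paper's dataset.
headlines = [
    "vaccine trial shows strong results",
    "miracle cure doctors hate secret",
    "vaccine trial results published today",
    "secret miracle cure revealed",
    "new vaccine trial begins",
    "doctors hate this miracle secret",
]

# Vocabulary built from the training headlines only.
vocab = sorted({word for h in headlines for word in h.lower().split()})

def vectorize(headline):
    """Bag-of-words count vector; words outside the vocabulary are ignored."""
    counts = Counter(headline.lower().split())
    return [float(counts[word]) for word in vocab]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, iterations=20):
    """Plain K-Means. Assumption: the first k points seed the centroids,
    which keeps the sketch deterministic (real runs usually seed randomly)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def knn_predict(x, train_points, train_labels, k=3):
    """Majority vote among the k nearest training points.
    Stand-in for the Support Vector Classifier named in the abstract."""
    nearest = sorted(range(len(train_points)),
                     key=lambda i: sq_dist(x, train_points[i]))[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Step 1 (unsupervised): cluster the headlines by word usage.
train_vectors = [vectorize(h) for h in headlines]
cluster_labels, _ = k_means(train_vectors, k=2)

# Step 2 (supervised): treat cluster ids as the target column and classify
# headlines that were not part of the clustering step.
unseen = ["vaccine trial results delayed", "miracle cure secret exposed"]
predictions = [knn_predict(vectorize(h), train_vectors, cluster_labels)
               for h in unseen]

print(cluster_labels)  # cluster id per training headline
print(predictions)     # predicted cluster id per unseen headline
```

The cluster ids carry no meaning on their own; deciding which cluster corresponds to which category of headline still requires inspecting the members of each cluster. In practice, library implementations such as scikit-learn's `KMeans` and `SVC` would replace the hand-written functions here.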




How to Cite

A. Mary Sowjanya, J. A. (2022). Recognizing Fake Headlines Using Clustering Algorithms. Mathematical Statistician and Engineering Applications, 71(2), 111 –.