Fake News Detection Using Topic Modelling Coupled With Deep Learning Techniques

Abstract

Fake news propagated through online social media platforms such as Twitter and Facebook has become increasingly rampant and harmful, affecting several areas of life including communities, election outcomes and even public opinion. Researchers all over the world have been trying to combat fake news through rapid identification and by mitigating its further spread, to either avoid or limit the expected damage. Most of the existing methods make use of natural language processing (NLP), machine learning and deep learning.

In the present study, four different models have been developed for fake news detection, making use of natural language processing, supervised and unsupervised machine learning, topic modelling and optimization.

Model 1: Supervised hybrid model, KNN with PCA
In this model, data extracted through web scraping is preprocessed to transform the raw data into a usable format through steps such as cleaning, removal of common words, stemming, lemmatization and tokenization. TF-IDF is used to weight word frequencies and identify which words are more significant. K-Nearest Neighbors (KNN), Principal Component Analysis (PCA) and KNN+PCA hybrid classifiers were used to identify fake news. Of these three, the KNN+PCA hybrid model showed the highest accuracy.

Most of the reported methods for fake news detection are based on supervised learning, which consumes more time and requires considerable human involvement for pre-annotation of the data sets used to train classification models. Compared to supervised learning, fake news detection methods based on unsupervised learning are therefore more useful in terms of low cost and user-friendliness. Hence, efforts have been made to develop techniques based on unsupervised models.
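The Model 1 pipeline described above (TF-IDF features, PCA for dimensionality reduction, then KNN classification) could be sketched roughly as follows. This is only an illustrative sketch using scikit-learn, not the thesis's actual implementation: the toy headlines, the labels, the choice of 2 principal components and 3 neighbours are all assumptions made for the example, and stemming and lemmatization are omitted for brevity.

```python
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data, invented purely for illustration.
train_texts = [
    "aliens secretly control the election results",
    "miracle pill cures every disease overnight",
    "celebrity clone spotted voting twice",
    "city council approves new budget for schools",
    "central bank keeps interest rates unchanged",
    "local team wins regional football championship",
]
train_labels = ["fake", "fake", "fake", "real", "real", "real"]

# TF-IDF weighting; lowercasing and tokenization are built in, and
# stop_words="english" removes common words.
vectorizer = TfidfVectorizer(stop_words="english")
X_tfidf = vectorizer.fit_transform(train_texts).toarray()

# PCA projects the TF-IDF vectors onto a few principal components
# (2 is an arbitrary choice for this tiny example).
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_tfidf)

# KNN classifies each article by majority vote of its nearest neighbours
# in the reduced space.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_reduced, train_labels)


def predict(texts):
    """Apply the fitted TF-IDF -> PCA -> KNN steps to new texts."""
    features = vectorizer.transform(texts).toarray()
    return knn.predict(pca.transform(features))


predictions = predict([
    "miracle pill lets aliens win the election",
    "council approves budget for new football stadium",
])
print(list(predictions))
```

Fitting the vectorizer and PCA on the training texts only, and reusing them unchanged for new texts, keeps the evaluation free of information leaking from the test data.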

Model 2: Unsupervised hybrid model, K-means with SVM
Since real data do not carry labelled information, the unsupervised techniques k-means and k-medoids were used. TF-IDF is used for feature extraction and PCA is used
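The unsupervised part of Model 2 that is visible in this abstract (TF-IDF features clustered with k-means, without labels) might look roughly like the sketch below. It assumes scikit-learn and made-up toy texts; the number of clusters (2) is an assumption, k-medoids is left out because it is not part of core scikit-learn, and the roles of PCA and SVM are not shown because the abstract text is cut off before describing them.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical unlabelled texts, invented purely for illustration.
texts = [
    "aliens secretly control the election results",
    "miracle pill cures every disease overnight",
    "city council approves new budget for schools",
    "central bank keeps interest rates unchanged",
]

# TF-IDF feature extraction, as named in the abstract.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(texts)

# k-means groups the articles into 2 clusters without using any labels.
# Cluster ids are arbitrary: which id corresponds to "fake" or "real"
# would still have to be determined separately.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

for text, cluster in zip(texts, cluster_ids):
    print(cluster, text)
```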
