Fake News Detection Using Topic Modelling Coupled With Deep Learning Techniques
Abstract
Fake news propagated through online social media platforms such as Twitter and
Facebook has become increasingly rampant and harmful, affecting several areas of
life, including communities, election outcomes and public opinion. Researchers
all over the world have been trying to combat fake news through rapid
identification and by mitigating its further spread, to either avoid or limit the
expected damage. Most existing methods make use of natural language processing
(NLP), machine learning and deep learning.

In the present study, four different models have been developed for fake news
detection, making use of natural language processing, supervised and unsupervised
machine learning, topic modelling and optimization.
Model 1: Supervised Hybrid model KNN with PCA
In this model, data extracted through web scraping is preprocessed to transform
raw data into a usable format through steps such as cleaning, removal of common
words, stemming, lemmatization and tokenization. TF-IDF is used to weight word
frequencies and identify which words are more significant. K-Nearest Neighbors (KNN),
Principal Component Analysis (PCA) and KNN+PCA hybrid classifiers were used to
identify fake news. Of these three, the KNN+PCA hybrid model showed the highest accuracy.
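The TF-IDF weighting and KNN classification steps described for Model 1 can be illustrated with a minimal sketch. This is not the study's implementation: the stop-word list, the toy headlines, the labels, the cosine-similarity metric and every function name below are assumptions made for illustration. Stemming, lemmatization and the PCA step are omitted to keep the example dependent on the standard library only.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the study's actual list is not given.
STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of", "and", "in", "on", "for"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def fit_idf(docs_tokens):
    """Compute idf(term) = log(N / df) over the training documents."""
    n_docs = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    return {term: math.log(n_docs / count) for term, count in df.items()}


def vectorize(tokens, idf):
    """Build a sparse {term: tf * idf} vector; unseen terms are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values()) or 1
    return {t: (c / total) * idf[t] for t, c in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_predict(query, train_vectors, train_labels, k=3):
    """Return the majority label among the k most similar training vectors."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, invented headlines and labels (illustrative only, not the study's data).
train_texts = [
    "Shocking miracle cure doctors hate revealed",
    "Miracle pill melts fat overnight shocking secret",
    "Secret miracle trick banks hate revealed",
    "Council approves budget for road repairs",
    "Parliament passes budget after committee review",
    "Committee publishes report on road safety",
]
train_labels = ["fake", "fake", "fake", "real", "real", "real"]

train_tokens = [tokenize(text) for text in train_texts]
idf = fit_idf(train_tokens)
train_vectors = [vectorize(tokens, idf) for tokens in train_tokens]


def classify(text, k=3):
    """Preprocess a headline, weight it with TF-IDF, and classify it with KNN."""
    return knn_predict(vectorize(tokenize(text), idf), train_vectors, train_labels, k)
```

Calling `classify("Shocking miracle secret revealed")` compares the headline only against training headlines that share weighted terms with it. In a fuller pipeline, a dimensionality-reduction step such as PCA would sit between `vectorize` and `knn_predict`.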
Most of the reported methods for fake news detection are based on supervised
learning, which is time-consuming and requires considerable human involvement to
pre-annotate data sets for training classification models. Compared to supervised
learning, fake news detection methods based on unsupervised learning would therefore
be more useful, given their lower cost and user-friendly nature. Hence,
efforts have been made to develop techniques based on unsupervised models.
Model 2: Unsupervised Hybrid model K-means with SVM

Since real data do not carry labelled information, the unsupervised techniques
k-means and k-medoids were used. TF-IDF is used for feature extraction and PCA is used
An