Unprejudiced Stemming Approach for Disambiguation of Social Media Corpora to Improve the Accuracy of Sentiment Analysis using Machine Learning
Abstract
Big Data Analytics has emerged as a decision-centric approach for organizations
to uncover hidden patterns, correlations, market trends, and customer behavior.
Web 2.0 textual data is one of the most popular sources of big data, and Web 2.0
technologies generate huge social corpora from our daily lives. Natural Language
Processing plays a vital role in Web 2.0 technology applications such as Internet
business intelligence, reputation management, Sentiment Analysis, and opinion
mining. Natural Language Processing and Machine Learning are subfields of
Artificial Intelligence, which work together to solve big data analytical problems.
Natural Language Processing and Machine Learning can understand and analyze
the natural language corpora on Social Media Networks and provide actionable
data intelligence. Sentiment Analysis uses Natural Language Processing and
Machine Learning to extract insights from the social corpora of a company,
business, service organization, or government agency to improve the quality of
products, customer service, media perceptions, marketing strategies, sales,
customer retention, management reputation, trend analysis, new business
opportunities, and crisis management. According to the sentiment analysis survey, the challenges include bipolar, NLP overheads,
domain dependence, negation, huge lexicon, world knowledge,
feature extraction, and spam and fake content. Among these, the challenges of bipolar and NLP overheads
have the lowest analytical accuracy. Social Big Data contains
Homographs and Morphological ambiguities, which are the root cause of bipolar
and NLP overheads. The present research focuses on the data preparation phase
of Sentiment Analysis, aiming to improve its accuracy by disambiguating Homographs and Morphological terms and classifying the
Demographic attributes of the corpora. To disambiguate Homograph terms in social media corpora, we implemented
a Machine Learning-based Homograph Disambiguation algorithm and
achieved state-of-the-art accuracy levels.
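
As a rough illustration of what machine-learning-based homograph disambiguation can look like, the sketch below trains a small Naive Bayes classifier on the context words surrounding a homograph. It is not the algorithm developed in this work, which the abstract does not specify. The choice of Naive Bayes, the target word "fine", the sense labels, and the training sentences are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy labelled contexts for the homograph "fine".
# Hypothetical data for illustration; not taken from the thesis.
TRAIN = [
    ("the service was fine and the staff were friendly", "good"),
    ("food was fine really enjoyed the evening", "good"),
    ("i feel fine about the new update", "good"),
    ("got a parking fine outside the store", "penalty"),
    ("they charged a late fine on my account", "penalty"),
    ("had to pay a fine for the late return", "penalty"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train(examples, target):
    """Count context words per sense, excluding the target homograph itself."""
    word_counts = defaultdict(Counter)
    sense_counts = Counter()
    vocab = set()
    for text, sense in examples:
        context = [w for w in tokenize(text) if w != target]
        word_counts[sense].update(context)
        sense_counts[sense] += 1
        vocab.update(context)
    return word_counts, sense_counts, vocab


def disambiguate(text, target, model):
    """Return the most probable sense of `target` given its context words."""
    word_counts, sense_counts, vocab = model
    total = sum(sense_counts.values())
    # Words never seen in training carry no evidence, so they are skipped.
    context = [w for w in tokenize(text) if w != target and w in vocab]
    best_sense, best_score = None, -math.inf
    for sense in sorted(sense_counts):
        # Log prior plus Laplace-smoothed log likelihood of each context word.
        score = math.log(sense_counts[sense] / total)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context:
            score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


model = train(TRAIN, "fine")
print(disambiguate("had to pay a late fine", "fine", model))
print(disambiguate("the staff were friendly and the food was fine", "fine", model))
```

In a sentiment pipeline, a step of this kind would run during data preparation so that the polarity assigned to a homograph depends on its resolved sense rather than on its surface form.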