Study and analysis of document mining Using optimization techniques

Files

01_title.pdf (28.6 KB)

02_prelim pages.pdf (489.73 KB)

03_content.pdf (14.42 KB)

04_abstract.pdf (9.73 KB)

05_chapter 1.pdf (326.95 KB)

Abstract

In this digital era, the internet is an important medium for communication. Every day, internet users generate a vast amount of data in the WWW repository. They contribute data in the form of text such as emails, tweets, product and movie reviews, discussion text, chat, and personal or technical blogs. Finding knowledge in such a vast data pool is a challenging task. Document mining techniques such as text summarization, topic modeling, text clustering, text feature selection, text classification, and sentiment analysis are used to manage and retrieve the needed information from an unstructured text corpus. This research work enhances document classification and document clustering techniques by using the Jaya optimization algorithm. The work is divided into two phases.

In the first phase, the proposed work deploys a novel hybrid feature selection method based on the binary Jaya optimization algorithm to obtain an appropriate subset of optimal features for the document classification problem. Feature selection plays a vital role in reducing the high dimensionality of the feature space in text document classification. Reducing the dimension of the feature space lowers the computation cost and improves text classification accuracy. Hence, identifying a proper subset of the significant features of the text corpus is needed to classify the data in less computational time and with higher accuracy. This work introduces a new hybrid feature selection method based on the normalized difference measure and the binary Jaya optimization algorithm to obtain this subset from the text corpus. The error rate is used as the objective function to be minimized when measuring the fitness of a solution. The selected optimal feature subsets are evaluated using Naive Bayes and Support Vector Machine classifiers on various popular benchmark text corpus datasets.
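To illustrate the wrapper stage described in the abstract, the following is a minimal sketch of binary Jaya feature selection with the error rate as the objective to minimize. It is not the thesis implementation. The function names `error_rate` and `binary_jaya`, the population size, the iteration count, and the 0.5 threshold used to turn the Jaya move into a bit are all assumptions. A nearest-centroid classifier scored on the training data stands in for the Naive Bayes and Support Vector Machine evaluation, and the normalized difference measure filter stage is omitted.

```python
import random


def error_rate(mask, X, y):
    """Training error of a nearest-centroid classifier on the selected columns.

    Assumption: a simple stand-in for the Naive Bayes / SVM evaluation.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 1.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    errors = 0
    for x, t in zip(X, y):
        pred = min(
            centroids,
            key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c])),
        )
        errors += pred != t
    return errors / len(X)


def binary_jaya(X, y, pop_size=10, iterations=30, seed=0):
    """Search for a 0/1 feature mask that minimizes error_rate."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    fit = [error_rate(p, X, y) for p in pop]
    for _ in range(iterations):
        best = pop[fit.index(min(fit))]
        worst = pop[fit.index(max(fit))]
        for i, x in enumerate(pop):
            cand = []
            for j in range(n):
                r1, r2 = rng.random(), rng.random()
                # Jaya move: toward the best solution, away from the worst.
                v = x[j] + r1 * (best[j] - x[j]) - r2 * (worst[j] - x[j])
                # Assumed binarization: threshold the continuous value at 0.5.
                cand.append(1 if v >= 0.5 else 0)
            f = error_rate(cand, X, y)
            if f < fit[i]:  # keep the candidate only if it improves
                pop[i], fit[i] = cand, f
    k = fit.index(min(fit))
    return pop[k], fit[k]
```

Calling `binary_jaya(X, y)` on a list of term-count rows `X` and labels `y` returns the best mask found and its error rate. Because a candidate replaces a solution only when it lowers the error, the best fitness in the population never gets worse from one iteration to the next.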

URI

http://hdl.handle.net/10603/455118

Collections

Faculty of Information and Communication Engineering
