Efficient Unsupervised Learning Technique Based Automatic Text Categorization

Abstract

Automatic text categorization can play an important role in a wide variety of more flexible, dynamic, and personalized information management tasks, such as real-time sorting of email or files into folder hierarchies; topic identification to support topic-specific processing operations; structured search and/or browsing; or finding documents that match long-term standing interests or more dynamic task-based interests.

In many contexts, textual information is a major form of communication on the World Wide Web, and trained professionals are employed to categorize new material. This process is very time consuming and costly, which limits its applicability. Consequently, there is increased interest in developing technologies for automatic text categorization.

The main focus of this research work is to study the problem of automatic text categorization and to develop an efficient text categorization mechanism based on unsupervised learning. In this thesis, an attempt is made to overcome the challenges of various classifiers in terms of learning speed, real-time classification speed, and accuracy. Three new algorithms are implemented, and the results are analyzed to assess their performance on two different types of datasets, DS0 and DS1 (20-Newsgroups and Reuters-21578 WebPages). The proposed algorithms are evaluated on different combinations of classifiers (Naïve Bayes and J48) and datasets (DS0 and DS1).

The first algorithm describes a novel unsupervised learning based approach that uses frequent item (term) sets for text clustering to drastically reduce the dimensionality of the data. Throughout the performance analysis, it provides better classification accuracy than direct classification.
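The frequent term set idea mentioned above can be illustrated with a small sketch. This is not the thesis's algorithm, which the abstract does not specify; it is a generic level-wise (Apriori-style) search, and the function names `frequent_term_sets` and `reduced_vocabulary`, the whitespace tokenization, and the `min_support` and `max_size` parameters are all assumptions made for illustration. It shows how keeping only terms that occur in frequent term sets shrinks the vocabulary, and therefore the dimensionality, before any clustering or classification step.

```python
from collections import Counter
from itertools import combinations


def frequent_term_sets(docs, min_support, max_size=2):
    """Return {frozenset(terms): document count} for every term set that
    occurs in at least `min_support` documents (hypothetical helper)."""
    # Assumption: naive lowercase whitespace tokenization, one set per document.
    doc_sets = [set(d.lower().split()) for d in docs]

    # Level 1: document frequency of single terms.
    counts = Counter(t for s in doc_sets for t in s)
    current = {frozenset([t]): c for t, c in counts.items() if c >= min_support}
    result = dict(current)

    size = 2
    while current and size <= max_size:
        # Only terms that survived the previous level can form larger sets.
        allowed = set().union(*current)
        counts = Counter()
        for s in doc_sets:
            kept = sorted(s & allowed)
            for combo in combinations(kept, size):
                counts[frozenset(combo)] += 1
        current = {k: c for k, c in counts.items() if c >= min_support}
        result.update(current)
        size += 1
    return result


def reduced_vocabulary(term_sets):
    """Terms that appear in at least one frequent term set (hypothetical helper)."""
    return sorted(set().union(*term_sets)) if term_sets else []


if __name__ == "__main__":
    docs = [
        "stock market prices rise",
        "stock market falls",
        "football match result",
        "football match tonight",
    ]
    sets_found = frequent_term_sets(docs, min_support=2)
    # The nine distinct terms in the toy corpus shrink to the four frequent ones.
    print(reduced_vocabulary(sets_found))
```

In this toy corpus the two frequent pairs also hint at two natural groups of documents, which is the intuition behind using frequent term sets as a basis for clustering.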
