Text representations and evolutionary based Intelligent information retrieval
Abstract
The rapid adoption of digitization in recent years has made a large number of
documents easily accessible; however, digging through such voluminous data to
deliver timely and relevant information for user-specific decision-making needs is
a complex and time-consuming process. The exponential surge in knowledge sources
has made it harder to understand the meaning of information well enough to
retrieve appropriate results from large datasets. In this thesis, we revisit the
information retrieval (IR) task using recent technological breakthroughs and propose
three different IR frameworks to handle query and document representation for relevant
information retrieval. The research investigates the existing state-of-the-art
methodologies in the domain of information retrieval and proposes a swarm-optimized
cluster-based framework, a phrase embedding-based query expansion framework, and a
transformer-based deep semantic representation framework to retrieve information
effectively and efficiently.
Decomposing large datasets into small groups enables systems to gain a deeper
understanding of the context of information in less time. This emphasizes the need for
topical search strategies that categorize data before searching. Unsupervised
techniques such as clustering can be used to decompose unstructured
and unlabeled data when the context is unclear. Dividing large datasets into small
groups allows for faster and more accurate access to information. Therefore, we propose
a swarm-optimized cluster-based framework with frequent pattern mining techniques to
retrieve user-specific knowledge from extensive document collections. The preprocessing
task is divided into two sub-tasks, namely document clustering and frequent
pattern mining. The first applies the proposed bio-inspired K-Flock clustering algorithm
to decompose the document collection into small groups, and the second extracts frequent
patterns from the decomposed groups using a memory