Pattern based bootstrapping approaches for natural language processing of morphologically rich languages

Files

01_title.pdf (51.66 KB)

02_certificate.pdf (1.25 MB)

03_abstract.pdf (33.99 KB)

04_acknowledgement.pdf (21.53 KB)

05_content.pdf (67.8 KB)

Date

2015-02-17

Abstract

This thesis attempts to tackle Natural Language Processing (NLP) tasks by exploiting the special characteristics of morphologically rich languages. We use Tamil as an example to show how computational approaches to such languages need to be different. Our initial work used these special characteristics to build rule-based systems. However, as is the case with most rule-based systems, only the natural language sentences of a specific domain could be tackled. From our experience in building the rule-based systems, we were able to identify the linguistic features that could be used effectively for the NLP processing of morphologically rich languages.

To overcome the limitations of rule-based approaches, we next explored machine learning approaches. One of the common machine learning approaches used for languages such as English is supervised learning. Supervised approaches require a large annotated and labeled corpus, which is labor-intensive to build and is not available for resource-scarce languages such as Tamil. Unsupervised approaches, on the other hand, take a long time to converge to a solution. We first attempted an unsupervised approach for semantic relation extraction. From this experience, we found that the partially free word order of a morphologically rich language did not lend itself to fast convergence to a solution. In this context, we decided that semi-supervised approaches, which require only a limited number of training samples, could be attempted.

URI

http://hdl.handle.net/10603/35514

Collections

Faculty of Information and Communication Engineering
