Automatic Evaluation of Machine Translation Outputs

Abstract

Human knowledge is predominantly encoded in natural language, and automatic understanding of natural language has long been a central goal of AI/ML. Among the many challenges of natural language processing and understanding, Machine Translation (MT) is a major task: converting text in one natural language into another. With the advent of neural models, the field of neural MT (NMT) has made significant progress, yet effectively evaluating translation quality remains a critical challenge. This thesis focuses on the automatic evaluation of machine translation outputs, addressing the limitations of existing metrics and proposing new approaches to improve evaluation accuracy. Widely used metrics such as BLEU and METEOR are scrutinized for their reliance on n-gram overlap and limited linguistic insight; they often fail to capture nuances of language such as fluency, adequacy, and semantic equivalence. To mitigate these shortcomings, this research explores advanced evaluation methods that leverage pre-trained multilingual embeddings and deep learning to model linguistic phenomena more comprehensively. The thesis introduces several novel automatic evaluation approaches (unsupervised reference-based, unsupervised reference-free, and supervised) that integrate semantic similarity and contextual relevance by employing pre-trained language models. Through extensive experimentation, these metrics are benchmarked against traditional ones and demonstrate superior correlation with human judgments.

Additionally, this work surveys existing evaluation datasets and evaluation approaches, discusses their limitations, and provides decision trees to help researchers choose an evaluation criterion or metric based on the computational and linguistic resources available. Ultimately, this thesis contributes to the field by offering robust methodologies for MT output evaluation, emphasizing the need for metrics that capture meaning beyond surface-level overlap.
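The n-gram-overlap weakness described above can be made concrete with a minimal, illustrative sketch (pure Python; this is not one of the thesis's proposed metrics): clipped n-gram precision, the core quantity underlying BLEU-style scoring. A candidate that scrambles the reference's words can achieve perfect unigram precision while being disfluent and semantically broken, whereas higher-order n-grams penalize it — but still cannot distinguish a harmless paraphrase from a genuine error.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the list of n-grams in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision, the quantity behind BLEU-style metrics.

    Each candidate n-gram counts only up to the number of times it
    appears in the reference (the 'clipping' that stops a candidate
    from inflating the score by repeating one matching word).
    """
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
    total = sum(cand_counts.values())
    return clipped / total if total else 0.0

# Two candidates: one fluent paraphrase, one word-salad permutation.
reference = "the cat sat on the mat".split()
fluent    = "the cat sat on a mat".split()
scrambled = "mat the on sat cat the".split()

print(modified_precision(fluent, reference, 1))     # 5/6: penalized for "a" vs "the"
print(modified_precision(scrambled, reference, 1))  # 1.0: scrambling goes unnoticed
print(modified_precision(scrambled, reference, 2))  # 0.0: bigrams catch the disorder
```

The scrambled sentence scores a perfect 1.0 on unigram precision, higher than the fluent paraphrase — precisely the kind of failure to capture fluency and semantic equivalence that motivates the embedding-based metrics studied in this thesis.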
