Automatic Evaluation of Machine Translation Outputs
Abstract
Human knowledge is predominantly encoded in natural language. Automatic understanding of natural language has long been a central goal of AI/ML. Among the many challenges of natural language processing and understanding, Machine Translation (MT) stands out as a major task. MT is the process of converting source text in one natural language into another natural language. With the advent of neural models, the field of neural MT (NMT) has made significant progress, yet effectively evaluating translation quality remains a critical challenge. This thesis focuses on the automatic evaluation of machine translation outputs, addressing the limitations of existing metrics and proposing novel approaches to improve evaluation accuracy. Widely used metrics such as BLEU and METEOR are scrutinized for their reliance on n-gram overlap and their limited linguistic insight; they often fail to capture nuances of language such as fluency, adequacy, and semantic equivalence. To mitigate these shortcomings, this research explores advanced evaluation methods that leverage pre-trained multilingual embeddings and deep learning to model linguistic phenomena more comprehensively. This thesis introduces several novel automatic evaluation approaches (unsupervised reference-based, unsupervised reference-free, and supervised) that integrate semantic similarity and contextual relevance by employing pre-trained language models. Through extensive experimentation, our metrics are benchmarked against traditional ones, demonstrating superior correlation with human judgments. Additionally, this work surveys existing evaluation datasets and evaluation approaches, discusses their limitations, and provides decision trees to help researchers choose an evaluation criterion or metric based on the available computational and linguistic resources.
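The weakness of n-gram overlap mentioned above can be illustrated with a minimal sketch (not code from the thesis): a BLEU-style modified unigram precision, where a meaning-preserving paraphrase scores near zero because it shares almost no surface tokens with the reference.

```python
from collections import Counter

def ngram_precision(candidate: str, reference: str, n: int = 1) -> float:
    """Fraction of candidate n-grams also found in the reference,
    with clipped counts (as in BLEU's modified n-gram precision)."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0

reference = "the cat sat on the mat"
paraphrase = "a feline rested upon the rug"
# Semantically equivalent, yet only "the" overlaps, so the score is very low.
print(ngram_precision(paraphrase, reference))
```

Embedding-based metrics of the kind studied in this thesis instead compare candidate and reference in a shared semantic space, so such paraphrases are scored as close rather than disjoint.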
Ultimately, this thesis contributes to the field by offering robust methodologies for MT output evaluation, emphasizing the need for metrics