Sentence Level Continuous Sign Language Recognition and Translation using Transformers

Abstract

Sign Language (SL) is a visual language used by millions of deaf, hard-of-hearing (HoH), and speech-impaired individuals. According to the World Federation of the Deaf (WFD), around 70 million people use SL, and the WFD estimates that about 300 different SLs exist worldwide [1]. HoH individuals often face communication barriers that hinder social integration, leading to loneliness, isolation, and frustration. These challenges can result in emotional and mental health problems, low confidence, academic struggles, and increased unemployment, reflecting the broader difficulties they endure. Due to audism, HoH individuals are often overlooked and compelled to use alternative means of communication. As a consequence, to communicate effectively they must write down or type messages, or use special gloves [2], which can cause discomfort. SL is an effective way to address the problems HoH people face and enables communication between two signers, removing the impediment posed by verbal language. However, it fails to solve the issue when
(i) everyone uses a different SL to communicate, or
(ii) someone cannot communicate through signs,
creating a broad barrier among signers and between signers and non-signers.

Transformer architectures have accelerated research in Continuous Sign Language Recognition and Translation (CSLRT), which involves predicting sign gloss patterns from videos and converting them into spoken language. This process is challenging because there is no direct alignment between sign glosses and spoken words. While Transformers are effective due to their ability to process inputs in parallel, their high memory consumption makes them less suitable for edge devices. Sign language translation models must be usable by signers on edge devices such as mobile phones, providing a computationally efficient, cost-effective, and enhanced CSLRT system.

Researchers have sufficiently explored isolated sign recognition for alphabets, digits a

