Sentence Level Continuous Sign Language Recognition and Translation using Transformers
Abstract
Sign Language (SL) is a visual language used by millions of deaf, hard-of-hearing
(HoH), and speech-impaired individuals. According to the World Federation of the
Deaf (WFD), around 70 million people use SL worldwide, across an estimated 300
different SLs [1]. HoH individuals often face communication barriers that hinder
social integration, leading to loneliness, isolation, and frustration. These
challenges can result in emotional and mental health problems, low confidence, academic
struggles, and increased unemployment. Due to audism, HoH individuals are often
overlooked and compelled to use alternative means of communication: to communicate
effectively, they must write down or type messages or wear special gloves [2],
which can cause discomfort. SL is an effective way to address the problems HoH people
face, since it enables communication between two signers and removes the impediment
posed by verbal language. However, it fails to solve the issue when
(i) the parties use different SLs to communicate, or
(ii) someone cannot communicate through signs,
which creates a broad barrier among signers and between signers and non-signers.
Transformer architectures have accelerated research in Continuous Sign Language
Recognition and Translation (CSLRT), which involves predicting sign gloss
patterns from videos and converting them into spoken language. This process is challenging
because there is no direct alignment between sign glosses and spoken words.
While Transformers are effective due to their ability to process inputs in parallel,
their high memory consumption makes them less suitable for edge devices. Sign
language translation models must nevertheless be usable by signers on edge devices such
as mobile phones, which calls for a computationally efficient, cost-effective, and
enhanced CSLRT system.
Researchers have sufficiently explored isolated sign recognition for alphabets, digits
a