Dialect Classification and Multi Dialect Speech Recognition



Abstract

Keywords: dialect classification; zero-time windowing; single frequency filtering; frequency domain linear prediction; convolution neural network; ECAPA-TDNN; deepspeech; multi-dialect automatic speech recognition; Indian English ASR

The major goal of this thesis is to study dialectal variations and to improve speech recognition performance using embeddings derived from an improved dialect classification system. Initial studies focused on improving dialect classification for three major dialects of English: Australian (AU), British (UK), and American (US).

To improve the dialect classification system, and based on an analysis of dialectal variations, advanced signal processing approaches were investigated for dialect classification with a traditional i-vector system. Features that provide high spectral resolution help capture subtle differences between dialects, so this thesis proposes features based on single frequency filtering (SFF) and zero-time windowing (ZTW), which provide high spectral resolution without compromising temporal resolution. Along with frame-level spectral resolution, longer temporal context contributes to dialect classification. Approaches that enhance the temporal context of the proposed SFF and ZTW features, namely delta and double-delta coefficients (Δ+ΔΔ) and shifted delta coefficients (SDCs), were therefore tested. The dialect classification system gave promising performance with the proposed features when temporal context was provided by Δ+ΔΔ and SDCs.

Further, signal processing approaches that provide long temporal summarization, such as frequency domain linear prediction (FDLP), are proposed for dialect classification. Experiments with FDLP-based features show that the long temporal summarization they provide is advantageous for discriminating dialects. So, both the signal processing approaches that provide high spectral resolution (SFF and ZTW) and long temporal sum
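
The two temporal-context schemes named in the abstract, Δ+ΔΔ and SDCs, can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function names, the regression window `width`, the edge padding, and the SDC parameters `d`, `p`, `k` are assumptions chosen for illustration, and the input is any (frames, dims) matrix of frame-level features such as SFF- or ZTW-based cepstra.

```python
import numpy as np


def delta(feats: np.ndarray, width: int = 2) -> np.ndarray:
    """Regression-based delta of a (frames, dims) feature matrix.

    `width` (assumed value) is the number of frames used on each side.
    Edges are handled by repeating the first and last frame.
    """
    num_frames = feats.shape[0]
    padded = np.pad(feats, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(feats, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + num_frames]
        behind = padded[width - n : width - n + num_frames]
        out += n * (ahead - behind)
    return out / denom


def shifted_delta(feats: np.ndarray, d: int = 1, p: int = 3, k: int = 7) -> np.ndarray:
    """Shifted delta coefficients for a (frames, dims) feature matrix.

    For frame t, block i holds c[t + i*p + d] - c[t + i*p - d]; the k blocks
    are concatenated, giving dims * k values per frame. The values of d, p
    and k are illustrative assumptions, not settings taken from the thesis.
    """
    num_frames = feats.shape[0]
    # Original frame j sits at padded index j + d.
    padded = np.pad(feats, ((d, (k - 1) * p + d), (0, 0)), mode="edge")
    blocks = []
    for i in range(k):
        start = i * p
        ahead = padded[start + 2 * d : start + 2 * d + num_frames]
        behind = padded[start : start + num_frames]
        blocks.append(ahead - behind)
    return np.hstack(blocks)


def add_delta_context(feats: np.ndarray) -> np.ndarray:
    """Stack static, delta and double-delta features frame by frame."""
    d1 = delta(feats)
    d2 = delta(d1)
    return np.hstack([feats, d1, d2])
```

With 7 static coefficients per frame, `add_delta_context` yields 21 values per frame and `shifted_delta` with the defaults yields 49. That default layout (7 coefficients, d=1, p=3, k=7) is a configuration commonly seen in language recognition work; the abstract does not state which settings the thesis used.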
