dc.contributor.author | Uluskan, Seçkin
dc.contributor.author | Sangwan, Abhijeet
dc.contributor.author | Hansen, John H. L.
dc.date.accessioned | 2019-10-18T18:43:46Z
dc.date.available | 2019-10-18T18:43:46Z
dc.date.issued | 2017
dc.identifier.issn | 1381-2416
dc.identifier.issn | 1572-8110
dc.identifier.uri | https://dx.doi.org/10.1007/s10772-017-9449-6
dc.identifier.uri | https://hdl.handle.net/11421/10415
dc.description | WOS: 000414187000006 | en_US
dc.description.abstract | Distant speech capture in lecture halls and auditoriums offers unique challenges in algorithm development for automatic speech recognition. In this study, a new adaptation strategy for distant noisy speech is created by the means of phoneme classes. Unlike previous approaches which adapt the acoustic model to the features, the proposed phoneme-class based feature adaptation (PCBFA) strategy adapts the distant data features to the present acoustic model which was previously trained on close microphone speech. The essence of PCBFA is to create a transformation strategy which makes the distributions of phoneme-classes of distant noisy speech similar to those of a close talk microphone acoustic model in a multidimensional MFCC space. To achieve this task, phoneme-classes of distant noisy speech are recognized via artificial neural networks. PCBFA is the adaptation of features rather than adaptation of acoustic models. The main idea behind PCBFA is illustrated via conventional Gaussian mixture model-Hidden Markov model (GMM-HMM) although it can be extended to new structures in automatic speech recognition (ASR). The new adapted features together with the new and improved acoustic models produced by PCBFA are shown to outperform those created only by acoustic model adaptations for ASR and keyword spotting. PCBFA offers a new powerful understanding in acoustic-modeling of distant speech. | en_US
dc.description.sponsorship | AFRL [FA8750-12-1-0188]; University of Texas at Dallas from the Distinguished University Chair in Telecommunications Engineering | en_US
dc.description.sponsorship | This project was funded by AFRL under contract FA8750-12-1-0188, and partially by the University of Texas at Dallas from the Distinguished University Chair in Telecommunications Engineering held by J.H.L. Hansen. | en_US
dc.language.iso | eng | en_US
dc.publisher | Springer | en_US
dc.relation.isversionof | 10.1007/s10772-017-9449-6 | en_US
dc.rights | info:eu-repo/semantics/closedAccess | en_US
dc.subject | Phoneme Class | en_US
dc.subject | Distant Noisy Speech | en_US
dc.subject | Mismatch Acoustic Modeling | en_US
dc.subject | Feature Adaptation | en_US
dc.title | Phoneme class based feature adaptation for mismatch acoustic modeling and recognition of distant noisy speech | en_US
dc.type | article | en_US
dc.relation.journal | International Journal of Speech Technology | en_US
dc.contributor.department | Anadolu Üniversitesi | en_US
dc.identifier.volume | 20 | en_US
dc.identifier.issue | 4 | en_US
dc.identifier.startpage | 799 | en_US
dc.identifier.endpage | 811 | en_US
dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US

