CNN-TDNN-Based Architecture for Speech Recognition Using Grapheme Models in Bilingual Czech-Slovak Task

Psutka, Josef

dc.contributor.author	Psutka, Josef
dc.contributor.author	Švec, Jan
dc.contributor.author	Pražák, Aleš
dc.date.accessioned	2022-03-28T10:00:27Z
dc.date.available	2022-03-28T10:00:27Z
dc.date.issued	2021
dc.description.abstract-translated	The Czech and Slovak languages are very similar, not only in their written form but also phonetically. This work aims to find a suitable way of combining the two languages to improve recognition results. We demonstrate this contribution on the Malach project. The Malach speech of Holocaust survivors is highly emotional and contains many disfluencies, heavy accents, age-related coarticulation, and non-speech events. Due to the nature of the corpus, it is very difficult to find other appropriate data for acoustic modeling, so such a combination can significantly increase the amount of training data. We discuss the differences between the phoneme-based and grapheme-based ways of combining Czech with Slovak. We also compare different deep neural network architectures (TDNN, TDNNF, CNN-TDNNF) and tune their topology. The proposed bilingual ASR approach provides a slight improvement over monolingual ASR systems, not only at the phoneme level but also at the grapheme level.	en
dc.format	11 pp.	en
dc.format.mimetype	application/pdf
dc.identifier.citation	PSUTKA, J., ŠVEC, J., PRAŽÁK, A. CNN-TDNN-Based Architecture for Speech Recognition Using Grapheme Models in Bilingual Czech-Slovak Task. In Text, Speech, and Dialogue 24th International Conference, TSD 2021, Olomouc, Czech Republic, September 6–9, 2021, Proceedings. Cham: Springer International Publishing, 2021. pp. 523-533. ISBN: 978-3-030-83526-2, ISSN: 0302-9743	en
dc.identifier.doi	10.1007/978-3-030-83527-9_45
dc.identifier.isbn	978-3-030-83526-2
dc.identifier.issn	0302-9743
dc.identifier.obd	43933412
dc.identifier.uri	2-s2.0-85115207848
dc.identifier.uri	http://hdl.handle.net/11025/47248
dc.language.iso	en	en
dc.project.ID	TN01000024/Národní centrum kompetence - Kybernetika a umělá inteligence	cs
dc.publisher	Springer International Publishing	en
dc.relation.ispartofseries	Text, Speech, and Dialogue 24th International Conference, TSD 2021, Olomouc, Czech Republic, September 6–9, 2021, Proceedings	en
dc.rights	The full text is accessible within the university to logged-in users.	en
dc.rights	© Springer	en
dc.rights.access	restrictedAccess	en
dc.subject.translated	Speech recognition	en
dc.subject.translated	Multilingual training	en
dc.subject.translated	Robustness	en
dc.subject.translated	Acoustic modeling	en
dc.title	CNN-TDNN-Based Architecture for Speech Recognition Using Grapheme Models in Bilingual Czech-Slovak Task	en
dc.type	conference paper	en
dc.type	ConferenceObject	en
dc.type.status	Peer-reviewed	en
dc.type.version	publishedVersion	en

Files

Original bundle

Name: Psutka2021_Chapter_CNN-TDNN-BasedArchitectureForS.pdf
Size: 228.83 KB
Format: Adobe Portable Document Format

Collections

OBD
Conference Papers (KKY)