A Comparative Analysis of Bilingual and Trilingual Wav2Vec Models for Automatic Speech Recognition in Multilingual Oral History Archives

dc.contributor.author: Lehečka, Jan
dc.contributor.author: Psutka, Josef
dc.contributor.author: Šmídl, Luboš
dc.contributor.author: Ircing, Pavel
dc.contributor.author: Psutka, Josef
dc.date.accessioned: 2025-06-20T08:35:46Z
dc.date.available: 2025-06-20T08:35:46Z
dc.date.issued: 2024
dc.date.updated: 2025-06-20T08:35:46Z
dc.description.abstract: In this paper, we are comparing monolingual Wav2Vec 2.0 models with various multilingual models to see whether we could improve speech recognition performance on a unique oral history archive containing a lot of mixed-language sentences. Our main goal is to push forward research on this unique dataset, which is an extremely valuable part of our cultural heritage. Our results suggest that monolingual speech recognition models are, in most cases, superior to multilingual models, even when processing the oral history archive full of mixed-language sentences from non-native speakers. We also performed the same experiments on the public CommonVoice dataset to verify our results. We are contributing to the research community by releasing our pre-trained models to the public. [en]
dc.format: 5
dc.identifier.document-number: 001331850101086
dc.identifier.doi: 10.21437/Interspeech.2024-472
dc.identifier.isbn: not specified
dc.identifier.issn: 2308-457X
dc.identifier.obd: 43944107
dc.identifier.orcid: Lehečka, Jan 0000-0002-3889-8069
dc.identifier.orcid: Psutka, Josef 0000-0003-4761-1645
dc.identifier.orcid: Šmídl, Luboš 0000-0002-8169-2410
dc.identifier.orcid: Ircing, Pavel 0000-0001-6967-1687
dc.identifier.orcid: Psutka, Josef 0000-0002-0764-3207
dc.identifier.uri: http://hdl.handle.net/11025/60312
dc.language.iso: en
dc.project.ID: 90254
dc.project.ID: VJ01010108
dc.publisher: International Speech Communication Association (ISCA)
dc.relation.ispartofseries: 25th Interspeech Conference 2024
dc.subject: speech recognition [en]
dc.subject: bilingual models [en]
dc.subject: trilingual models [en]
dc.subject: oral history archives [en]
dc.title: A Comparative Analysis of Bilingual and Trilingual Wav2Vec Models for Automatic Speech Recognition in Multilingual Oral History Archives [en]
dc.type: Article in proceedings (D)
dc.type: ARTICLE IN PROCEEDINGS
dc.type.status: Published Version
local.files.count: 1 [*]
local.files.size: 238126 [*]
local.has.files: yes [*]
local.identifier.eid: 2-s2.0-85208505026

Files

Original bundle
Name: lehecka24_interspeech.pdf
Size: 232.54 KB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission