Using Pre-trained Models for Phoneme Representation in Czech Speech Synthesis

Vladař, Lukáš

Files

SGS_2025_011_ID 43948781_Vladař_SVK.pdf (1.19 MB)

Date issued

2025

Authors

Vladař, Lukáš

Publisher

Západočeská univerzita v Plzni

Abstract

Text-to-speech (TTS) systems, i.e., systems producing artificial speech, represent an important topic in the field of artificial intelligence. Modern approaches based on neural networks reach very good results, almost comparable to real human speech. Nguyen et al. (2023) argue that including a large-scale pre-trained model for phoneme representation in a neural TTS system can further improve the final synthetic speech. We used their pre-trained model called XPhoneBERT to investigate whether it can also enhance the quality of speech synthesis in the Czech language.

Subject(s)

phoneme representation, Czech speech, synthesis

Item identifier

http://hdl.handle.net/11025/67456

Collections

Conference papers (NTIS)
