T5G2P: Text-to-Text Transfer Transformer Based Grapheme-to-Phoneme Conversion
Date issued
2024
Abstract
The present paper explores the use of several deep neural network architectures for grapheme-to-phoneme (G2P) conversion, aiming to find a universal, language-independent approach to the task. The models are trained on whole sentences so that they automatically capture cross-word context (such as voicing assimilation) where it exists in the given language. Four languages, English, Czech, Russian, and German, were chosen because of their differing characteristics and requirements for the G2P task. Ultimately, the Text-to-Text Transfer Transformer (T5) based model achieved very high conversion accuracy on all tested languages. It also exceeded the accuracy of a similar system when trained on the public LibriSpeech database.
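To illustrate the sentence-level seq2seq formulation described in the abstract, the sketch below fine-tunes a T5 model to map grapheme sentences to phoneme strings using the Hugging Face transformers and datasets libraries. The checkpoint name (t5-small), the toy training pairs, and all hyperparameters are illustrative assumptions, not the paper's actual data or setup.

```python
# Minimal sketch: sentence-level G2P with a T5-style seq2seq model.
# Checkpoint, example data, and hyperparameters are placeholders.
from transformers import (
    T5ForConditionalGeneration,
    T5Tokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)
from datasets import Dataset

# Toy grapheme -> phoneme sentence pairs (hypothetical examples);
# training on whole sentences exposes the model to cross-word context.
pairs = [
    {"graphemes": "the cat sat", "phonemes": "DH AH0 . K AE1 T . S AE1 T"},
    {"graphemes": "read the book", "phonemes": "R IY1 D . DH AH0 . B UH1 K"},
]
dataset = Dataset.from_list(pairs)

model_name = "t5-small"  # placeholder checkpoint
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

def preprocess(batch):
    # Encode the grapheme sentence as the source and the phoneme string as the target.
    inputs = tokenizer(batch["graphemes"], truncation=True, max_length=128)
    targets = tokenizer(text_target=batch["phonemes"], truncation=True, max_length=128)
    inputs["labels"] = targets["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="t5-g2p-sketch",
    per_device_train_batch_size=2,
    num_train_epochs=1,
    predict_with_generate=True,
)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    tokenizer=tokenizer,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),  # pads labels per batch
)
trainer.train()

# Inference: convert an unseen sentence to a phonemic transcription.
ids = tokenizer("the cat read", return_tensors="pt").input_ids
print(tokenizer.decode(model.generate(ids, max_length=128)[0], skip_special_tokens=True))
```

The same recipe applies to any of the studied languages: only the phoneme inventory of the training pairs changes, which is what makes the sentence-level T5 formulation language-independent.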
Subject(s)
CNN, Czech, English, G2P, German, phonetic transcription, RNN, Russian, T5