T5G2P: Multilingual Grapheme-to-Phoneme Conversion with Text-to-Text Transfer Transformer

Abstract

In recent years, the Text-to-Text Transfer Transformer (T5) neural network has proved powerful for many text-related tasks, including grapheme-to-phoneme (G2P) conversion. This paper describes the training of T5-base models for several languages. It shows the advantages of training G2P models from these language-specific base models over fine-tuning them from the multilingual base model. The paper also explains the reasons for training G2P models on whole sentences rather than on a dictionary, and evaluates the trained G2P models on unseen sentences and words.

Description

Subject(s)

T5, transformers, phonetic transcription, grapheme-to-phoneme, TTS system

Citation