Using LSTM neural networks for cross-lingual phonetic speech segmentation with an iterative correction procedure

Files

Computational Intelligence - 2024 - Hanzlíček - Using LSTM neural networks for cross‐lingual phonetic speech segmentation.pdf (3.47 MB)

Date issued

2024

Authors

Hanzlíček, Zdeněk

Matoušek, Jindřich

Vít, Jakub

Abstract

This article describes experiments on speech segmentation using long short-term memory (LSTM) recurrent neural networks. The main part of the paper deals with multi-lingual and cross-lingual segmentation; in the cross-lingual case, segmentation is performed on a language different from the one on which the model was trained. The experimental data comprise large Czech, English, German, and Russian speech corpora intended for speech synthesis. For optimal multi-lingual modeling, a compact phonetic alphabet was proposed by sharing and clustering phones of the particular languages. Many experiments were performed, exploring various experimental conditions and data combinations. We proposed a simple procedure that iteratively adapts an inaccurate default model to a new voice or language. Segmentation accuracy was evaluated by comparison with a reference segmentation created by a well-tuned hidden Markov model-based framework with additional manual corrections. The resulting segmentation was also employed in a unit selection text-to-speech system, and the quality of the generated speech was compared with that obtained using the reference segmentation in a preference listening test.

Subject(s)

LSTM neural networks, multi-lingual and cross-lingual modeling, speech segmentation

Item identifier

http://hdl.handle.net/11025/61293
https://doi.org/10.1111/coin.12602

Collections

Articles (KKY)
