Czech Speech Synthesis with Generative Neural Vocoder
| dc.contributor.author | Vít, Jakub | |
| dc.contributor.author | Hanzlíček, Zdeněk | |
| dc.contributor.author | Matoušek, Jindřich | |
| dc.date.accessioned | 2020-03-23T11:00:23Z | |
| dc.date.available | 2020-03-23T11:00:23Z | |
| dc.date.issued | 2019 | |
| dc.description.abstract | In recent years, new neural architectures for generating high-quality synthetic speech on a per-sample basis have been introduced. We describe our application of statistical parametric speech synthesis based on LSTM neural networks combined with a generative neural vocoder for the Czech language. We used a traditional LSTM architecture for generating the vocoder parametrization from linguistic features, and we replaced the standard vocoder with a WaveRNN neural network. We conducted a MUSHRA listening test to compare the proposed approach with unit selection and with LSTM-based parametric speech synthesis utilizing a standard vocoder. In contrast with our previous work, we managed to outperform a well-tuned unit selection TTS system by a large margin on both professional and amateur voices. | en |
| dc.format | 9 p. | en |
| dc.format.mimetype | application/pdf | |
| dc.identifier.citation | VÍT, J., HANZLÍČEK, Z., MATOUŠEK, J. Czech Speech Synthesis with Generative Neural Vocoder. In: Text, Speech, and Dialogue 22nd International Conference, TSD 2019, Ljubljana, Slovenia, September 11-13, 2019, Proceedings. Cham: Springer, 2019. pp. 307-315. ISBN 978-3-030-27946-2, ISSN 0302-9743. | en |
| dc.identifier.doi | 10.1007/978-3-030-27947-9_26 | |
| dc.identifier.isbn | 978-3-030-27946-2 | |
| dc.identifier.issn | 0302-9743 | |
| dc.identifier.obd | 43926904 | |
| dc.identifier.uri | 2-s2.0-85072849542 | |
| dc.identifier.uri | http://hdl.handle.net/11025/36715 | |
| dc.language.iso | en | en |
| dc.project.ID | SGS-2019-027/Intelligent methods of machine perception and understanding 4 | en |
| dc.project.ID | GA19-19324S/Fully trainable Czech speech synthesis from text using deep neural networks | en |
| dc.publisher | Springer | en |
| dc.relation.ispartofseries | Text, Speech, and Dialogue 22nd International Conference, TSD 2019, Ljubljana, Slovenia, September 11-13, 2019, Proceedings | en |
| dc.rights | Full text is not accessible. | en |
| dc.rights | © Springer | en |
| dc.rights.access | closedAccess | en |
| dc.subject.translated | Speech synthesis, LSTM-based speech synthesis, WaveRNN, Neural vocoder, Unit selection | en |
| dc.title | Czech Speech Synthesis with Generative Neural Vocoder | en |
| dc.type | conference paper | en |
| dc.type | conferenceObject | en |
| dc.type.status | Peer-reviewed | en |
| dc.type.version | publishedVersion | en |