Neural Speech Synthesis with Enriched Phrase Boundaries
| dc.contributor.author | Kunešová, Marie | |
| dc.contributor.author | Matoušek, Jindřich | |
| dc.date.accessioned | 2025-06-20T08:55:02Z | |
| dc.date.available | 2025-06-20T08:55:02Z | |
| dc.date.issued | 2023 | |
| dc.date.updated | 2025-06-20T08:55:02Z | |
| dc.description.abstract | Prosodic phrasing is one of the factors influencing the naturalness of synthesized speech. In this paper, we enrich the phonetic representation for neural speech synthesis with additional markers denoting the strength of phrase breaks between words. These markers are assigned to the training data automatically, using our previously introduced model for audio-based phrase boundary detection. We tested the approach with two different levels of resolution for the break indices: either ten distinct levels (P10) or only “ToBI-like” four levels (P4). Listening tests with two different speaker voices show a statistically significant preference among listeners for P10 or P4 over the baseline speech synthesis without these markers (P0), although which version is judged as better depends on the voice. | en |
| dc.format | 5 | |
| dc.identifier.doi | 10.21437/Interspeech.2023-1552 | |
| dc.identifier.isbn | not specified | |
| dc.identifier.issn | 2308-457X | |
| dc.identifier.obd | 43940124 | |
| dc.identifier.orcid | Kunešová, Marie 0000-0002-7187-8481 | |
| dc.identifier.orcid | Matoušek, Jindřich 0000-0002-7408-7730 | |
| dc.identifier.uri | http://hdl.handle.net/11025/61540 | |
| dc.language.iso | en | |
| dc.project.ID | GA21-14758S | |
| dc.project.ID | 90140 | |
| dc.project.ID | 90104 | |
| dc.publisher | International Speech Communication Association | |
| dc.relation.ispartofseries | INTERSPEECH 2023 | |
| dc.subject | speech synthesis | en |
| dc.subject | phrasing | en |
| dc.subject | phrase breaks | en |
| dc.subject | wav2vec | en |
| dc.title | Neural Speech Synthesis with Enriched Phrase Boundaries | en |
| dc.type | Paper in proceedings (D) | |
| dc.type | PAPER IN PROCEEDINGS | |
| dc.type.status | Published Version | |
| local.files.count | 1 | * |
| local.files.size | 308749 | * |
| local.has.files | yes | * |
| local.identifier.eid | 2-s2.0-85171567961 | |
Files
Original bundle
- Name:
- kunesova23_interspeech.pdf
- Size:
- 301.51 KB
- Format:
- Adobe Portable Document Format
License bundle
- Name:
- license.txt
- Size:
- 1.71 KB
- Format:
- Item-specific license agreed upon to submission