Transformer-Based Automatic Punctuation Prediction and Word Casing Reconstruction of the ASR Output
| dc.contributor.author | Švec, Jan | |
| dc.contributor.author | Lehečka, Jan | |
| dc.contributor.author | Šmídl, Luboš | |
| dc.contributor.author | Ircing, Pavel | |
| dc.date.accessioned | 2022-03-28T10:00:27Z | |
| dc.date.available | 2022-03-28T10:00:27Z | |
| dc.date.issued | 2021 | |
| dc.description.abstract-translated | The paper proposes a module for automatic punctuation prediction and casing reconstruction based on transformer architectures (BERT/T5), which constitute the current state of the art in many similar NLP tasks. The main motivation for our work was to increase the readability of ASR output, which usually takes the form of a continuous stream of text without punctuation marks and with all words in lowercase. The resulting punctuation and casing reconstruction module is evaluated on both written text and actual ASR output in three languages (English, Czech and Slovak). | en |
| dc.format | 9 pp. | en |
| dc.format.mimetype | application/pdf | |
| dc.identifier.citation | ŠVEC, J., LEHEČKA, J., ŠMÍDL, L., IRCING, P. Transformer-Based Automatic Punctuation Prediction and Word Casing Reconstruction of the ASR Output. In Text, Speech, and Dialogue 24th International Conference, TSD 2021, Olomouc, Czech Republic, September 6–9, 2021, Proceedings. Cham: Springer International Publishing, 2021. pp. 86-94. ISBN: 978-3-030-83526-2, ISSN: 0302-9743 | en |
| dc.identifier.doi | 10.1007/978-3-030-83527-9_7 | |
| dc.identifier.isbn | 978-3-030-83526-2 | |
| dc.identifier.issn | 0302-9743 | |
| dc.identifier.obd | 43933408 | |
| dc.identifier.uri | 2-s2.0-85115216462 | |
| dc.identifier.uri | http://hdl.handle.net/11025/47244 | |
| dc.language.iso | en | en |
| dc.project.ID | TN01000024/Národní centrum kompetence - Kybernetika a umělá inteligence | cs |
| dc.project.ID | 90140/Velká výzkumná infrastruktura_(J) - e-INFRA CZ | cs |
| dc.publisher | Springer International Publishing | en |
| dc.relation.ispartofseries | Text, Speech, and Dialogue 24th International Conference, TSD 2021, Olomouc, Czech Republic, September 6–9, 2021, Proceedings | en |
| dc.rights | The full text is available within the university to logged-in users. | en |
| dc.rights | © Springer | en |
| dc.rights.access | restrictedAccess | en |
| dc.subject.translated | ASR | en |
| dc.subject.translated | BERT | en |
| dc.subject.translated | T5 | en |
| dc.subject.translated | Punctuation predictor | en |
| dc.subject.translated | Word casing reconstruction | en |
| dc.title | Transformer-Based Automatic Punctuation Prediction and Word Casing Reconstruction of the ASR Output | en |
| dc.type | conference paper | en |
| dc.type | ConferenceObject | en |
| dc.type.status | Peer-reviewed | en |
| dc.type.version | publishedVersion | en |
Files
Original bundle
- Name: Svec_Transformer-BasedAutomatic_TSD2021.pdf
- Size: 10.05 MB
- Format: Adobe Portable Document Format