Using lemmatization technique for automatic diacritics restoration
| dc.contributor.author | Kanis, Jakub | |
| dc.contributor.author | Müller, Luděk | |
| dc.date.accessioned | 2016-01-06T12:43:03Z | |
| dc.date.available | 2016-01-06T12:43:03Z | |
| dc.date.issued | 2005 | |
| dc.description.abstract | Tento článek se zabývá automatickou konstrukcí lematizátoru z Plný tvar - Lema trénovacího slovníku a lematizací nových, v trénovacím slovníku neviděných, tj. OOV slov. Jsou představeny tři metody pro lematizaci tří různých typů OOV slov (chybějící plné tvary, složená a neznámá slova). Nakonec je popsána aplikace metody pro automatickou konstrukci lematizátoru na problém obnovení diakritiky. | cs |
| dc.description.abstract-translated | This paper addresses the automatic construction of a lemmatizer from a Full Form - Lemma (FFL) training dictionary and the lemmatization of new words not seen in the FFL dictionary, i.e. out-of-vocabulary (OOV) words. Three methods are introduced for lemmatizing three kinds of OOV words (missing full forms, unknown words, and compound words). Finally, the application of automatic lemmatizer construction to the problem of automatic diacritics restoration is described. | en |
| dc.format | 4 s. | cs |
| dc.format.mimetype | application/pdf | |
| dc.identifier.citation | KANIS, Jakub; MÜLLER, Luděk. Using lemmatization technique for automatic diacritics restoration. In: SPECOM 2005 Proceedings. St. Petersburg: Institute for Informatics and Automation of RAS (SPIIRAS), 2005, p. 255-258. ISBN 5-7452-0110-X. | en |
| dc.identifier.isbn | 5-7452-0110-X | |
| dc.identifier.uri | http://www.kky.zcu.cz/cs/publications/KanisJ_2005_Usinglemmatization | |
| dc.identifier.uri | http://hdl.handle.net/11025/17128 | |
| dc.language.iso | en | en |
| dc.publisher | Moscow State Linguistic University | en |
| dc.rights | © Jakub Kanis - Luděk Müller | cs |
| dc.rights.access | openAccess | en |
| dc.subject | lemmatizace | cs |
| dc.subject | OOV slova | cs |
| dc.subject | obnovení diakritiky | cs |
| dc.subject.translated | lemmatization | en |
| dc.subject.translated | OOV words | en |
| dc.subject.translated | diacritics restoration | en |
| dc.title | Using lemmatization technique for automatic diacritics restoration | en |
| dc.title.alternative | Využití techniky lematizace pro obnovení diakritiky | cs |
| dc.type | článek | cs |
| dc.type | article | en |
| dc.type.status | Peer-reviewed | en |
| dc.type.version | publishedVersion | en |
Files
Original bundle
- Name: KanisJ_2005_Usinglemmatization.pdf
- Size: 69.15 KB
- Format: Adobe Portable Document Format
- Description: Full text