Unsupervised methods for language modeling: technical report no. DCSE/TR-2012-03

dc.contributor.author: Brychcín, Tomáš
dc.date.accessioned: 2016-06-21T06:45:43Z
dc.date.available: 2016-06-21T06:45:43Z
dc.date.issued: 2012
dc.description.abstract-translated: Language models are crucial for many tasks in NLP, and n-grams are the best way to build them. A huge effort is being invested in improving n-gram language models. By introducing external information (morphology, syntax, partitioning into documents, etc.) into the models, a significant improvement can be achieved. The models can, however, also be improved with no external information, and smoothing is an excellent example of such an improvement. This thesis summarizes the state-of-the-art approaches to unsupervised language modeling, with emphasis on inflectional languages, which are particularly hard to model. It focuses on methods that can discover hidden patterns already present in the training corpora. These patterns can be very useful for enhancing the performance of language modeling; moreover, they do not require additional information sources. [en]
dc.format: 10 s. [cs]
dc.format.mimetype: application/pdf
dc.identifier.uri: http://www.kiv.zcu.cz/publications/
dc.identifier.uri: http://hdl.handle.net/11025/21549
dc.language.iso: en [en]
dc.publisher: University of West Bohemia in Pilsen [en]
dc.rights: © University of West Bohemia in Pilsen [en]
dc.rights.access: openAccess [en]
dc.subject: jazykový model [cs]
dc.subject: n-gram [cs]
dc.subject.translated: language model [en]
dc.subject.translated: n-gram [en]
dc.title: Unsupervised methods for language modeling: technical report no. DCSE/TR-2012-03 [en]
dc.type: zpráva [cs]
dc.type: report [en]
dc.type.version: publishedVersion [en]

Files

Original bundle
Name:
Brychcin.pdf
Size:
425.44 KB
Format:
Adobe Portable Document Format
Description:
Full text