System for fast lexical and phonetic spoken term detection in a Czech cultural heritage archive
| dc.contributor.author | Psutka, Josef | |
| dc.contributor.author | Švec, Jan | |
| dc.contributor.author | Psutka, Josef V. | |
| dc.contributor.author | Vaněk, Jan | |
| dc.contributor.author | Pražák, Aleš | |
| dc.contributor.author | Šmídl, Luboš | |
| dc.contributor.author | Ircing, Pavel | |
| dc.date.accessioned | 2016-01-06T13:40:32Z | |
| dc.date.available | 2016-01-06T13:40:32Z | |
| dc.date.issued | 2011 | |
| dc.description.abstract-translated | The main objective of the work presented in this paper was to develop a complete system that would fulfill the original vision of the MALACH project: to employ automatic speech recognition and information retrieval techniques to provide improved access to the large video archive containing recorded testimonies of Holocaust survivors. So far, the system has been developed for the Czech part of the archive only. It takes advantage of a state-of-the-art speech recognition system tailored to the challenging properties of the recordings in the archive (elderly speakers, spontaneous speech, emotionally loaded content) and of its close coupling with the actual search engine. The design of the algorithm, which adopts the spoken term detection approach, focuses on retrieval speed. The resulting system is able to search through the 1,000 hours of video constituting the Czech portion of the archive and find occurrences of a query word in a matter of seconds. The phonetic search, implemented alongside the search based on lexicon words, makes it possible to find even words outside the ASR system lexicon, such as names, geographic locations, or Jewish slang. | en |
| dc.format | 14 p. | en |
| dc.format.mimetype | application/pdf | |
| dc.identifier.citation | PSUTKA, Josef; ŠVEC, Jan; PSUTKA, Josef V.; VANĚK, Jan; PRAŽÁK, Aleš; ŠMÍDL, Luboš; IRCING, Pavel. System for fast lexical and phonetic spoken term detection in a Czech cultural heritage archive. In: EURASIP Journal on Audio, Speech, and Music Processing, 2011, 10, p. 1-11. ISSN 1687-4714. | en |
| dc.identifier.issn | 1687-4714 | |
| dc.identifier.uri | http://www.kky.zcu.cz/cs/publications/JosefPsutka_SystemforFast | |
| dc.identifier.uri | http://hdl.handle.net/11025/17134 | |
| dc.language.iso | en | en |
| dc.publisher | Springer | en |
| dc.rights | © Josef Psutka - Jan Švec - Josef V. Psutka - Jan Vaněk - Aleš Pražák - Luboš Šmídl - Pavel Ircing | cs |
| dc.rights.access | openAccess | en |
| dc.subject | Malach | cs |
| dc.subject | automatické rozpoznávání řeči | cs |
| dc.subject | video | cs |
| dc.subject.translated | Malach | en |
| dc.subject.translated | automatic speech recognition | en |
| dc.subject.translated | video | en |
| dc.title | System for fast lexical and phonetic spoken term detection in a Czech cultural heritage archive | en |
| dc.title.alternative | Systém pro rychlé lexikální a fonetické vyhledávání mluvených frází v archívu českého kulturního dědictví | cs |
| dc.type | článek | cs |
| dc.type | article | en |
| dc.type.status | Peer-reviewed | en |
| dc.type.version | publishedVersion | en |
Files
Original bundle
- Name: JosefPsutka_SystemforFast.pdf
- Size: 271.7 KB
- Format: Adobe Portable Document Format
- Description: Full text
License bundle
- Name: license.txt
- Size: 1.71 KB
- Format: Item-specific license agreed upon to submission