Design and recording of Czech audio-visual database with impaired conditions for continuous speech recognition

dc.contributor.author: Trojanová, Jana
dc.contributor.author: Hrúz, Marek
dc.contributor.author: Campr, Pavel
dc.contributor.author: Železný, Miloš
dc.date.accessioned: 2015-12-11T08:40:31Z
dc.date.available: 2015-12-11T08:40:31Z
dc.date.issued: 2008
dc.description.abstract-translated: In this paper we discuss the design, acquisition and preprocessing of a Czech audio-visual speech corpus. The corpus is intended for training and testing an existing audio-visual speech recognition system. The name of the database is UWB-07-ICAVR, where ICAVR stands for Impaired Condition Audio Visual speech Recognition. The corpus consists of 10,000 utterances of continuous speech obtained from 50 speakers, half of them men and half women. The total length of the database is 25 hours. Each utterance is stored as a separate sentence. The corpus extends existing databases by covering conditions of variable illumination; six types of illumination were covered. Recording was done with two cameras and two microphones. The database introduced in this paper can be used for testing visual parameterization in audio-visual speech recognition (AVSR). The corpus can easily be split into training and testing parts. Each speaker pronounced 200 sentences: the first 50 were the same for all speakers, and the rest were different. The session for one speaker fits on one DVD. All files are accompanied by visual labels. The labels specify the region of interest (the mouth and the area around it, given by a bounding box). The actual pronunciation of each sentence is transcribed in a text file. (en)
dc.format: 5 s. (cs)
dc.format.mimetype: application/pdf
dc.identifier.citation: TROJANOVÁ, Jana; HRÚZ, Marek; CAMPR, Pavel; ŽELEZNÝ, Miloš. Design and recording of Czech audio-visual database with impaired conditions for continuous speech recognition. In: Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08): 28-29-30 May 2008. Marrakech: ELRA, 2008, p. [1-5]. ISBN 2-9517408-4-0. (en)
dc.identifier.isbn: 2-9517408-4-0
dc.identifier.uri: http://www.kky.zcu.cz/cs/publications/TrojanovaJ_2008_DesignandRecording
dc.identifier.uri: http://hdl.handle.net/11025/16964
dc.language.iso: en (en)
dc.publisher: ELRA (en)
dc.rights: © Jana Trojanová - Marek Hrúz - Pavel Campr - Miloš Železný (cs)
dc.rights.access: openAccess (en)
dc.subject: rozpoznávání řeči (cs)
dc.subject: česká audiovizuální databáze (cs)
dc.subject.translated: speech recognition (en)
dc.subject.translated: Czech audio-visual database (en)
dc.title: Design and recording of Czech audio-visual database with impaired conditions for continuous speech recognition (en)
dc.type: článek (cs)
dc.type: article (en)
dc.type.status: Peer-reviewed (en)
dc.type.version: publishedVersion (en)

Files

Original bundle
Name: TrojanovaJ_2008_DesignandRecording.pdf
Size: 1.24 MB
Format: Adobe Portable Document Format
Description: Full text
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission