Audio-visual speech asynchrony modeling in a talking head

Karpov, Alexey

dc.contributor.author	Karpov, Alexey
dc.contributor.author	Tsirulnik, Liliya
dc.contributor.author	Krňoul, Zdeněk
dc.contributor.author	Ronzhin, Andrey
dc.contributor.author	Lobanov, Boris
dc.contributor.author	Železný, Miloš
dc.date.accessioned	2016-01-11T05:58:40Z
dc.date.available	2016-01-11T05:58:40Z
dc.date.issued	2009
dc.description.abstract	V tomto článku je navržen systém audiovizuální syntézy řeči obsahující modelování asynchronie mezi zvukovou a vizuální modalitou řeči. Studie reálných nahrávek obsažených v řečových databázích nám poskytují požadované údaje k pochopení problému modalit asynchronie, která je částečně způsobena koartikulací. Byl vypracován soubor kontextově závislých pravidel časování a doporučení zajišťující synchronizaci zvukové a vizuální řeči tak, že animace mluvící hlavy je více přirozená. Kognitivní ohodnocení systému mluvící hlavy, který je nastaven pro Ruštinu a implementující původní model asynchronie, ukazuje vysokou srozumitelnost a přirozenost syntetizované audiovizuální řeči.	cs
dc.description.abstract-translated	An audio-visual speech synthesis system with modeling of asynchrony between auditory and visual speech modalities is proposed in the paper. Corpus-based study of real recordings gave us the required data for understanding the problem of modalities asynchrony that is partially caused by the coarticulation phenomena. A set of context-dependent timing rules and recommendations was elaborated in order to make a synchronization of auditory and visual speech cues of the animated talking head similar to a natural humanlike way. The cognitive evaluation of the model-based talking head for Russian with implementation of the original asynchrony model has shown high intelligibility and naturalness of audio-visual synthesized speech.	en
dc.format	4 s.	cs
dc.format.mimetype	application/pdf
dc.identifier.citation	KARPOV, Alexey; TSIRULNIK, Liliya; KRŇOUL, Zdeněk; RONZHIN, Andrey; LOBANOV, Boris; ŽELEZNÝ, Miloš. Audio-visual speech asynchrony modeling in a talking head. In: Proceedings of INTERSPEECH 2009: 10th Annual Conference of the International Speech Communication Association 2009, 6-10 September 2009, Brighton, UK. [Baixas]: ISCA, 2009, p. 2911-2914. ISSN 1990-9772.	en
dc.identifier.issn	1990-9772
dc.identifier.uri	http://www.kky.zcu.cz/cs/publications/AlexeyKarpov_2009_Audio-VisualSpeech
dc.identifier.uri	http://hdl.handle.net/11025/17205
dc.language.iso	en	en
dc.publisher	ISCA	en
dc.rights	© ISCA	cs
dc.rights.access	openAccess	en
dc.subject	automatické rozpoznávání řeči	cs
dc.subject	syntéza řeči	cs
dc.subject	multimodální vjem řeči	cs
dc.subject	kognitivní studie	cs
dc.subject.translated	audio-visual speech processing	en
dc.subject.translated	speech synthesis	en
dc.subject.translated	multimodal speech perception	en
dc.subject.translated	cognitive study	en
dc.title	Audio-visual speech asynchrony modeling in a talking head	en
dc.title.alternative	Modelování asynchronie v systému mluvící hlavy	cs
dc.type	článek	cs
dc.type	article	en
dc.type.status	Peer-reviewed	en
dc.type.version	publishedVersion	en

Files

Original bundle

Name: AlexeyKarpov_2009_Audio-VisualSpeech.pdf
Size: 254.9 KB
Format: Adobe Portable Document Format
Description: Full text

Collections

Articles (NTIS)