COMICORDA: Dialogue Act Recognition in Comic Books

Martínek, Jiří

dc.contributor.author	Martínek, Jiří
dc.contributor.author	Baloun, Josef
dc.contributor.author	Prantl, Martin
dc.contributor.author	Lenc, Ladislav
dc.contributor.author	Král, Pavel
dc.date.accessioned	2025-06-20T08:38:52Z
dc.date.available	2025-06-20T08:38:52Z
dc.date.issued	2024
dc.date.updated	2025-06-20T08:38:52Z
dc.description.abstract	Dialogue act (DA) recognition is usually performed on a speech signal that has been transcribed and segmented into text. However, little work exists on DA recognition from images. This paper therefore concentrates on this modality and presents a novel DA recognition approach for image documents, namely comic books. To the best of our knowledge, this is the first study investigating dialogue acts in comic books, and it represents a first step toward building a model for comic book understanding. The proposed method consists of three steps: speech balloon segmentation, optical character recognition (OCR), and DA recognition itself. We use YOLOv8 for balloon segmentation, Google Vision for OCR, and Transformer-based models for DA classification. The experiments are performed on a newly created dataset comprising 1,438 annotated comic panels with bounding boxes, transcriptions, and dialogue act annotations. We achieve nearly 98% average precision for speech balloon segmentation and exceed 70% accuracy on the DA recognition task. We also present an analysis of dialogue structure in the comics domain and compare it with standard DA datasets, which is a further contribution of this paper.	en
dc.format	13
dc.identifier.isbn	978-2-493-81410-4
dc.identifier.issn	2951-2093
dc.identifier.obd	43943632
dc.identifier.orcid	Martínek, Jiří 0000-0003-2981-1723
dc.identifier.orcid	Baloun, Josef 0000-0003-1923-5355
dc.identifier.orcid	Prantl, Martin 0000-0002-7900-5028
dc.identifier.orcid	Lenc, Ladislav 0000-0002-1066-7269
dc.identifier.orcid	Král, Pavel 0000-0002-3096-675X
dc.identifier.uri	http://hdl.handle.net/11025/60609
dc.language.iso	en
dc.project.ID	SGS-2022-016
dc.publisher	ELRA and ICCL
dc.relation.ispartofseries	Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024
dc.subject	comics processing	en
dc.subject	dialogue act recognition	en
dc.subject	OCR	en
dc.subject	speech balloon segmentation	en
dc.subject	YOLOv8	en
dc.title	COMICORDA: Dialogue Act Recognition in Comic Books	en
dc.type	Paper in proceedings (D)
dc.type	PAPER IN PROCEEDINGS
dc.type.status	Published Version
local.files.count	1	*
local.files.size	694869	*
local.has.files	yes	*
local.identifier.eid	2-s2.0-85195960167

Files

Original bundle

Name: 2024.lrec-main.316.pdf
Size: 678.58 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed to upon submission

Collections

Conference papers (NTIS)