On self-supervision in historical handwritten document segmentation

dc.contributor.authorBaloun, Josef
dc.contributor.authorPrantl, Martin
dc.contributor.authorLenc, Ladislav
dc.contributor.authorMartínek, Jiří
dc.contributor.authorKrál, Pavel
dc.date.accessioned2026-03-31T18:05:25Z
dc.date.available2026-03-31T18:05:25Z
dc.date.issued2025
dc.date.updated2026-03-31T18:05:25Z
dc.description.abstractHistorical document analysis plays a crucial role in understanding and preserving our past. However, this task is often hindered by challenges such as limited annotated training data and the diverse nature of historical handwritten documents. In this paper, we explore the potential of self-supervised learning (SSL) in historical document analysis, with a particular focus on historical handwritten document segmentation, to overcome the need for extensive annotated data while enhancing efficiency and robustness. We present an overview of SSL methods suitable for historical document analysis and discuss their potential applications and benefits. Furthermore, we present an approach for SSL in the document domain, considering various setups, augmentations, and resolutions. We also provide experimental results that demonstrate its feasibility and effectiveness. Our findings indicate that most document segmentation tasks can be effectively addressed using SSL features, highlighting the potential of SSL to advance historical document analysis and pave the way for more efficient and robust document processing workflows.en
dc.format16
dc.identifier.document-number001520730500001
dc.identifier.doi10.1007/s10032-025-00538-6
dc.identifier.issn1433-2833
dc.identifier.obd43946890
dc.identifier.orcidBaloun, Josef 0000-0003-1923-5355
dc.identifier.orcidPrantl, Martin 0000-0002-7900-5028
dc.identifier.orcidLenc, Ladislav 0000-0002-1066-7269
dc.identifier.orcidMartínek, Jiří 0000-0003-2981-1723
dc.identifier.orcidKrál, Pavel 0000-0002-3096-675X
dc.identifier.urihttp://hdl.handle.net/11025/67479
dc.language.isoen
dc.project.IDEH23_021/0008436
dc.relation.ispartofseriesInternational Journal on Document Analysis and Recognition
dc.rights.accessA
dc.subjecthistorical handwritten documenten
dc.subjectself-supervised learningen
dc.subjectdocument digitizationen
dc.subjectsemantic segmentationen
dc.titleOn self-supervision in historical handwritten document segmentationen
dc.typeArticle in the WoS database (Jimp)
dc.typeARTICLE
dc.type.statusPublished Version
local.files.count1*
local.files.size13302218*
local.has.filesyes*
local.identifier.eid2-s2.0-105009863832

Files

Original bundle
Name:
s10032-025-00538-6.pdf
Size:
12.69 MB
Format:
Adobe Portable Document Format
License bundle
Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission