Classifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approach

dc.contributor.authorEhrmann, Maud
dc.contributor.authorKaše, Vojtěch
dc.contributor.authorKarsdorp, Folgert
dc.contributor.authorHeřmánková, Petra
dc.contributor.authorWevers, Melvin
dc.contributor.authorSobotková, Adéla
dc.contributor.authorAndrews, Tara Lee
dc.contributor.authorBurghardt, Manuel
dc.contributor.authorKestemont, Mike
dc.contributor.authorManjavacas, Enrique
dc.contributor.authorPiotrowski, Michael
dc.contributor.authorvan Zundert, Joris
dc.date.accessioned2022-02-14T11:00:15Z
dc.date.available2022-02-14T11:00:15Z
dc.date.issued2021
dc.description.abstract-translatedLarge-scale synthetic research in ancient history is often hindered by the incompatibility of taxonomies used by different digital datasets. Using the example of enriching the Latin Inscriptions from the Roman Empire dataset (LIRE), we demonstrate that machine-learning classification models can bridge the gap between two distinct classification systems and make comparative study possible. We report on the training, testing and application of a machine-learning classification model that uses inscription categories from the Epigraphic Database Heidelberg (EDH) to label inscriptions from the Epigraphic Database Claus-Slaby (EDCS). The model is trained on a labeled set of records included in both sources (N=46,171). Several different classification algorithms and parametrizations are explored. The final model is based on the Extremely Randomized Trees (ET) algorithm and employs 10,055 features, based on several attributes. The final model classifies two thirds of a test dataset with 98% accuracy and 85% of it with 95% accuracy. After model selection and evaluation, we apply the model to inscriptions covered exclusively by EDCS (N=83,482) in an attempt to adopt one consistent system of classification for all records within the LIRE dataset.en
dc.format13 s.cs
dc.format.mimetypeapplication/pdf
dc.identifier.citationKAŠE, V. HEŘMÁNKOVÁ, P. SOBOTKOVÁ, A. Classifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approach. In Ehrmann, M., Karsdorp, F., Wevers, M. Proceedings of the Conference on Computational Humanities Research 2021. Amsterdam: CEUR-WS, 2021. pp. 123-135. ISBN: not stated, ISSN: 1613-0073cs
dc.identifier.isbnnot stated
dc.identifier.issn1613-0073
dc.identifier.obd43933987
dc.identifier.urihttp://hdl.handle.net/11025/46904
dc.language.isoenen
dc.publisherCEUR-WSen
dc.relation.ispartofseriesProceedings of the Conference on Computational Humanities Research 2021en
dc.rights© authorsen
dc.rights.accessopenAccessen
dc.subject.translatedLatin inscriptionsen
dc.subject.translateddocument classificationen
dc.subject.translatedcomparative analysisen
dc.subject.translatedRoman Empireen
dc.titleClassifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approachen
dc.typekonferenční příspěvekcs
dc.typeConferenceObjecten
dc.type.statusPeer-revieweden
dc.type.versionpublishedVersionen
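
The abstract reports accuracy on growing shares of the test set (98% on two thirds, 95% on 85%). One common way to obtain figures of this kind is to keep only predictions whose confidence exceeds a threshold and to measure accuracy on the retained records. The record does not say that the authors used this procedure, so the sketch below is an assumption. The function name `coverage_and_accuracy`, the threshold values, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def coverage_and_accuracy(
    confidences: Sequence[float],
    predicted: Sequence[str],
    actual: Sequence[str],
    threshold: float,
) -> Tuple[float, float]:
    """Return (share of records kept, accuracy on the kept records).

    A record is kept when the classifier's confidence in its predicted
    label is at least `threshold`. This is an assumed procedure, not one
    confirmed by the source.
    """
    if not (len(confidences) == len(predicted) == len(actual)):
        raise ValueError("inputs must have equal length")
    kept = [
        (p, a)
        for c, p, a in zip(confidences, predicted, actual)
        if c >= threshold
    ]
    if not kept:
        # Nothing passes the threshold: coverage is zero, accuracy undefined.
        return 0.0, float("nan")
    coverage = len(kept) / len(confidences)
    accuracy = sum(p == a for p, a in kept) / len(kept)
    return coverage, accuracy


# Toy values invented for illustration only (not data from the paper).
confidences = [0.95, 0.90, 0.80, 0.55, 0.40, 0.30]
predicted = ["epitaph", "votive", "epitaph", "epitaph", "building", "epitaph"]
actual = ["epitaph", "votive", "epitaph", "votive", "building", "honorific"]

for t in (0.0, 0.5, 0.75):
    cov, acc = coverage_and_accuracy(confidences, predicted, actual, t)
    print(f"threshold={t:.2f} coverage={cov:.2f} accuracy={acc:.2f}")
```

Raising the threshold trades coverage for accuracy: fewer records receive a label, but the labels that remain are more reliable. That is the shape of the trade-off the abstract's two figures describe.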