Enhancing Masked Language Modeling in BERT Models Using Pretrained Static Embeddings

Mištera, Adam

dc.contributor.author	Mištera, Adam
dc.contributor.author	Král, Pavel
dc.date.accessioned	2026-04-24T18:06:04Z
dc.date.available	2026-04-24T18:06:04Z
dc.date.issued	2026
dc.date.updated	2026-04-24T18:06:04Z
dc.description.abstract	This paper explores integrating pretrained static fastText word vectors into a simplified Transformer-based model to improve its efficiency and accuracy. Although these embeddings have been outperformed by large Transformer-based models, they can still contribute useful linguistic information when combined with contextual models, especially in low-resource or computationally constrained environments. We demonstrate this by incorporating static embeddings directly into our own models based on BERT-Tiny before pretraining them with masked language modeling. We train the models on seven languages covering three distinct language families. The results show that using static fastText embeddings in these models not only improves convergence for all tested languages but also significantly improves their evaluation accuracy.	en
dc.description.abstract	Tento článek zkoumá integraci předem natrénovaných statických slovních vektorů fastText do zjednodušeného modelu založeného na transformátoru s cílem zlepšit jeho účinnost a přesnost. Přestože tyto vnoření slov byly překonány velkými modely založenými na architektuře transformátoru, mohou v kombinaci s kontextovými modely stále přispívat užitečnými lingvistickými informacemi, zejména v prostředích s omezenými zdroji nebo výpočetními možnostmi. To demonstrujeme začleněním statických vnoření slov přímo do našich vlastních modelů založených na BERTTINY před jejich předtrénováním pomocí maskovaného jazykového modelování. V tomto článku trénujeme modely na sedmi různých jazycích pokrývajících tři odlišné jazykové rodiny. Výsledky ukazují, že použití statických fastText reprezentací v těchto modelech nejen zlepšuje konvergenci pro všechny testované jazyky, ale také významně zlepšuje jejich přesnost.	cz
dc.format	12
dc.identifier.document-number	001576349100018
dc.identifier.doi	10.1007/978-3-032-02551-7_19
dc.identifier.isbn	978-3-032-02550-0
dc.identifier.issn	0302-9743
dc.identifier.obd	43948239
dc.identifier.orcid	Mištera, Adam 0009-0000-1019-9218
dc.identifier.orcid	Král, Pavel 0000-0002-3096-675X
dc.identifier.uri	http://hdl.handle.net/11025/67839
dc.language.iso	en
dc.project.ID	SGS-2025-022
dc.publisher	Springer
dc.relation.ispartofseries	28th International Conference on Text, Speech, and Dialogue, TSD 2025
dc.subject	transformers	en
dc.subject	embeddings	en
dc.subject	pretraining	en
dc.subject	transformátory	cz
dc.subject	vnoření slov	cz
dc.subject	předtrénování	cz
dc.title	Enhancing Masked Language Modeling in BERT Models Using Pretrained Static Embeddings	en
dc.title	Vylepšení maskovaného jazykového modelování v modelech BERT pomocí předtrénovaných statických vnoření slov	cz
dc.type	Stať ve sborníku (D)
dc.type	STAŤ VE SBORNÍKU
dc.type.status	Published Version
local.files.count	1	*
local.files.size	469104	*
local.has.files	yes	*
local.identifier.eid	2-s2.0-105014393405

Files

Original bundle

Name: Mištera, Král Enhancing_Masked_Language_Modeling_in_BERT_Models_Using_Pretrained_Static_Embeddings.pdf
Size: 458.11 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission

Collections

Conference Papers (KIV)