Adapting LLMs for the Medical Domain in Portuguese: A Study on Fine-Tuning and Model Evaluation

dc.contributor.author: Paiola, Paiola
dc.contributor.author: Garcia, Gabriel Lino
dc.contributor.author: Manesco, João Renato Ribeiro
dc.contributor.author: Roder, Mateus
dc.contributor.author: Rodrigues, Douglas
dc.contributor.author: Papa, João Paulo
dc.contributor.editor: Skala, Václav
dc.date.accessioned: 2025-07-30T10:46:47Z
dc.date.available: 2025-07-30T10:46:47Z
dc.date.issued: 2025
dc.description.abstract-translated: This study evaluates the performance of large language models (LLMs) as medical agents in Portuguese, aiming to develop a reliable and relevant virtual assistant for healthcare professionals. The HealthCareMagic-100k-en and MedQuAD datasets, translated from English using GPT-3.5, were used to fine-tune the ChatBode-7B model using the PEFT-QLoRA method. The InternLM2 model, with initial training on medical data, presented the best overall performance, with high precision and adequacy in metrics such as accuracy, completeness, and safety. However, DrBode models, derived from ChatBode, exhibited a phenomenon of catastrophic forgetting of acquired medical knowledge. Despite this, these models performed comparably or even better in grammaticality and coherence. A significant challenge was low inter-rater agreement, highlighting the need for more robust assessment protocols. This work paves the way for future research, such as evaluating multilingual models specific to the medical field, improving the quality of training data, and developing more consistent evaluation methodologies for the medical field. [en]
dc.format: 4 s. [cs]
dc.format.mimetype: application/pdf
dc.identifier.doi: http://www.doi.org/10.24132/CSRN.2025-37
dc.identifier.issn: 2464-4617 (Print)
dc.identifier.issn: 2464-4625 (online)
dc.identifier.uri: http://hdl.handle.net/11025/62246
dc.language.iso: en [en]
dc.publisher: Vaclav Skala - UNION Agency [en]
dc.rights: © Vaclav Skala - UNION Agency [en]
dc.rights.access: openAccess [en]
dc.subject: rozsáhlé jazykové modely [cs]
dc.subject: jemné doladění [cs]
dc.subject: virtuální lékařský asistent [cs]
dc.subject: brazilská portugalština [cs]
dc.subject.translated: large language models [en]
dc.subject.translated: fine-tuning [en]
dc.subject.translated: virtual medical assistant [en]
dc.subject.translated: Brazilian Portuguese [en]
dc.title: Adapting LLMs for the Medical Domain in Portuguese: A Study on Fine-Tuning and Model Evaluation [en]
dc.type: konferenční příspěvek [cs]
dc.type: conferenceObject [en]
dc.type.status: Peer reviewed [en]
dc.type.version: publishedVersion [en]
local.files.count: 1 [*]
local.files.size: 747911 [*]
local.has.files: yes [*]

Files

Original bundle
Name: A13.pdf
Size: 730.38 KB
Format: Adobe Portable Document Format