Comparing Human and AI-based Essay Evaluation in the Czech Higher Education: Challenges and Limitations

dc.contributor.author: Kincl, Tomáš
dc.contributor.author: Gunina, Daria
dc.contributor.author: Novák, Michal
dc.contributor.author: Pospíšil, Jan
dc.date.accessioned: 2025-01-29T12:27:06Z
dc.date.available: 2025-01-29T12:27:06Z
dc.date.issued: 2024
dc.description.abstract-translated: Generative artificial intelligence (GenAI) tools offer new capabilities for a wide array of tasks involving extensive datasets, both textual and non-textual. They have shown considerable potential in education, where they are increasingly used not only by students but also by educators. This study investigates the extent to which a human evaluator's assessments align with automated evaluations produced by large language models, focusing on (a) the complexity of the evaluated texts (academic essays comprising literature reviews, critical assessments of sources, and reflective insights in the context of societal or economic practice) and (b) the particular challenges posed by Czech, the language in which the essays were submitted. The research adopts a quantitative, cross-sectional approach, analysing 30 essays submitted as an assignment for a foundational theoretical course at the master's level. The essays were evaluated by a human evaluator and subsequently by virtual assistants built on large language models, specifically ChatGPT (paid version 4.0) and Claude (paid version Sonnet 3.5). Statistical analysis revealed a statistically significant difference between the human evaluator's assessments and those of both automated systems. Moreover, the evaluations were inconsistent in distinguishing stronger essays from weaker ones. We also discuss the challenges and limitations of using GenAI tools to evaluate submitted text assignments in tertiary education. [en]
dc.description.sponsorship: This study was conducted as part of International Visegrad Fund project no. 22410207: Innovation of the education process towards implementing AI tools [en]
dc.format: 10 s. [cs]
dc.format.mimetype: application/pdf
dc.identifier.doi: https://doi.org/10.24132/jbt.2024.14.2.25_34
dc.identifier.issn: 2788-0079
dc.identifier.uri: http://hdl.handle.net/11025/58136
dc.language.iso: en [en]
dc.publisher: Západočeská univerzita v Plzni [cs]
dc.rights: © Západočeská univerzita v Plzni [cs]
dc.rights.access: openAccess [en]
dc.subject: automatické vyhodnocování esejí [cs]
dc.subject: generativní AI [cs]
dc.subject: ChatGPT [cs]
dc.subject: terciární vzdělání [cs]
dc.subject.translated: automated essay evaluation [en]
dc.subject.translated: generative AI [en]
dc.subject.translated: ChatGPT [en]
dc.subject.translated: tertiary education [en]
dc.title: Comparing Human and AI-based Essay Evaluation in the Czech Higher Education: Challenges and Limitations [en]
dc.type: článek [cs]
dc.type: article [en]
dc.type.status: Peer-reviewed [en]
dc.type.version: publishedVersion [en]
local.files.count: 1 [*]
local.files.size: 287196 [*]
local.has.files: yes [*]

Files

Original bundle
Name: 4_Kincl_Gunina_Novak_Pospisil.pdf
Size: 280.46 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
Description: OPEN License Selector