Abstract:
In schools and universities, learners usually complete more tasks than teachers can correct. Where humans reach their limits, artificial intelligence (AI) can step in and assist with routine tasks or even take them over completely. This article examines the extent to which an AI can correctly evaluate school assignments in chemistry. To this end, the assessment quality of humans and machines is compared across several classic curricular subject areas. Furthermore, it is shown that performance can be significantly improved through prompts and targeted fine-tuning of the AI model. When trained with 130 data sets per question, the AI models achieve assessment quality almost comparable to that of human assessors, although the quality differs depending on the requirement area.