Acta Informatica Pragensia - Forthcoming articles
SKR1: Benchmark for Testing Knowledge About Slovak Realia for Large Language Models
Marek Dobeš
Acta Informatica Pragensia X:X | DOI: 10.18267/j.aip.30066 
Background: To objectively evaluate the capabilities of large language models (LLMs), we need to develop tools that enable such assessment. While numerous benchmarks exist, the vast majority are in English and focus on general knowledge, often overlooking the cultural and factual specifics of smaller countries.
Objective: Currently, there is no benchmark that tests LLMs' knowledge of Slovak realia. At the same time, LLM performance in this domain remains inadequate. To objectively measure and compare these capabilities, our goal is to develop and validate a specialized benchmark for assessing LLMs' knowledge of Slovak cultural and factual...
