In this work we show that language models with fewer than one billion parameters can be fine-tuned to translate natural language into SPARQL queries. Using three datasets ranging from academic to real-world, we identify the prerequisites that training data must fulfill for fine-tuning to succeed. Our goal is to enable users of semantic web technology to run AI assistance on affordable commodity hardware, making them more resilient against external factors.