The popularity of knowledge graphs has grown rapidly in recent years. This knowledge is available to query through many online databases. However, it would be a great achievement if non-programmers could access whatever information they want. Considerable effort has gone into this task, both through natural language processing tools and through the many challenges that encourage creative solutions. Our approach assumes correct entity linking on the natural language questions and trains a GPT model to generate SPARQL queries from them. We isolated which property of the task is the most difficult to solve in a few-shot or zero-shot setting, and we propose pre-training on all entities (under the closed-world assumption, CWA) to improve performance. We obtained 62.703% exact-match accuracy on SPARQL queries at test time in the 3-shot setting, an F1 of 0.809 on the entity linking challenge, and an F1 of 0.009 on the question answering challenge.
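To make the two kinds of reported metrics concrete, the following is a minimal sketch of how exact-match accuracy over generated SPARQL strings and a set-based F1 could be computed. It is not the authors' evaluation code: the function names, the whitespace normalization step, and the `ex:` placeholder identifiers are all assumptions introduced for illustration.

```python
import re


def normalize_sparql(query: str) -> str:
    """Collapse runs of whitespace (an assumed normalization, not from the paper)."""
    return re.sub(r"\s+", " ", query).strip()


def exact_match_accuracy(predicted: list[str], gold: list[str]) -> float:
    """Fraction of predicted queries that equal the gold query after normalization."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    matches = sum(
        normalize_sparql(p) == normalize_sparql(g) for p, g in zip(predicted, gold)
    )
    return matches / len(gold)


def set_f1(predicted: set[str], gold: set[str]) -> float:
    """F1 between a predicted set and a gold set (e.g. linked entities or answers)."""
    if not predicted and not gold:
        return 1.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical queries; ex:e1, ex:p1, etc. are placeholder identifiers.
predicted_queries = [
    "SELECT ?x WHERE { ex:e1 ex:p1 ?x }",
    "SELECT ?x WHERE { ex:e2 ex:p2 ?x }",
]
gold_queries = [
    "SELECT ?x  WHERE {  ex:e1 ex:p1 ?x }",  # differs only in spacing
    "SELECT ?x WHERE { ex:e2 ex:p3 ?x }",    # differs in the property
]
accuracy = exact_match_accuracy(predicted_queries, gold_queries)
```

Exact match is a strict criterion: a query that returns the right answers but orders its triple patterns differently still counts as a miss under this sketch.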