Leveraging LLMs For Turkish Skill Extraction

Skill extraction is a critical component of modern recruitment systems, enabling efficient job matching, personalized recommendations, and labor market analysis. Despite Türkiye's significant role in the global workforce, Turkish, a morphologically complex language, lacks both a skill taxonomy and a dedicated skill extraction dataset, so skill extraction for Turkish remains underexplored. This article addresses three research questions: 1) How can skill extraction be performed effectively for Turkish, given its low-resource nature? 2) What is the most promising model? 3) What is the impact of different Large Language Models (LLMs) and prompting strategies on skill extraction (i.e., dynamic vs. static few-shot samples, varying context information, and encouraging causal reasoning)? The article introduces the first Turkish skill extraction dataset and evaluates automated skill extraction using LLMs. The manually annotated dataset contains 4,819 labeled skill spans from 327 job postings across different occupation areas. In an end-to-end pipeline, LLMs outperform supervised sequence labeling, aligning extracted spans with standardized skills in the ESCO taxonomy more effectively. The best-performing configuration uses Claude Sonnet 3.7 with dynamic few-shot prompting for skill identification, followed by embedding-based retrieval and LLM-based reranking for skill linking. It achieves an end-to-end performance of 0.56, placing Turkish alongside the few similar studies available for other languages. Our findings suggest that LLMs can improve skill extraction performance in low-resource settings, and we hope that our work will accelerate similar research on skill extraction for underrepresented languages.
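As a rough illustration of two steps named above, dynamic few-shot prompting and retrieval-based skill linking, the following minimal sketch may help. It is not the article's implementation: character trigram counts stand in for a real embedding model, the example sentences and skill labels are invented English placeholders rather than entries from the Turkish dataset or verified ESCO labels, and the function names `char_ngrams`, `cosine`, `top_k`, and `build_prompt` are hypothetical. The LLM calls themselves (span extraction and reranking) are omitted.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Bag of character n-grams: a toy stand-in for a sentence embedding (assumption)."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, candidates, k):
    """Return the k candidates most similar to the query, best first."""
    q = char_ngrams(query)
    ranked = sorted(candidates, key=lambda c: cosine(q, char_ngrams(c)), reverse=True)
    return ranked[:k]


def build_prompt(sentence, annotated_pool, k=2):
    """Dynamic few-shot: choose the k annotated sentences closest to the input,
    instead of reusing one fixed (static) set of examples for every input."""
    shots = top_k(sentence, list(annotated_pool), k)
    parts = ["Extract the skill spans from the job posting sentence."]
    for shot in shots:
        parts.append(f"Sentence: {shot}\nSkills: {', '.join(annotated_pool[shot])}")
    parts.append(f"Sentence: {sentence}\nSkills:")
    return "\n\n".join(parts)


# Invented placeholder data (not from the article's dataset).
POOL = {
    "Strong knowledge of Python and SQL.": ["Python", "SQL"],
    "Ability to manage a team of five.": ["manage a team"],
    "Good communication with customers.": ["communication with customers"],
}

# Illustrative taxonomy labels; not verified against ESCO.
LABELS = [
    "Python (computer programming)",
    "SQL",
    "manage a team",
    "communicate with customers",
]

if __name__ == "__main__":
    print(build_prompt("Experience with Python and SQL is required.", POOL, k=1))
    # Skill linking: retrieve candidate labels for an extracted span;
    # an LLM would then rerank these candidates (call omitted here).
    print(top_k("python programming", LABELS, k=2))
```

In a full pipeline, the prompt would be sent to the LLM to obtain skill spans, each span would be matched against taxonomy labels with a real embedding model, and the LLM would rerank the retrieved candidates to select the final standardized skill.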
