Finding the most relevant person for a job proposal in real time is challenging, especially when resumes are long, structured, and multilingual. In this paper, we propose a re-ranking model based on a new generation of late cross-attention architecture that decomposes both resumes and project briefs, so that long-context inputs are handled efficiently with minimal computational overhead. To mitigate biases in historical data, we use a generative large language model (LLM) as a teacher that produces fine-grained, semantically grounded supervision. This signal is distilled into our student model through an enriched distillation loss function. The resulting model produces skill-fit scores that enable consistent and interpretable person-job matching. Experiments on relevance, ranking, and calibration metrics show that our approach outperforms state-of-the-art baselines.
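As an illustration of how a teacher signal might be distilled into a student scorer, the following is a minimal sketch and not the loss used in this work: the abstract does not specify the form of the enriched distillation loss. The function name `distillation_loss`, the weight `alpha`, and the choice of a pointwise mean-squared-error term combined with a pairwise logistic ranking term are all assumptions made for illustration.

```python
import math


def distillation_loss(student_scores, teacher_scores, alpha=0.5):
    """Hypothetical distillation loss (an assumption, not the paper's definition).

    Combines a pointwise term (MSE between student and teacher scores)
    with a pairwise term that penalizes the student for ordering two
    candidates differently from the teacher.
    """
    n = len(student_scores)
    if n == 0 or n != len(teacher_scores):
        raise ValueError("score lists must be non-empty and of equal length")

    # Pointwise term: match the teacher's skill-fit scores directly.
    mse = sum((s - t) ** 2 for s, t in zip(student_scores, teacher_scores)) / n

    # Pairwise term: for every pair the teacher ranks i above j,
    # apply a logistic loss on the student's score margin.
    pair_losses = []
    for i in range(n):
        for j in range(n):
            if teacher_scores[i] > teacher_scores[j]:
                margin = student_scores[i] - student_scores[j]
                pair_losses.append(math.log1p(math.exp(-margin)))
    rank = sum(pair_losses) / len(pair_losses) if pair_losses else 0.0

    return alpha * mse + (1 - alpha) * rank


# Teacher scores for two candidate resumes against one project brief.
teacher = [0.9, 0.1]
aligned = distillation_loss([0.9, 0.1], teacher)   # student agrees with teacher
reversed_ = distillation_loss([0.1, 0.9], teacher)  # student inverts the order
```

Under these assumptions, a student that reproduces the teacher's ordering receives a lower loss than one that inverts it, which is the behavior a ranking-aware distillation objective is meant to encourage.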