Finding the most relevant person for a job proposal in real time is challenging, especially when resumes are long, structured, and multilingual. In this paper, we propose a re-ranking model based on a new-generation late cross-attention architecture that decomposes both resumes and project briefs to handle long-context inputs efficiently, with minimal computational overhead. To mitigate biases in historical data, we use a generative large language model (LLM) as a teacher that produces fine-grained, semantically grounded supervision. This signal is distilled into our student model via an enriched distillation loss function. The resulting model produces skill-fit scores that enable consistent and interpretable person-job matching. Experiments on relevance, ranking, and calibration metrics show that our approach outperforms state-of-the-art baselines.