A reliable resume-job matching system helps a company find suitable candidates from a pool of resumes, and helps a job seeker find relevant jobs from a list of job posts. However, because job seekers apply to only a few jobs, interaction records in resume-job datasets are sparse. Unlike many prior works that rely on complex modeling techniques, we tackle this sparsity problem with data augmentation and a simple contrastive learning approach. ConFit first creates an augmented resume-job dataset by paraphrasing specific sections of a resume or a job post. Then, ConFit uses contrastive learning to further increase the number of training samples from $B$ pairs per batch to $O(B^2)$ per batch. We evaluate ConFit on two real-world datasets and find that it outperforms prior methods (including BM25 and OpenAI text-ada-002) by up to 19% and 31% absolute in nDCG@10 for ranking jobs and ranking resumes, respectively.
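The step from $B$ pairs to $O(B^2)$ training samples per batch can be illustrated with a generic in-batch contrastive (InfoNCE-style) loss. This is a minimal sketch under stated assumptions, not ConFit's actual implementation: the abstract does not specify the encoder, the exact loss, the temperature, or how negatives are chosen. The function names, the dot-product similarity, and the `temperature` parameter are hypothetical, and the sketch assumes every non-matching resume-job pair in the batch can be treated as a negative.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def in_batch_contrastive_loss(resume_vecs, job_vecs, temperature=1.0):
    """Mean cross-entropy over the rows of a B x B similarity matrix.

    Assumption: resume_vecs[i] and job_vecs[i] form a matched pair, and
    the other B - 1 jobs in the batch act as negatives for resume i.
    """
    batch_size = len(resume_vecs)
    total = 0.0
    for i in range(batch_size):
        # Score resume i against every job in the batch: B scores per row.
        logits = [dot(resume_vecs[i], job_vecs[j]) / temperature
                  for j in range(batch_size)]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of the matched job (the diagonal entry).
        total += log_denom - logits[i]
    return total / batch_size


def count_scored_pairs(batch_size):
    """Number of resume-job pairs scored per batch: B positives plus
    B * (B - 1) in-batch negatives, i.e. B * B in total."""
    return batch_size * batch_size


# Toy embeddings standing in for encoder outputs (hypothetical values).
resumes = [[1.0, 0.0], [0.0, 1.0]]
jobs_aligned = [[1.0, 0.0], [0.0, 1.0]]
jobs_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = in_batch_contrastive_loss(resumes, jobs_aligned)
loss_swapped = in_batch_contrastive_loss(resumes, jobs_swapped)
```

In this sketch a batch of $B$ labeled pairs yields $B^2$ scored resume-job combinations, and the loss is lower when matched pairs are more similar to each other than to the other items in the batch. A symmetric variant would also average the same loss computed over the columns (ranking resumes for each job).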