Person-job fit is an essential component of online recruitment platforms, serving downstream applications such as Job Search and Candidate Recommendation. Recently, pretrained large language models have improved its effectiveness by leveraging the richer textual information in user profiles and job descriptions, in addition to user behavior features and job metadata. However, their general-domain design struggles to capture the structural information unique to user profiles and job descriptions, leading to a loss of latent semantic correlations. We propose TAROT, a hierarchical multitask co-pretraining framework that better exploits structural and semantic information to produce informative text embeddings. TAROT targets the semi-structured text in profiles and jobs, and it is co-pretrained with multi-grained pretraining tasks that constrain the semantic information acquired at each level. Experiments on a real-world LinkedIn dataset show significant performance improvements, demonstrating its effectiveness on person-job fit tasks.