Unified Work Embeddings: Contrastive Learning of a Bidirectional Multi-task Ranker

Applications in labor market intelligence demand specialized NLP systems for a wide range of tasks, characterized by extreme multi-label target spaces, strict latency constraints, and multiple text modalities such as skills and job titles. These constraints have led to isolated, task-specific developments in the field, with models and benchmarks focused on single prediction tasks. Exploiting the shared structure of work-related data, we propose a unifying framework with two parts: a multi-task ranking benchmark that combines a wide range of tasks, and a flexible architecture that tackles text-driven work tasks with a single model. The benchmark, WorkBench, is the first unified evaluation suite spanning six work-related tasks formulated explicitly as ranking problems, curated from real-world ontologies and human-annotated resources. WorkBench enables cross-task analysis, in which we find significant positive cross-task transfer. This insight leads to Unified Work Embeddings (UWE), a task-agnostic bi-encoder that exploits our training-data structure with a many-to-many InfoNCE objective and leverages token-level embeddings with task-agnostic soft late interaction. UWE demonstrates zero-shot ranking performance on unseen target spaces in the work domain and enables low-latency inference with two orders of magnitude fewer parameters than the best-performing generalist model (Qwen3-8B), while improving MAP by 4.4.
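To make the many-to-many InfoNCE idea concrete, the following is a minimal sketch of one common multi-positive variant, in which each query may have several relevant targets in a shared candidate pool. The abstract does not give the exact formulation, so the function name `multi_positive_info_nce`, the `temperature` value, and the averaging over positives are all assumptions for illustration, not the authors' objective. Only one direction (query to target) is shown.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm so dot products become cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def multi_positive_info_nce(queries, targets, positives, temperature=0.05):
    """Hypothetical multi-positive InfoNCE loss, averaged over queries.

    queries:   list of query embedding vectors
    targets:   list of target embedding vectors (shared candidate pool)
    positives: positives[i] is the set of target indices relevant to query i
    """
    q = [normalize(v) for v in queries]
    t = [normalize(v) for v in targets]
    total = 0.0
    for i, qi in enumerate(q):
        # Temperature-scaled cosine similarity against every candidate target.
        logits = [dot(qi, tj) / temperature for tj in t]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        # Assumption: average the negative log-likelihood over this query's positives.
        pos = positives[i]
        total += -sum(logits[j] - log_z for j in pos) / len(pos)
    return total / len(q)


# Toy usage: two queries, three targets, where target 2 is relevant to both queries.
queries = [[1.0, 0.0], [0.0, 1.0]]
targets = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
loss = multi_positive_info_nce(queries, targets, [{0, 2}, {1, 2}])
```

Because a target can appear in the positive set of several queries and a query can have several positives, the labels form a many-to-many relation rather than the one-to-one pairing of standard in-batch InfoNCE. A bidirectional version would add the same loss with the roles of queries and targets swapped.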
