LLM agents increasingly present as conversational collaborators, yet human--agent teamwork remains brittle due to information asymmetry: users lack task-specific reliability cues, and agents rarely surface calibrated uncertainty or rationale. We propose a task-aware collaboration signaling layer that turns offline preference evaluations into online, user-facing primitives for delegation. Using Chatbot Arena pairwise comparisons, we induce an interpretable task taxonomy via semantic clustering, then derive (i) Capability Profiles as task-conditioned win-rate maps and (ii) Coordination-Risk Cues as task-conditioned disagreement (tie-rate) priors. These signals drive a closed-loop delegation protocol that supports common-ground verification, adaptive routing (primary vs.\ primary+auditor), explicit rationale disclosure, and privacy-preserving accountability logs. Two predictive probes validate that task typing carries actionable structure: cluster features improve winner prediction accuracy and reduce difficulty prediction error under stratified 5-fold cross-validation. Overall, our framework reframes delegation from an opaque system default into a visible, negotiable, and auditable collaborative decision, providing a principled design space for adaptive human--agent collaboration grounded in mutual awareness and shared accountability.
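To make the two signals concrete, the following is a minimal sketch, under stated assumptions, of how task-conditioned win rates and tie rates could be tabulated from pairwise comparisons and used for primary vs. primary+auditor routing. The record layout, the toy data, the convention of counting a tie as half a win, the threshold value, and the choice of the second-ranked model as auditor are all illustrative assumptions, not details taken from the framework itself.

```python
from collections import defaultdict

# Hypothetical records: (task_cluster, model_a, model_b, outcome),
# with outcome in {"a", "b", "tie"}. Toy data, not real Chatbot Arena rows.
battles = [
    ("coding", "m1", "m2", "a"),
    ("coding", "m1", "m2", "a"),
    ("coding", "m2", "m1", "tie"),
    ("coding", "m1", "m2", "b"),
    ("writing", "m1", "m2", "tie"),
    ("writing", "m2", "m1", "tie"),
    ("writing", "m1", "m2", "b"),
    ("writing", "m2", "m1", "a"),
]


def capability_profiles(records):
    """Per-(cluster, model) win rate; a tie counts as half a win (assumption)."""
    score = defaultdict(float)
    games = defaultdict(int)
    for cluster, model_a, model_b, outcome in records:
        games[(cluster, model_a)] += 1
        games[(cluster, model_b)] += 1
        if outcome == "tie":
            score[(cluster, model_a)] += 0.5
            score[(cluster, model_b)] += 0.5
        else:
            winner = model_a if outcome == "a" else model_b
            score[(cluster, winner)] += 1.0
    return {key: score[key] / n for key, n in games.items()}


def tie_rates(records):
    """Per-cluster fraction of comparisons that ended in a tie."""
    total = defaultdict(int)
    ties = defaultdict(int)
    for cluster, _, _, outcome in records:
        total[cluster] += 1
        if outcome == "tie":
            ties[cluster] += 1
    return {cluster: ties[cluster] / n for cluster, n in total.items()}


def route(cluster, profiles, ties, tie_threshold=0.4):
    """Return (primary, auditor); auditor is None when the tie rate is low.

    tie_threshold is a hypothetical tuning knob, not a reported value.
    """
    candidates = {m: wr for (c, m), wr in profiles.items() if c == cluster}
    ranked = sorted(candidates, key=candidates.get, reverse=True)
    primary = ranked[0]
    if ties[cluster] > tie_threshold and len(ranked) > 1:
        return primary, ranked[1]  # high disagreement: add an auditor
    return primary, None


profiles = capability_profiles(battles)
ties = tie_rates(battles)
print(route("coding", profiles, ties))
print(route("writing", profiles, ties))
```

In this toy data the "writing" cluster has a tie rate of 0.5, which exceeds the assumed threshold, so the sketch pairs the top-ranked model with an auditor; "coding" has a tie rate of 0.25 and is routed to a single primary model.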