Large language model (LLM) agents increasingly rely on reusable external skills to solve long-horizon interactive tasks. Existing training-free skill adaptation pipelines usually update skills from full trajectories or session-level feedback, which makes failure attribution coarse and often produces unstable or overly broad revisions. We propose SkillAdaptor, a training-free, step-level skill adaptation framework with explicit failure attribution that plugs into OpenClaw-class agent harnesses. Given a failed trajectory, SkillAdaptor identifies the first actionable fault step, attributes responsibility to candidate skills, and applies targeted updates under explicit acceptance checks while keeping the backbone frozen. We evaluate on WebShop, PinchBench, and Claw-Eval with Kimi-K2.5, GLM-5, and GPT-5.2. SkillAdaptor improves over both no-skill and skill-adaptation baselines on all three suites, with the largest single-metric gains being +1.5 points on PinchBench Avg Score%, +1.8 points on Claw-Eval Avg Score, and +1.7 points on WebShop success rate. These results indicate that step-level attribution supports more stable and auditable training-free skill maintenance.\footnote{The code will be released at https://github.com/zjunlp/SkillAdaptor.}
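The three stages named in the abstract (locating the first actionable fault step, attributing it to a skill, and applying an update only if it passes an acceptance check) can be illustrated with a minimal sketch. This is not the released SkillAdaptor implementation: every name here (`Step`, `Skill`, `first_fault_step`, `propose_update`, `accept`, `adapt`) is hypothetical, the per-step `ok` flag and the reading of "actionable" as "tied to a skill" are assumptions, and the revision and acceptance rules are toy stand-ins for whatever the frozen backbone and the actual checks would do.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Step:
    index: int
    skill: Optional[str]  # name of the skill invoked at this step, if any
    ok: bool              # hypothetical per-step success flag
    note: str             # short description of what went wrong


@dataclass(frozen=True)
class Skill:
    name: str
    text: str


def first_fault_step(trajectory):
    """Return the earliest failing step that is tied to a skill, else None."""
    for step in trajectory:
        if not step.ok and step.skill is not None:
            return step
    return None


def propose_update(skill, step):
    """Stand-in for the revision proposal; here it just appends a caveat."""
    return replace(skill, text=f"{skill.text}\n- Avoid: {step.note}")


def accept(old, new, max_growth=200):
    """Toy acceptance check: the revision must change the skill but stay small."""
    grew = len(new.text) - len(old.text)
    return new.text != old.text and 0 < grew <= max_growth


def adapt(trajectory, skills):
    """Apply at most one targeted skill update for a failed trajectory."""
    step = first_fault_step(trajectory)
    if step is None or step.skill not in skills:
        return skills  # nothing attributable, leave the skill set untouched
    old = skills[step.skill]
    new = propose_update(old, step)
    if not accept(old, new):
        return skills  # rejected revisions are discarded
    updated = dict(skills)  # copy so the original skill set is not mutated
    updated[old.name] = new
    return updated
```

In this sketch only the skill attributed to the earliest fault is revised, and later failures in the same trajectory are ignored, which is one way to keep revisions narrow rather than rewriting every skill the trajectory touched.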