We present a three-stage system for SemEval 2026 Task~12: Abductive Event Reasoning that combines graph-based retrieval, LLM-driven abductive reasoning with prompts optimized through reflective prompt evolution, and post-hoc consistency enforcement. The system ranks first on the evaluation-phase leaderboard with an accuracy of 0.95. Cross-model error analysis across 14 models (7~families) reveals three shared inductive biases: causal chain incompleteness, proximate cause preference, and salience bias. Their convergence across model families (51\% cause-count reduction) indicates systematic rather than model-specific failure modes in multi-label causal reasoning.