Political speakers often avoid answering questions directly while maintaining the appearance of responsiveness. Despite its importance for public discourse, such strategic evasion remains underexplored in Natural Language Processing. We introduce SemEval-2026 Task 6, CLARITY, a shared task on political question evasion consisting of two subtasks: (i) clarity-level classification into Clear Reply, Ambivalent, and Clear Non-Reply, and (ii) evasion-level classification into nine fine-grained evasion strategies. The benchmark is constructed from U.S. presidential interviews and follows an expert-grounded taxonomy of response clarity and evasion. The task attracted 124 registered teams, who submitted 946 valid runs for clarity-level classification and 539 for evasion-level classification. Results show a substantial gap in difficulty between the two subtasks: the best system achieved 0.89 macro-F1 on clarity classification, surpassing the strongest baseline by a large margin, while the top evasion-level system reached 0.68 macro-F1, matching the best baseline. Overall, large language model prompting and hierarchical exploitation of the taxonomy emerged as the most effective strategies, with top systems consistently outperforming those that treated the two subtasks independently. CLARITY establishes political response evasion as a challenging benchmark for computational discourse analysis and highlights the difficulty of modeling strategic ambiguity in political language.