Sexist content online increasingly appears in subtle, context-dependent forms that evade traditional detection methods. Its interpretation often depends on overlapping linguistic, psychological, legal, and cultural dimensions, which produce mixed and sometimes contradictory signals even in annotated datasets. Combined with label scarcity and class imbalance, these inconsistencies yield unstable decision boundaries and cause fine-tuned models to overlook subtler, underrepresented forms of harm. These limitations call for a design that explicitly addresses the combined effects of underrepresentation, noise, and conceptual ambiguity in both data and model predictions. We propose a two-stage framework that unifies (i) targeted training procedures that adapt supervision to scarce and noisy data with (ii) selective, reasoning-based inference for ambiguous or borderline cases. Our training setup applies class-balanced focal loss, class-aware batching, and post-hoc threshold calibration to mitigate label imbalance and noisy supervision. At inference time, a dynamic routing mechanism classifies high-confidence cases directly and escalates uncertain instances to a novel \textit{Collaborative Expert Judgment} (CEJ) module, which prompts multiple personas and consolidates their reasoning through a judge model. Our approach achieves state-of-the-art results across several benchmarks, with F1 gains of +4.48% and +1.30% on EDOS Tasks A and B, respectively, and a +2.79% improvement in ICM on EXIST 2025 Task 1.1.
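To make the two stages concrete, the following is a minimal sketch rather than the implementation used in this work. It assumes the common effective-number formulation of class-balanced weights, $w_c = (1-\beta)/(1-\beta^{n_c})$, combined with the standard focal term $(1-p_t)^{\gamma}$, and a simple maximum-probability rule for routing. The function names, the values of `beta`, `gamma`, and `tau`, and the `escalate` callable standing in for the CEJ module are all hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple


def class_balanced_weights(counts: Sequence[int], beta: float = 0.999) -> List[float]:
    """Effective-number class weights, rescaled to sum to the number of classes.

    Assumption: the abstract does not state which class-balancing scheme is
    used; this is the widely used (1 - beta) / (1 - beta ** n_c) form.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in counts]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]


def cb_focal_loss(probs: Sequence[float], label: int,
                  weights: Sequence[float], gamma: float = 2.0) -> float:
    """Class-balanced focal loss for a single example.

    probs: predicted class probabilities; label: index of the true class.
    The focal term (1 - p_t) ** gamma down-weights easy examples.
    """
    p_t = max(probs[label], 1e-12)  # guard against log(0)
    return -weights[label] * (1.0 - p_t) ** gamma * math.log(p_t)


def route(text: str, probs: Sequence[float],
          escalate: Callable[[str], int], tau: float = 0.8) -> Tuple[int, str]:
    """Classify directly when confident, otherwise defer to `escalate`.

    `escalate` is a hypothetical stand-in for the CEJ module (personas plus
    a judge model); `tau` is an assumed confidence threshold.
    """
    top = max(range(len(probs)), key=lambda i: probs[i])
    if probs[top] >= tau:
        return top, "direct"
    return escalate(text), "escalated"
```

Under these assumptions, a rare class receives a larger weight than a frequent one, and only examples whose top probability falls below `tau` incur the cost of the reasoning-based stage.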