Fetal ultrasound AI could transform prenatal care in low-resource settings, yet current foundation models exceed 300M visual parameters, precluding deployment on point-of-care devices. Standard knowledge distillation fails under such extreme capacity gaps (~26x), as compact students waste capacity mimicking architectural artifacts of oversized teachers. We introduce Selective Repulsive Knowledge Distillation, which decomposes contrastive KD into diagonal and off-diagonal components: matched pair alignment is preserved while the off-diagonal weight decays into negative values, repelling the student from the teacher's inter-class confusions and forcing discovery of architecturally native features. Our 11.4M parameter student surpasses the 304M-parameter FetalCLIP teacher on zero-shot HC18 biometry validity (88.6% vs. 83.5%) and brain sub-plane F1 (0.784 vs. 0.702), while running at 1.6 ms on iPhone 16 Pro, enabling real-time assistive AI on handheld ultrasound devices. Our code, models, and app are publicly available at https://github.com/numanai/MobileFetalCLIP.
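The decomposition described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the abstract gives no equations, so the squared-error loss form, the similarity-matrix inputs, and the linear weight schedule (`off_diag_schedule`, `w_start`, `w_end`) are all assumptions. The key idea it demonstrates is that the diagonal (matched-pair) term is always attractive, while the off-diagonal weight decays into negative values, turning imitation of the teacher's inter-class structure into repulsion from it.

```python
import numpy as np

def selective_repulsive_kd_loss(sim_student, sim_teacher, off_diag_weight):
    """Illustrative sketch of a Selective Repulsive KD objective.

    sim_student, sim_teacher: (N, N) image-text similarity matrices from
    the student and teacher, with matched pairs on the diagonal.
    The MSE form is an assumption; the abstract does not specify the loss.
    """
    n = sim_student.shape[0]
    eye = np.eye(n, dtype=bool)
    # Diagonal term: always attractive -- preserve matched-pair alignment.
    diag_loss = np.mean((sim_student[eye] - sim_teacher[eye]) ** 2)
    # Off-diagonal term: when off_diag_weight < 0, minimizing the total
    # loss pushes the student's inter-class similarities AWAY from the
    # teacher's, repelling it from the teacher's confusions.
    off_loss = np.mean((sim_student[~eye] - sim_teacher[~eye]) ** 2)
    return diag_loss + off_diag_weight * off_loss

def off_diag_schedule(step, total_steps, w_start=1.0, w_end=-0.5):
    """Hypothetical linear decay of the off-diagonal weight from
    imitation (positive) into repulsion (negative)."""
    frac = min(step / max(total_steps, 1), 1.0)
    return w_start + frac * (w_end - w_start)
```

With `off_diag_weight` positive early in training the student mimics the teacher's full similarity structure; as the schedule drives it negative, only the diagonal term remains attractive, which matches the abstract's account of forcing the student to discover architecturally native features instead of copying the oversized teacher's artifacts.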