Foundation models are increasingly personalized on decentralized private data through federated learning and are now deployed at scale under growing regulatory requirements for post-market monitoring. We argue that this convergence creates a distinct and under-recognized class of trustworthiness failures, which we term "Silent Failures." These include amplified bias, fairness collapse, and alignment erosion that may remain difficult to detect because federated learning's privacy constraints limit visibility into model behavior. A landscape analysis of existing benchmarks reveals a structural divide. Federated benchmarks evaluate system performance but provide limited insight into model behavior, whereas centralized trustworthiness benchmarks assess behavior but require model access incompatible with federated privacy. We introduce a taxonomy of six silent failure modes arising from the interaction of foundation model personalization, dataset shift, and core federated constraints. Our analysis shows that privacy-preserving training alone is insufficient for trustworthy deployment. We conclude with a research agenda for privacy-preserving behavioral evaluation and propose that silent failures become a standard diagnostic category for trustworthy federated artificial intelligence.