Generative Foundation Models (GenFMs) have emerged as transformative tools. However, their widespread adoption raises critical concerns about trustworthiness across multiple dimensions. This paper presents a comprehensive framework to address these challenges through three key contributions. First, we systematically review global AI governance laws and policies from governments and regulatory bodies, as well as industry practices and standards. Based on this analysis, we propose a set of guiding principles for GenFMs, developed through extensive multidisciplinary collaboration that integrates technical, ethical, legal, and societal perspectives. Second, we introduce TrustGen, the first dynamic benchmarking platform designed to evaluate trustworthiness across multiple dimensions and model types, including text-to-image, large language, and vision-language models. TrustGen leverages modular components (metadata curation, test case generation, and contextual variation) to enable adaptive, iterative assessments that overcome the limitations of static evaluation methods. Using TrustGen, we reveal significant progress in trustworthiness while identifying persistent challenges. Finally, we provide an in-depth discussion of the challenges and future directions for trustworthy GenFMs. This discussion reveals the complex, evolving nature of trustworthiness, highlights the nuanced trade-offs between utility and trustworthiness, considers implications for various downstream applications, and provides a strategic roadmap for future research. This work establishes a holistic framework for advancing trustworthiness in generative AI, paving the way for the safer and more responsible integration of GenFMs into critical applications. To support progress in the community, we release our dynamic evaluation toolkit.
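The three modular stages named in the abstract (metadata curation, test case generation, and contextual variation) can be illustrated with a minimal sketch. This is a hypothetical pipeline for intuition only: the function names, metadata fields, and variation strategy are illustrative assumptions, not TrustGen's actual API.

```python
import random

# Illustrative sketch of a dynamic benchmark pipeline (all names are
# hypothetical, not the real TrustGen interface).

def curate_metadata(seed_topics):
    """Metadata curation: pair each seed topic with a trust dimension."""
    # Trust dimensions here are placeholder examples.
    return [{"topic": t, "dimension": d}
            for t in seed_topics
            for d in ("privacy", "fairness")]

def generate_test_case(meta):
    """Test case generation: turn curated metadata into a prompt."""
    return (f"Evaluate the model on a {meta['dimension']} "
            f"scenario about {meta['topic']}.")

def vary_context(prompt, rng):
    """Contextual variation: perturb phrasing so each benchmark run
    differs, avoiding the staleness of a fixed test set."""
    prefixes = ["", "In a professional setting, ", "As a casual question, "]
    return rng.choice(prefixes) + prompt

def build_benchmark(seed_topics, seed=0):
    """Compose the three stages into one adaptive generation pass."""
    rng = random.Random(seed)  # seeded for reproducible variation
    return [vary_context(generate_test_case(m), rng)
            for m in curate_metadata(seed_topics)]

cases = build_benchmark(["medical records", "hiring decisions"])
```

The key design point the sketch captures is that regenerating with a new seed (or fresh metadata) yields a new test set, which is what distinguishes a dynamic benchmark from a static one.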