The rapid progress of Large Models (LMs) has recently revolutionized various fields of deep learning with remarkable achievements, ranging from Natural Language Processing (NLP) to Computer Vision (CV). However, LMs are increasingly challenged and criticized by academia and industry: despite their powerful performance, their behavior remains untrustworthy, a problem that urgently needs to be alleviated by reliable methods. Despite the abundance of literature on trustworthy LMs in NLP, a systematic survey specifically delving into the trustworthiness of LMs in CV remains absent. To bridge this gap, we summarize in this survey four key concerns that obstruct the trustworthy use of LMs in vision: 1) human misuse, 2) vulnerability, 3) inherent issues, and 4) interpretability. By highlighting the corresponding challenges, countermeasures, and discussions for each topic, we hope this survey will facilitate readers' understanding of the field, promote the alignment of LMs with human expectations, and enable trustworthy LMs to serve as a benefit rather than a hazard to human society.