ChatGPT is a recent chatbot service released by OpenAI that has received increasing attention over the past few months. While various aspects of ChatGPT have been evaluated, its robustness, i.e., its performance on unexpected inputs, is still unclear to the public. Robustness is of particular concern in responsible AI, especially for safety-critical applications. In this paper, we conduct a thorough evaluation of the robustness of ChatGPT from the adversarial and out-of-distribution (OOD) perspectives. To do so, we employ the AdvGLUE and ANLI benchmarks to assess adversarial robustness, and the Flipkart review and DDXPlus medical diagnosis datasets for OOD evaluation. We select several popular foundation models as baselines. Results show that ChatGPT has consistent advantages on most adversarial and OOD classification and translation tasks. However, its absolute performance is far from perfect, which suggests that adversarial and OOD robustness remains a significant threat to foundation models. Moreover, ChatGPT shows astounding performance in understanding dialogue-related texts, and we find that it tends to provide informal suggestions for medical tasks instead of definitive answers. Finally, we present in-depth discussions of possible research directions.