ChatGPT is a recent chatbot service released by OpenAI that has received increasing attention over the past few months. While various aspects of ChatGPT have been evaluated, its robustness, i.e., its performance on unexpected inputs, is still unclear to the public. Robustness is of particular concern in responsible AI, especially for safety-critical applications. In this paper, we conduct a thorough evaluation of the robustness of ChatGPT from the adversarial and out-of-distribution (OOD) perspectives. To do so, we employ the AdvGLUE and ANLI benchmarks to assess adversarial robustness, and the Flipkart review and DDXPlus medical diagnosis datasets for OOD evaluation. We select several popular foundation models as baselines. Results show that ChatGPT has consistent advantages on most adversarial and OOD classification and translation tasks. However, its absolute performance is far from perfect, which suggests that adversarial and OOD robustness remains a significant threat to foundation models. Moreover, ChatGPT shows astounding performance in understanding dialogue-related texts, and we find that it tends to provide informal suggestions for medical tasks instead of definitive answers. Finally, we present in-depth discussions of possible research directions.