In the era of large AI models such as large language models (LLMs), complex architectures and vast parameter counts pose substantial challenges for effective AI quality management (AIQM). This paper investigates the quality assurance of a specific LLM-based AI product: a ChatGPT-based sentiment analysis system. The study examines stability issues related to both the operation and the robustness of the large AI model on which ChatGPT is based. Experimental analysis is conducted on benchmark datasets for sentiment analysis. The results reveal that the constructed ChatGPT-based sentiment analysis system exhibits uncertainty, which is attributed to various operational factors. They also demonstrate that the system exhibits stability issues with respect to robustness when handling conventional small text attacks.
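The two kinds of stability mentioned in the abstract (uncertainty across repeated operation, and sensitivity to small text attacks) could be quantified roughly as in the sketch below. This is a minimal illustration and not the paper's actual procedure. The names `label_agreement`, `swap_adjacent_chars`, `flip_rate`, and `toy_classify` are hypothetical, and `toy_classify` is a keyword-based stand-in for whatever function would wrap the ChatGPT-based system. The adjacent-character swap is assumed here as one example of a small text perturbation.

```python
import random
from collections import Counter
from typing import Callable, List


def label_agreement(classify: Callable[[str], str], text: str, runs: int = 5) -> float:
    """Operational stability: fraction of repeated calls that return the most common label."""
    labels = [classify(text) for _ in range(runs)]
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / runs


def swap_adjacent_chars(text: str, rng: random.Random) -> str:
    """Small perturbation (assumed example): swap one random pair of adjacent characters."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    chars = list(text)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def flip_rate(classify: Callable[[str], str], texts: List[str], rng: random.Random) -> float:
    """Robustness: fraction of texts whose predicted label changes after perturbation."""
    flips = sum(classify(t) != classify(swap_adjacent_chars(t, rng)) for t in texts)
    return flips / len(texts)


def toy_classify(text: str) -> str:
    """Hypothetical stand-in for the real system: a deterministic keyword rule."""
    return "positive" if "good" in text.lower() else "negative"


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the perturbations are reproducible
    samples = ["A good movie.", "Terrible plot.", "Really good acting."]
    print("agreement:", label_agreement(toy_classify, samples[0]))
    print("flip rate:", flip_rate(toy_classify, samples, rng))
```

With a real LLM-backed classifier in place of `toy_classify`, an agreement below 1.0 would indicate run-to-run uncertainty, and a nonzero flip rate would indicate sensitivity to the perturbation.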