Uncertainty quantification has become increasingly critical for large language models (LLMs), particularly in high-risk applications that require reliable outputs. However, traditional uncertainty quantification methods, such as probabilistic models and ensemble techniques, struggle with the complex, high-dimensional nature of LLM-generated outputs. This study proposes a novel geometric approach to uncertainty quantification based on convex hull analysis. The method leverages the spatial properties of response embeddings to measure the dispersion and variability of model outputs. Prompts are categorized into three types, i.e., `easy', `moderate', and `confusing', and multiple responses are generated using different LLMs at varying temperature settings. The responses are transformed into high-dimensional embeddings via a BERT model and then projected into a two-dimensional space using Principal Component Analysis (PCA). The Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm clusters the embeddings, and a convex hull is computed for each selected cluster. The experimental results indicate that model uncertainty depends on the prompt complexity, the model, and the temperature setting.
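The embed-project-cluster-hull pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: synthetic vectors stand in for BERT response embeddings, and the function name and DBSCAN parameters are illustrative choices.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import DBSCAN
from scipy.spatial import ConvexHull

def hull_area_uncertainty(embeddings, eps=0.5, min_samples=3):
    """Return the convex hull area of each DBSCAN cluster of the
    2D PCA projection of the given response embeddings."""
    # Project high-dimensional embeddings into two dimensions
    pts = PCA(n_components=2).fit_transform(embeddings)
    # Cluster the projected points; label -1 marks noise
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(pts)
    areas = {}
    for lbl in set(labels) - {-1}:
        cluster = pts[labels == lbl]
        if len(cluster) >= 3:  # a hull needs at least 3 points
            # For 2D input, ConvexHull.volume is the enclosed area
            areas[lbl] = ConvexHull(cluster).volume
    return areas

# Synthetic stand-in for BERT embeddings: two tight groups of responses
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0.0, 0.05, (20, 768)),
                 rng.normal(1.0, 0.05, (20, 768))])
areas = hull_area_uncertainty(emb, eps=0.5, min_samples=3)
```

Larger hull areas indicate more widely dispersed responses, which the paper interprets as higher model uncertainty.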