CoHalLo: Code Hallucination Localization via Probing Hidden-Layer Vectors

Code hallucination localization aims to identify the specific lines of code that contain hallucinations, helping developers improve the reliability of AI-generated code more efficiently. Although recent studies have proposed several methods for detecting code hallucinations, most remain limited to coarse-grained detection and lack specialized techniques for fine-grained localization. This study introduces CoHalLo, a method that achieves line-level code hallucination localization by probing the hidden-layer vectors of a hallucination detection model. CoHalLo uncovers the key syntactic information that drives the model's hallucination judgments and uses it to locate the hallucinated code lines. Specifically, we first fine-tune the hallucination detection model on manually annotated datasets so that it learns features pertinent to code syntax. We then design a probe network that projects the high-dimensional latent vectors onto a low-dimensional syntactic subspace, generates vector tuples, and reconstructs a predicted abstract syntax tree (P-AST). By comparing the P-AST with the original abstract syntax tree (O-AST) extracted from the input AI-generated code, we identify the key syntactic structures associated with hallucinations and use them to pinpoint the hallucinated code lines. To evaluate CoHalLo, we manually collected a dataset of code hallucinations. Experimental results show that CoHalLo achieves a Top-1 accuracy of 0.4253, a Top-3 accuracy of 0.6149, a Top-5 accuracy of 0.7356, a Top-10 accuracy of 0.8333, an IFA of 5.73, a Recall@1% Effort of 0.052721, and an Effort@20% Recall of 0.155269, outperforming the baseline methods.
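
The P-AST versus O-AST comparison can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which parser, target language, or comparison rule CoHalLo uses. The sketch assumes Python source parsed with the standard `ast` module, represents each tree as a set of (parent type, child type, line) edges, and ranks lines by how many edges differ between the two trees. The helper names `ast_edges` and `rank_lines` are hypothetical, and the P-AST is simulated by deleting edges, whereas in CoHalLo it would be reconstructed from the probe's output.

```python
import ast
from collections import Counter


def ast_edges(source):
    """Collect (parent type, child type, line) edges from the parsed tree (the O-AST)."""
    tree = ast.parse(source)
    edges = set()
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            line = getattr(child, "lineno", None)  # context nodes such as Load have no line
            if line is not None:
                edges.add((type(parent).__name__, type(child).__name__, line))
    return edges


def rank_lines(o_edges, p_edges):
    """Rank lines by the number of edges on which the two trees disagree (assumed rule)."""
    mismatches = Counter(line for _, _, line in o_edges ^ p_edges)
    ordered = sorted(mismatches.items(), key=lambda kv: (-kv[1], kv[0]))
    return [line for line, _ in ordered]


code = "import math\nx = math.sqroot(4)\nprint(x)\n"
o_edges = ast_edges(code)
# Simulated P-AST: pretend the probe failed to recover the structure of line 2.
p_edges = {edge for edge in o_edges if edge[2] != 2}
suspects = rank_lines(o_edges, p_edges)
print(suspects)
```

In this toy case the only disagreement between the two edge sets is on line 2, so that line is ranked first.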

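
The ranking metrics reported in the abstract can be made concrete with a short sketch. The abstract does not define them, so the code assumes the definitions commonly used in line-level localization work: Top-k accuracy is the fraction of samples with at least one hallucinated line among the first k ranked lines, and IFA (initial false alarm) is the number of clean lines inspected before the first hallucinated line is reached. The function names and the toy data are hypothetical and are not taken from the paper's dataset.

```python
def top_k_hit(ranked_lines, hallucinated, k):
    """True if any of the first k ranked lines is actually hallucinated."""
    return any(line in hallucinated for line in ranked_lines[:k])


def top_k_accuracy(samples, k):
    """Fraction of samples whose top-k ranked lines contain a hallucinated line."""
    hits = sum(top_k_hit(ranked, truth, k) for ranked, truth in samples)
    return hits / len(samples)


def ifa(ranked_lines, hallucinated):
    """Number of clean lines inspected before the first hallucinated line."""
    for inspected, line in enumerate(ranked_lines):
        if line in hallucinated:
            return inspected
    return len(ranked_lines)  # no hallucinated line was ranked at all


# Toy data: (lines ranked from most to least suspicious, true hallucinated lines).
samples = [
    ([3, 1, 2], {3}),
    ([5, 2, 4, 1], {4}),
    ([1, 2], {2}),
]

top1 = top_k_accuracy(samples, 1)
top3 = top_k_accuracy(samples, 3)
ifa_values = [ifa(ranked, truth) for ranked, truth in samples]
mean_ifa = sum(ifa_values) / len(ifa_values)
print(top1, top3, mean_ifa)
```

Under these assumed definitions, a higher Top-k accuracy and a lower IFA both indicate that hallucinated lines appear earlier in the ranking.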