Public agencies are beginning to consider large language models (LLMs) as decision-support tools for grant evaluation. This creates a practical governance problem: the model and scoring rubric should not be exposed in a way that allows applicants to optimize against them, yet the evaluation process must remain auditable, contestable, and accountable. We propose an architecture based on trusted execution environments (TEEs) that helps reconcile these requirements through remote attestation. The architecture allows an external verifier to check which model, rubric, prompt template, and input representation were used, without exposing model weights, proprietary scoring logic, or intermediate reasoning to applicants or infrastructure operators. The main artifact is an attested evaluation bundle: a signed, timestamped record linking the original submission hash, the canonical input hash, the model-and-rubric measurement, and the evaluation output. The paper also considers a prompt injection risk specific to this scenario: applicant-controlled documents may contain hidden or indirect instructions intended to influence the LLM evaluator. We therefore include a canonicalization and sanitization layer that normalizes document representations and records suspicious transformations before inference. We position the design relative to confidential AI inference, attestable AI audits, zero-knowledge machine learning, algorithmic accountability, and AI-assisted peer review. The resulting claim is deliberately narrow: remote attestation does not prove that an evaluation is fair or scientifically correct, but it can make part of the evaluation process (which model, rubric, prompt template, and input representation produced a given output) externally verifiable.
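To make the attested evaluation bundle and the canonicalization layer more concrete, the following is a minimal sketch in Python, not the implementation described in the paper. All names (`canonicalize`, `build_bundle`, `verify_bundle`, the field names, and the `HIDDEN_CHARS` set) are hypothetical. The sketch also makes two simplifying assumptions: the signature is an HMAC over a canonical JSON encoding, standing in for a signature by a key bound to a TEE attestation report, and the model-and-rubric measurement is passed in as an opaque string rather than derived from enclave measurements.

```python
import hashlib
import hmac
import json
import unicodedata
from datetime import datetime, timezone

# Illustrative, non-exhaustive set of invisible characters that could hide
# instructions in applicant-controlled text (assumption, not from the paper).
HIDDEN_CHARS = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff", "\u202e"}


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def canonicalize(text: str) -> tuple[str, list[str]]:
    """Normalize the text and record every transformation that changed it."""
    log: list[str] = []
    normalized = unicodedata.normalize("NFKC", text)
    if normalized != text:
        log.append("unicode_nfkc_normalization")
    stripped = "".join(ch for ch in normalized if ch not in HIDDEN_CHARS)
    if stripped != normalized:
        log.append("hidden_characters_removed")
    return stripped, log


def _payload(record: dict) -> bytes:
    # Deterministic encoding so signer and verifier hash identical bytes.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_bundle(submission: bytes, measurement: str, evaluate, signing_key: bytes) -> dict:
    """Link submission hash, canonical input hash, measurement, and output."""
    canonical, log = canonicalize(submission.decode("utf-8"))
    record = {
        "submission_hash": sha256_hex(submission),
        "canonical_input_hash": sha256_hex(canonical.encode("utf-8")),
        "model_rubric_measurement": measurement,
        "sanitization_log": log,
        "evaluation_output": evaluate(canonical),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # HMAC is a stand-in for a signature by an attestation-bound enclave key.
    signature = hmac.new(signing_key, _payload(record), hashlib.sha256).hexdigest()
    return {"record": record, "signature": signature}


def verify_bundle(bundle: dict, signing_key: bytes, expected_measurement: str) -> bool:
    """Check the signature and that the expected model and rubric were used."""
    expected = hmac.new(signing_key, _payload(bundle["record"]), hashlib.sha256).hexdigest()
    return (
        hmac.compare_digest(expected, bundle["signature"])
        and bundle["record"]["model_rubric_measurement"] == expected_measurement
    )


if __name__ == "__main__":
    # Placeholder evaluator; a real deployment would run the LLM inside the TEE.
    demo = build_bundle(
        "Budget\u200b plan".encode("utf-8"),
        "measurement-placeholder",
        lambda text: {"score": 3},
        b"demo-key",
    )
    print(json.dumps(demo, indent=2))
    print(verify_bundle(demo, b"demo-key", "measurement-placeholder"))
```

In this sketch the verifier learns only hashes, the measurement, the sanitization log, and the output, which mirrors the narrow claim above: a valid bundle shows which configuration produced an evaluation, not that the evaluation is fair or correct.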