In current inter-organizational data spaces, usage policies are enforced mainly at the asset level: a whole document or dataset is either shared or withheld. When only parts of a document are sensitive, providers who want to avoid leaking protected information typically must manually redact documents before sharing them, which is costly, coarse-grained, and hard to maintain as policies or partners change. We present DAVE, a usage policy-enforcing LLM spokesperson that answers questions over private documents on behalf of a data provider. Instead of releasing documents, the provider exposes a natural language interface whose responses are constrained by machine-readable usage policies. We formalize policy-violating information disclosure in this setting, drawing on usage control and information flow security, and introduce virtual redaction: suppressing sensitive information at query time without modifying source documents. We describe an architecture for integrating such a spokesperson with Eclipse Dataspace Components and ODRL-style policies, and outline an initial provider-side integration prototype in which QA requests are routed through a spokesperson service instead of triggering raw document transfer. Our contribution is primarily architectural: we do not yet implement or empirically evaluate the full enforcement pipeline. We therefore outline an evaluation methodology to assess security, utility, and performance trade-offs under benign and adversarial querying as a basis for future empirical work on systematically governed LLM access to multi-party data spaces.
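As an illustration only, virtual redaction can be sketched as a filter applied at query time: the stored passages stay untouched, and each request sees only the passages its policy permits. This is not the DAVE implementation, which the abstract states is not yet complete. The names `Passage`, `Policy`, `virtually_redact`, and `answer`, the sensitivity labels, and the keyword matching that stands in for an LLM are all hypothetical. A real deployment would presumably evaluate ODRL-style policies and would also need to address indirect inference, which simple label filtering does not cover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    text: str
    labels: frozenset  # hypothetical sensitivity labels, e.g. {"pricing"}


@dataclass(frozen=True)
class Policy:
    # Hypothetical, heavily simplified stand-in for an ODRL-style prohibition.
    assignee: str
    prohibited_labels: frozenset


def virtually_redact(passages, policy):
    """Return the passages the policy permits; the source list is not modified."""
    return [p for p in passages if not (p.labels & policy.prohibited_labels)]


def _keywords(text):
    # Crude tokenizer: lowercase words longer than three characters.
    words = (w.strip("?.,").lower() for w in text.split())
    return {w for w in words if len(w) > 3}


def answer(question, passages, policy):
    """Stand-in for the LLM: return permitted passages sharing keywords with the question."""
    visible = virtually_redact(passages, policy)
    terms = _keywords(question)
    hits = [p.text for p in visible if terms & _keywords(p.text)]
    return " ".join(hits) if hits else "No permitted information available."


# Hypothetical example data: one general passage and one pricing passage.
DOCUMENT = [
    Passage("The contract runs until 2026.", frozenset({"general"})),
    Passage("The unit price is 42 EUR.", frozenset({"pricing"})),
]

restricted = Policy("partner-b", frozenset({"pricing"}))
unrestricted = Policy("partner-a", frozenset())

print(answer("What is the unit price?", DOCUMENT, restricted))
print(answer("What is the unit price?", DOCUMENT, unrestricted))
```

In this sketch the same question yields a refusal for the restricted partner and the pricing passage for the unrestricted one, while `DOCUMENT` itself is never edited. Filtering the context before generation is only one possible enforcement point; filtering or checking the generated response is another.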