Many data systems admit multiple outcomes for the same input: concurrent transactions may serialize in any of many orders, and a logic program may have multiple stable models. Classical data provenance cannot even pose its question in such settings: it explains how a result was derived, but only after something has chosen which result to produce. We introduce \emph{determination provenance} to track the commitments that resolve this ambiguity. A tuple's \emph{support} is the set of resolutions under which it holds. Supports form a commutative semiring, and layered commitments induce a \emph{filtration} that measures each tuple's \emph{query-relative depth}, i.e., the number of layers of semantic resolution on which it depends. Positive relational algebra respects the filtration, enabling compositional robustness analysis and quantitative diagnosis of resolution cost. We instantiate the framework for transactional isolation and for $\mbox{Datalog}^\neg$; in both, classical semantic variants (isolation levels; negation semantics) correspond to different views of a single shared filtration.