Principled accountability in the aftermath of harms is essential to the trustworthy design and governance of algorithmic decision making. Legal philosophy offers a paramount method for assessing culpability: putting the agent 'on the stand' to subject their actions and intentions to cross-examination. We show that under minimal assumptions automated reasoning can rigorously interrogate algorithmic behaviors as in the adversarial process of legal fact finding. We model accountability processes, such as trials or review boards, as Counterfactual-Guided Logic Exploration and Abstraction Refinement (CLEAR) loops. We use an SMT-based oracle to discharge queries about agent behavior in factual and counterfactual scenarios, as adaptively formulated by a human investigator. For a decision algorithm $\mathcal{A}$, we use symbolic execution to represent its logic as a statement $\Pi$ in the decidable theory $\texttt{QF_FPBV}$. We implement our framework in a tool called $\textsf{soid}$ with an accompanying GUI, and demonstrate its utility on an illustrative car crash scenario.
翻译:在损害发生后进行原则性的问责,对于算法决策的可信设计与治理至关重要。法律哲学提供了一种评估罪责的重要方法:将行为主体“置于证人席”,对其行为与意图进行交叉审查。我们证明,在最小假设下,自动化推理能够像法律事实认定的对抗过程一样,严格质询算法行为。我们将审判或审查委员会等问责过程建模为反事实引导的逻辑探索与抽象精化(CLEAR)循环。我们使用基于SMT的oracle来回答人类调查员自适应提出的关于事实与反事实场景下行为主体问题的查询。对于决策算法$\mathcal{A}$,我们使用符号执行将其逻辑表示为可判定理论$\texttt{QF_FPBV}$中的语句$\Pi$。我们在名为$\textsf{soid}$的工具中实现了我们的框架,并附带图形用户界面,通过一个说明性的汽车碰撞场景展示了其实用性。