Evaluating interventions in a multiagent system, e.g., when a human should intervene in an autonomous driving system or when a player should pass to a teammate for a good shot, is challenging in various engineering and scientific fields. Estimating the individual treatment effect (ITE) using counterfactual long-term prediction is a practical way to evaluate such interventions. However, most conventional frameworks do not consider the time-varying, complex structure of multiagent relationships or the counterfactual prediction of covariates. This may lead to erroneous assessments of the ITE and make the results difficult to interpret. Here we propose an interpretable counterfactual recurrent network for multiagent systems to estimate the effect of an intervention. Our model combines graph variational recurrent neural networks and theory-based computation with domain knowledge in an ITE estimation framework based on long-term prediction of multiagent covariates and outcomes, which can confirm the circumstances under which the intervention is effective. On simulated models of an automated vehicle and of biological agents with time-varying confounders, we show that our methods achieve lower estimation errors than the baselines in both counterfactual covariates and the most effective treatment timing. Furthermore, using real basketball data, our methods produce realistic counterfactual predictions and evaluate counterfactual passes in shot scenarios.
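To make the idea of ITE estimation from counterfactual prediction concrete, the following is a minimal sketch, not the proposed model. It assumes some predictor that returns a long-term outcome for a given history, with or without an intervention at a chosen time, and takes the ITE as the difference between the two predictions. The names `estimate_ite`, `best_treatment_time`, and `toy_predictor` are hypothetical, and the toy predictor and its numbers are made up purely for illustration; in the paper this role would be played by the learned recurrent network.

```python
from typing import Callable, Dict, Iterable, Optional, Sequence, Tuple

# Hypothetical predictor interface: (covariate history, intervention time or None) -> outcome.
Predictor = Callable[[Sequence[float], Optional[int]], float]


def estimate_ite(predict: Predictor, history: Sequence[float], treat_time: int) -> float:
    """ITE as predicted outcome with the intervention minus predicted outcome without it."""
    return predict(history, treat_time) - predict(history, None)


def best_treatment_time(
    predict: Predictor, history: Sequence[float], candidate_times: Iterable[int]
) -> Tuple[int, Dict[int, float]]:
    """Return the candidate time with the largest estimated ITE, plus all estimates."""
    ites = {t: estimate_ite(predict, history, t) for t in candidate_times}
    best = max(ites, key=ites.get)
    return best, ites


def toy_predictor(history: Sequence[float], treat_time: Optional[int]) -> float:
    """Illustrative stand-in for a learned model (assumption, not from the paper)."""
    base = sum(history) / len(history)
    if treat_time is None:
        return base
    # In this toy, the benefit of intervening peaks at time step 2.
    return base + 1.0 - 0.25 * (treat_time - 2) ** 2


history = [1.0, 2.0, 3.0]  # made-up covariate history
best, ites = best_treatment_time(toy_predictor, history, range(5))
```

Here `best` is the candidate time step at which the toy predictor gives the largest estimated effect. A real evaluation would replace `toy_predictor` with counterfactual rollouts of covariates and outcomes from a trained model.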