Stochastic games are a well-established model for multi-agent sequential decision-making under uncertainty. In practical applications, though, agents often have only partial observability of their environment. Furthermore, agents increasingly perceive their environment using data-driven approaches such as neural networks trained on continuous data. We propose the model of neuro-symbolic partially-observable stochastic games (NS-POSGs), a variant of continuous-space concurrent stochastic games that explicitly incorporates neural perception mechanisms. We focus on a one-sided setting in which a partially informed agent with discrete, data-driven observations interacts with a fully informed agent. We present a new method, called one-sided NS-HSVI, for the approximate solution of one-sided NS-POSGs, which exploits the piecewise constant structure of the model. Using neural network pre-image analysis to construct finite polyhedral representations and particle-based representations for beliefs, we implement our approach and illustrate its practical applicability to the analysis of pedestrian-vehicle and pursuit-evasion scenarios.