We study active perception from first principles and argue that an autonomous agent performing active perception should maximize the mutual information that past observations possess about future ones. Doing so requires (a) a representation of the scene that summarizes past observations, together with the ability to update this representation to incorporate new observations (state estimation and mapping), (b) the ability to synthesize new observations of the scene (a generative model), and (c) the ability to select control trajectories that maximize predictive information (planning). This motivates a neural radiance field (NeRF)-like representation that captures the photometric, geometric, and semantic properties of the scene in a grounded way. This representation is well suited to synthesizing new observations from different viewpoints, so a sampling-based planner can use these synthetic observations to calculate the predictive information along dynamically feasible trajectories. We use active perception to explore cluttered indoor environments and employ a notion of semantic uncertainty to check for the successful completion of an exploration task. We demonstrate these ideas in simulation in realistic 3D indoor environments.
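As a rough illustration of the objective described above, the predictive information can be written as a standard mutual information between past and future observations. The notation here is an assumption introduced for illustration and is not taken from the paper: $y_{1:t}$ denotes the observations collected so far, $y_{t+1:T}$ the observations that would be received along a candidate control trajectory $u_{t:T-1}$, and $H$ the Shannon entropy.

```latex
% Assumed notation (not from the source): y_{1:t} past observations,
% y_{t+1:T} future observations along the candidate controls u_{t:T-1}.
\begin{align}
  I\left(y_{1:t};\, y_{t+1:T}\right)
    &= H\left(y_{t+1:T}\right) - H\left(y_{t+1:T} \mid y_{1:t}\right), \\
  u^{*}_{t:T-1}
    &\in \arg\max_{u_{t:T-1}} \; I\left(y_{1:t};\, y_{t+1:T}\right),
\end{align}
% where the distribution of y_{t+1:T} depends on the chosen controls u_{t:T-1}.
```

Under this reading, a sampling-based planner would draw dynamically feasible candidate trajectories, synthesize observations along each one from the scene representation, and keep the candidate with the largest estimated mutual information.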