Intrinsically motivated exploration has proven useful for reinforcement learning, even without additional extrinsic rewards. When the environment is naturally represented as a graph, how best to guide exploration remains an open question. In this work, we propose a novel approach for exploring graph-structured data motivated by two theories of human curiosity: the information gap theory and the compression progress theory. These theories view curiosity as an intrinsic motivation to optimize for topological features of subgraphs induced by the visited nodes in the environment. We use these proposed features as rewards for graph neural network-based reinforcement learning. On multiple classes of synthetically generated graphs, we find that trained agents generalize to larger environments and to longer exploratory walks than are seen during training. Our method is also computationally more efficient than greedy evaluation of the relevant topological properties. The proposed intrinsic motivations bear particular relevance for recommender systems. We demonstrate that curiosity-based recommendations are more predictive of human behavior than PageRank centrality for several real-world graph datasets, including MovieLens, Amazon Books, and Wikispeedia.
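
As a rough illustration of what a topological reward on the visited subgraph and its greedy evaluation might look like, the following minimal sketch scores a walk by the cycle rank (edges minus nodes plus connected components) of the subgraph induced by the visited nodes. The choice of cycle rank as the feature, the restriction of moves to neighbors of the current node, and the names `cycle_rank` and `greedy_walk` are assumptions made for illustration only; the abstract does not specify the exact features or the baseline's details.

```python
from collections import deque


def induced_edges(adj, visited):
    """Undirected edges of the subgraph induced by `visited`, as sorted pairs."""
    vs = set(visited)
    return {tuple(sorted((u, v))) for u in vs for v in adj[u] if v in vs and u != v}


def count_components(adj, visited):
    """Number of connected components of the induced subgraph (BFS)."""
    vs = set(visited)
    seen = set()
    components = 0
    for source in vs:
        if source in seen:
            continue
        components += 1
        seen.add(source)
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in vs and v not in seen:
                    seen.add(v)
                    queue.append(v)
    return components


def cycle_rank(adj, visited):
    """Illustrative reward (an assumption): independent cycles in the induced subgraph."""
    vs = set(visited)
    if not vs:
        return 0
    return len(induced_edges(adj, vs)) - len(vs) + count_components(adj, vs)


def greedy_walk(adj, start, steps):
    """Hypothetical greedy baseline: step to the neighbor that maximizes the reward.

    Ties are broken in favor of the smallest node label.
    """
    walk = [start]
    visited = {start}
    for _ in range(steps):
        candidates = sorted(adj[walk[-1]])
        if not candidates:
            break
        best = max(candidates, key=lambda v: cycle_rank(adj, visited | {v}))
        walk.append(best)
        visited.add(best)
    return walk


# Toy graph: a triangle 0-1-2 with a tail node 3 attached to node 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
walk = greedy_walk(adj, start=3, steps=3)
```

Each greedy step re-evaluates the feature once per candidate neighbor, which is the kind of repeated computation a trained policy would avoid at inference time.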