This work presents the novel Image Transformation Sequence Retrieval (ITSR) task, in which a model must retrieve the sequence of transformations that turns a given source image into a given target image. Given certain characteristics of the challenge, such as the multiplicity of correct sequences and the correlation between consecutive steps of the process, we propose a solution to ITSR based on a general model-based Reinforcement Learning method such as Monte Carlo Tree Search (MCTS), combined with a deep neural network. Our experiments provide a benchmark in both synthetic and real domains, where the proposed approach is compared with supervised training. The results show that a model trained with MCTS outperforms its supervised counterpart in both the simplest and the most complex cases. We also draw conclusions about the nature of ITSR and its associated challenges.
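To make the task statement concrete, the following minimal sketch shows why a source/target pair can admit several correct sequences. It is a toy illustration only: the three transformations, the helper names (`TRANSFORMS`, `apply_sequence`, `retrieve_sequences`) and the exhaustive search are assumptions made for this example. The search is a brute-force stand-in, not the MCTS-based approach or the transformation set used in the paper.

```python
from itertools import product

# Hypothetical transformation set, chosen only for illustration.
# Images are plain nested lists (rows of pixel values).
TRANSFORMS = {
    "rot90": lambda img: [list(col) for col in zip(*img)][::-1],  # 90 degrees counterclockwise
    "flip_h": lambda img: [row[::-1] for row in img],             # mirror left-right
    "flip_v": lambda img: img[::-1],                              # mirror top-bottom
}


def apply_sequence(img, names):
    """Apply the named transformations to img, in order."""
    for name in names:
        img = TRANSFORMS[name](img)
    return img


def retrieve_sequences(source, target, max_len=2):
    """List every sequence of up to max_len steps that maps source to target.

    Exhaustive enumeration: feasible only for tiny action sets and short sequences.
    """
    found = []
    for length in range(1, max_len + 1):
        for names in product(TRANSFORMS, repeat=length):
            if apply_sequence(source, names) == target:
                found.append(names)
    return found


source = [[0, 1, 2], [3, 4, 5]]
target = apply_sequence(source, ("rot90", "rot90"))  # a 180-degree rotation
solutions = retrieve_sequences(source, target)
```

In this toy setting, `solutions` contains three distinct sequences (two rotations, or the two flips in either order), all of which produce the same target. Exhaustive enumeration grows exponentially with sequence length, which is one reason a guided search is attractive for this kind of problem.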