Reinforced Labels: Multi-Agent Deep Reinforcement Learning for Point-feature Label Placement

Over the past few years, Reinforcement Learning combined with Deep Learning has proven successful at solving complex problems in domains including robotics, self-driving cars, finance, and gaming. In this paper, we introduce Reinforcement Learning (RL) to another domain: visualization. Our novel point-feature label placement method uses Multi-Agent Deep Reinforcement Learning (MADRL) to learn a label placement strategy. It is the first machine-learning-driven labeling method, in contrast to existing hand-crafted algorithms designed by human experts. To facilitate the RL learning paradigm, we developed an environment in which an agent acts as a proxy for a label, a short textual annotation that augments visualizations such as geographical maps, illustrations, and technical drawings. Our results demonstrate that the strategy trained by our method significantly outperforms the random strategy of an untrained agent and also outperforms the compared methods designed by human experts in terms of completeness (i.e., the number of placed labels). The trade-off is increased computation time, which makes the proposed method slower than the compared methods. Nevertheless, our method is well suited to situations where the labeling can be computed in advance and completeness is essential, such as cartographic maps, technical drawings, and medical atlases. Additionally, we conducted a user study to assess perceived performance. The participants considered the proposed method to be significantly better than the other examined methods. This indicates that the improved completeness is reflected not only in the quantitative metrics but also in the participants' subjective evaluation.
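To make the completeness metric concrete, the following is a minimal sketch of how the number of placed labels could be counted. It is an illustration only and not the implementation used in the paper: the `Rect` type, the `overlaps` helper, the use of `None` for an unplaced label, and the assumption that a valid labeling contains no overlapping label boxes are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rect:
    """Axis-aligned label box (hypothetical representation, not from the paper)."""
    x: float  # left edge
    y: float  # bottom edge
    w: float  # width
    h: float  # height


def overlaps(a: Rect, b: Rect) -> bool:
    """Return True if the interiors of the two boxes intersect.

    Boxes that only touch along an edge are not counted as overlapping.
    """
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.h and b.y < a.y + a.h)


def completeness(placements: Sequence[Optional[Rect]]) -> int:
    """Count the placed labels; None marks a label that was left out.

    Assumption: a valid labeling has no pairwise overlaps, so an
    overlapping pair is reported as an error instead of being counted.
    """
    placed = [r for r in placements if r is not None]
    for i, a in enumerate(placed):
        for b in placed[i + 1:]:
            if overlaps(a, b):
                raise ValueError("placed labels overlap; not a valid labeling")
    return len(placed)
```

For example, `completeness([Rect(0, 0, 2, 1), None, Rect(3, 0, 2, 1)])` counts two placed labels out of three. Under these assumptions, a method with higher completeness is one that leaves fewer entries as `None` while keeping the labeling free of overlaps.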
