Activation functions play a significant role in the performance of deep learning algorithms. In particular, the Swish activation function tends to outperform ReLU on deeper models, including deep reinforcement learning models, across challenging tasks. Despite this, ReLU remains the preferred function, partly because it is more computationally efficient than Swish. Furthermore, in contrast to computer vision and natural language processing, the deep reinforcement learning and robotics domains have been slower to adopt newer activation functions such as Swish and continue to rely on more traditional ones such as ReLU. To address these issues, we propose Swim, a general-purpose, efficient, and high-performing alternative to Swish. We analyze its properties and explain its high performance relative to Swish in terms of both reward achievement and efficiency. We focus on testing Swim on MuJoCo's continuous control locomotion tasks, since they exhibit more complex dynamics and would therefore benefit most from a high-performing and efficient activation function. We use the TD3 algorithm in conjunction with Swim and explain this choice in the context of the robot locomotion domain. We conclude that Swim is a state-of-the-art activation function for continuous control locomotion tasks and recommend using it with TD3 as a working framework.
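For readers unfamiliar with the two baselines named above, the following minimal sketch shows their standard definitions: ReLU as max(0, x) and Swish as x · sigmoid(βx). It is an illustration only and not the authors' code. The function names, the scalar interface, and the default β = 1 are assumptions made for this example, and Swim itself is not shown because its formula is not stated in this abstract.

```python
import math


def relu(x: float) -> float:
    """Standard ReLU: passes positive inputs, zeroes out the rest."""
    return max(0.0, x)


def swish(x: float, beta: float = 1.0) -> float:
    """Standard Swish: x * sigmoid(beta * x).

    beta = 1.0 is an assumed default for this sketch.
    The two branches avoid overflow in exp() for large |beta * x|.
    """
    z = beta * x
    if z >= 0:
        return x / (1.0 + math.exp(-z))
    e = math.exp(z)
    return x * e / (1.0 + e)


if __name__ == "__main__":
    for v in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(f"x={v:+.1f}  relu={relu(v):.4f}  swish={swish(v):.4f}")
```

One plausible reading of the efficiency gap mentioned above is visible here: ReLU needs only a comparison per element, whereas Swish evaluates an exponential.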