Improving Quaternion Neural Networks with Quaternionic Activation Functions

In this paper, we propose novel quaternion activation functions that modify either the quaternion magnitude or the phase, as an alternative to the commonly used split activation functions. We define criteria that are relevant for quaternion activation functions and then derive our activation functions from this analysis. Instead of applying a known activation function such as ReLU or Tanh to each quaternion element separately, these activation functions take the quaternion properties into account and respect the quaternion space $\mathbb{H}$. In particular, all quaternion components are used to calculate all output components, which carries the benefit of the Hamilton product (used, e.g., in the quaternion convolution) over to the activation functions. The proposed activation functions can be incorporated into arbitrary quaternion-valued neural networks trained with gradient descent techniques. We further discuss the derivatives of the proposed activation functions and observe beneficial properties for those affecting the phase: they remain sensitive over essentially the whole input range, so improved gradient flow can be expected. We provide an extensive experimental evaluation of the proposed quaternion activation functions, including a comparison with the split ReLU and split Tanh on two image classification tasks using the CIFAR-10 and SVHN datasets. There, the quaternion activation functions affecting the phase in particular consistently provide better performance.
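
To make the contrast between split and quaternion-aware activations concrete, the following is a minimal sketch in plain Python. The abstract does not state the paper's formulas, so `magnitude_tanh` and `phase_scale` (and the parameter `alpha`) are hypothetical illustrations of "modifying the magnitude" and "modifying the phase", not the authors' definitions. A quaternion is represented as a tuple `(a, b, c, d)` and written in polar form as $q = |q|(\cos\theta + \mathbf{n}\sin\theta)$.

```python
import math

def split_relu(q):
    """Split activation: ReLU applied to each component independently."""
    return tuple(max(0.0, c) for c in q)

def magnitude_tanh(q, eps=1e-12):
    """Hypothetical magnitude activation (an assumption, not the paper's):
    squash |q| with tanh and keep the direction q / |q|."""
    norm = math.sqrt(sum(c * c for c in q))
    if norm < eps:
        return (0.0, 0.0, 0.0, 0.0)
    scale = math.tanh(norm) / norm
    return tuple(scale * c for c in q)

def phase_scale(q, alpha=0.5, eps=1e-12):
    """Hypothetical phase activation (an assumption, not the paper's):
    keep |q| and the rotation axis, scale the polar angle theta by alpha."""
    a, b, c, d = q
    vnorm = math.sqrt(b * b + c * c + d * d)   # length of the vector part
    if vnorm < eps:
        # Purely real input: the axis is undefined, so this sketch
        # simply returns the input unchanged.
        return q
    norm = math.sqrt(a * a + vnorm * vnorm)
    theta = math.atan2(vnorm, a)               # polar angle in [0, pi]
    new_theta = alpha * theta
    s = norm * math.sin(new_theta) / vnorm
    return (norm * math.cos(new_theta), s * b, s * c, s * d)

q = (1.0, -2.0, 3.0, -4.0)
print(split_relu(q))
print(magnitude_tanh(q))
print(phase_scale(q))
```

In `split_relu` each output component depends only on the matching input component, whereas in the two sketched alternatives every output component depends on all four inputs through the norm or the angle, which is the property the abstract emphasizes.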


