Selecting activation functions is a pivotal step in the design of neural networks, because these functions introduce the nonlinearity needed to capture intricate input-output patterns. While the effectiveness of adaptive, or trainable, activation functions has been studied in domains with ample data, such as image classification, significant gaps persist in understanding their influence on classification accuracy and predictive uncertainty when data are limited. This research addresses these gaps by investigating two types of adaptive activation functions: one in which all neurons in a hidden layer share the same trainable parameters, and one in which each neuron has its own. Both are examined in three testbeds derived from additive manufacturing problems, each containing fewer than one hundred training instances. Our investigation reveals that adaptive activation functions with individual trainable parameters, such as Exponential Linear Unit (ELU) and Softplus, yield accurate and confident prediction models that outperform both fixed-shape activation functions and the less flexible approach of using identical trainable activation functions within a hidden layer. This work therefore presents an elegant way of facilitating the design of adaptive neural networks for scientific and engineering problems.
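To make the distinction between shared and individual trainable parameters concrete, the following is a minimal forward-pass sketch, not the authors' implementation. It assumes the commonly used parameterizations ELU(z) = z for z > 0 and α(exp(z) − 1) otherwise, and Softplus(z) = log(1 + exp(βz)) / β, with α and β as the trainable shape parameters; the abstract does not state which parameters are trained. The helper names `layer_activation` and `count_activation_params` are hypothetical, and "individual" is read here as one parameter per neuron.

```python
import math


def elu(z, alpha):
    """ELU with shape parameter alpha (assumed to be the trainable parameter)."""
    return z if z > 0 else alpha * (math.exp(z) - 1.0)


def softplus(z, beta):
    """Softplus with sharpness beta (assumed to be the trainable parameter).

    Uses the numerically stable form max(x, 0) + log1p(exp(-|x|)), x = beta * z.
    """
    x = beta * z
    return (max(x, 0.0) + math.log1p(math.exp(-abs(x)))) / beta


def layer_activation(pre_activations, params, fn):
    """Apply fn to every neuron of one hidden layer (hypothetical helper).

    params is either a single number (shared: every neuron uses the same
    value) or a list with one value per neuron (individual).
    """
    if isinstance(params, (int, float)):
        params = [params] * len(pre_activations)
    if len(params) != len(pre_activations):
        raise ValueError("need one parameter per neuron, or a single shared one")
    return [fn(z, p) for z, p in zip(pre_activations, params)]


def count_activation_params(layer_widths, individual):
    """Number of extra trainable activation parameters in the network."""
    return sum(layer_widths) if individual else len(layer_widths)


# Pre-activations of a hypothetical hidden layer with three neurons.
z = [-1.0, 0.5, -2.0]
shared = layer_activation(z, 1.0, elu)                  # one alpha for the layer
individual = layer_activation(z, [0.5, 1.0, 2.0], elu)  # one alpha per neuron
```

In a full model these shape parameters would be updated by gradient descent together with the weights and biases; the sketch covers only the forward pass. Under this reading, the individual variant adds one parameter per hidden neuron rather than one per hidden layer, which is a small overhead even with fewer than one hundred training instances.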