Data augmentation has proven effective in training neural networks. A recently proposed method, RandAug, randomly selects data augmentation techniques from a predefined search space. RandAug has demonstrated significant performance improvements on image-related tasks while imposing minimal computational overhead. However, no prior work has explored applying RandAug to audio data augmentation, in which audio is converted into an image-like pattern. To address this gap, we introduce AudRandAug, an adaptation of RandAug for audio data. AudRandAug selects data augmentation policies from a dedicated audio search space. To evaluate its effectiveness, we conducted experiments using various models and datasets. Our findings indicate that AudRandAug outperforms other existing data augmentation methods in terms of accuracy.
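As a rough illustration of the random-selection idea described in the abstract, the sketch below draws a fixed number of operations uniformly at random from a small search space and applies them in sequence with a shared magnitude. It is a minimal sketch under stated assumptions, not the AudRandAug implementation: the four operations, the 1-D waveform representation, and the names `SEARCH_SPACE`, `rand_augment`, `num_ops`, and `magnitude` are all hypothetical and chosen only for illustration. The abstract does not list the actual audio search space.

```python
import random
from typing import Callable, List, Optional, Sequence

Waveform = List[float]
AugmentOp = Callable[[Waveform, float, random.Random], Waveform]


def gain(x: Waveform, magnitude: float, rng: random.Random) -> Waveform:
    # Scale the amplitude by a random factor in [1 - magnitude, 1 + magnitude].
    factor = 1.0 + rng.uniform(-magnitude, magnitude)
    return [s * factor for s in x]


def add_noise(x: Waveform, magnitude: float, rng: random.Random) -> Waveform:
    # Add zero-mean Gaussian noise; the 0.1 scale is an arbitrary assumption.
    return [s + rng.gauss(0.0, 0.1 * magnitude) for s in x]


def time_shift(x: Waveform, magnitude: float, rng: random.Random) -> Waveform:
    # Circularly shift by up to magnitude * len(x) samples in either direction.
    if not x:
        return list(x)
    max_shift = int(magnitude * len(x))
    k = rng.randint(-max_shift, max_shift) % len(x)
    return x[-k:] + x[:-k] if k else list(x)


def time_mask(x: Waveform, magnitude: float, rng: random.Random) -> Waveform:
    # Zero out one contiguous span covering int(magnitude * len(x)) samples.
    width = int(magnitude * len(x))
    if width == 0:
        return list(x)
    start = rng.randint(0, len(x) - width)
    return x[:start] + [0.0] * width + x[start + width:]


# Hypothetical search space; the paper's actual set of operations may differ.
SEARCH_SPACE: Sequence[AugmentOp] = (gain, add_noise, time_shift, time_mask)


def rand_augment(
    x: Waveform,
    num_ops: int = 2,
    magnitude: float = 0.3,
    rng: Optional[random.Random] = None,
    ops: Sequence[AugmentOp] = SEARCH_SPACE,
) -> Waveform:
    # Pick num_ops operations uniformly at random (with replacement) and
    # apply them one after another, all using the same magnitude in [0, 1].
    if rng is None:
        rng = random.Random()
    out = list(x)
    for _ in range(num_ops):
        op = rng.choice(ops)
        out = op(out, magnitude, rng)
    return out
```

A call such as `rand_augment(waveform, num_ops=2, magnitude=0.3, rng=random.Random(0))` would return an augmented copy of the same length. Because the policy is sampled rather than learned, the only hyperparameters in this sketch are the number of operations and the shared magnitude, which is one plausible reading of why the approach adds little computational overhead.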