Deep learning has achieved remarkable results in many domains, but it still depends on large-scale labeled samples. Data augmentation has therefore become a critical strategy for training deep learning models. However, data augmentation suffers from information loss and performs poorly in small-sample settings. To overcome these drawbacks, we propose a feature augmentation method based on shape space theory, namely feature augmentation on a geodesic curve, or FAGC for short. First, we extract features from the images with a neural network model. Then, the extracted image features are projected into a pre-shape space. In the pre-shape space, a geodesic curve is built to fit the features. Finally, the features generated along the geodesic curve are used to train various machine learning models. The FAGC module can be seamlessly integrated with most machine learning methods. The proposed method is simple, effective, and insensitive to small sample sizes. Several examples demonstrate that FAGC can greatly improve the performance of the data preprocessing model in a small-sample environment.
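The pipeline described above (project features into a pre-shape space, build a geodesic curve, sample new features along it) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each feature is a flat vector, that the pre-shape projection consists of centering and scaling to unit norm (so pre-shapes lie on a unit hypersphere), and that the geodesic is the great-circle arc between two pre-shapes rather than a curve fitted to many features. The function names `to_preshape` and `geodesic_points` are hypothetical, and random vectors stand in for features extracted by a neural network.

```python
import numpy as np


def to_preshape(feature):
    """Center a feature vector and scale it to unit norm.

    Assumption: this centering-plus-normalization step stands in for the
    pre-shape projection; the result is a point on the unit hypersphere.
    """
    v = np.asarray(feature, dtype=float).ravel()
    v = v - v.mean()
    norm = np.linalg.norm(v)
    if norm == 0:
        raise ValueError("a constant feature vector has no pre-shape")
    return v / norm


def geodesic_points(p, q, num=5):
    """Sample `num` points on the great-circle arc from p to q.

    Both inputs must be unit vectors. The endpoints are included.
    """
    cos_theta = np.clip(np.dot(p, q), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-8:
        # The two pre-shapes coincide; the geodesic is a single point.
        return np.tile(p, (num, 1))
    if np.pi - theta < 1e-8:
        raise ValueError("antipodal points do not define a unique geodesic")
    s = np.linspace(0.0, 1.0, num)[:, None]
    # Spherical linear interpolation keeps every sample on the unit sphere.
    return (np.sin((1 - s) * theta) * p + np.sin(s * theta) * q) / np.sin(theta)


# Random vectors stand in for features extracted by a neural network.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=8), rng.normal(size=8)
p, q = to_preshape(f1), to_preshape(f2)
augmented = geodesic_points(p, q, num=5)  # 5 features, endpoints included
```

Every row of `augmented` remains centered and has unit norm, so the generated features stay in the same pre-shape space as the inputs and could be passed to a downstream classifier alongside the original features.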