This paper introduces AutoGCN, a generic Neural Architecture Search (NAS) algorithm for Human Activity Recognition (HAR) using Graph Convolution Networks (GCNs). HAR has gained attention due to advances in deep learning, increased data availability, and enhanced computational capabilities. At the same time, GCNs have shown promising results in modeling relationships between body key points in a skeletal graph. Although domain experts often craft dataset-specific GCN-based methods, the applicability of these methods beyond their original context is severely limited. AutoGCN seeks to address this limitation by simultaneously searching for the best combination of hyperparameters and architecture within a versatile search space, using a reinforcement learning controller and a knowledge reservoir that balances exploration and exploitation during the search. We conduct extensive experiments on two large-scale datasets for skeleton-based action recognition to assess the proposed algorithm's performance. Our experimental results underscore the effectiveness of AutoGCN in constructing optimal GCN architectures for HAR, outperforming conventional NAS and GCN methods, as well as random search. These findings highlight the significance of a diverse search space and an expressive input representation for enhancing network performance and generalizability.