Hierarchical Compositional Representations for Few-shot Action Recognition

Action recognition has attracted growing attention because of its broad practical applications in intelligent surveillance and human-computer interaction. Few-shot action recognition, however, remains underexplored and challenging because labeled data are scarce. In this paper, we propose a novel hierarchical compositional representations (HCR) learning approach for few-shot action recognition. Specifically, we divide a complicated action into several sub-actions through a carefully designed hierarchical clustering and further decompose each sub-action into more fine-grained spatially attentional sub-actions (SAS-actions). Although base classes and novel classes can differ substantially, they may share similar patterns at the sub-action or SAS-action level. Furthermore, we adopt the Earth Mover's Distance from the transportation problem to measure the similarity between video samples in terms of their sub-action representations. It uses the optimal matching flows between sub-actions as the distance metric, which is well suited to comparing fine-grained patterns. Extensive experiments show that our method achieves state-of-the-art results on the HMDB51, UCF101 and Kinetics datasets.
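
The idea of comparing two videos through optimal matching flows between their sub-actions can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that both videos have the same number of sub-action feature vectors, that every sub-action carries equal weight, and that the ground cost is the cosine distance; none of these choices is stated in the abstract. Under those assumptions the transportation problem reduces to a one-to-one assignment, which the hypothetical helper `emd_uniform` solves by brute force over permutations, so it is practical only for a handful of sub-actions.

```python
import math
from itertools import permutations


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def emd_uniform(feats_a, feats_b):
    """EMD between two equally sized sets of sub-action features.

    Assumption: uniform weights, so an optimal flow is a one-to-one
    matching and can be found by trying every permutation.
    """
    n = len(feats_a)
    if n != len(feats_b):
        raise ValueError("this sketch needs the same number of sub-actions")
    # Ground cost between every pair of sub-actions.
    cost = [[cosine_distance(a, b) for b in feats_b] for a in feats_a]
    best_cost, best_match = float("inf"), None
    for perm in permutations(range(n)):
        # Each matched pair moves a mass of 1/n.
        total = sum(cost[i][perm[i]] for i in range(n)) / n
        if total < best_cost:
            best_cost, best_match = total, perm
    return best_cost, best_match


# Toy features: video_b holds the sub-actions of video_a, reordered and rescaled.
video_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
video_b = [[0.0, 0.0, 2.0], [3.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
distance, matching = emd_uniform(video_a, video_b)
print(distance, matching)
```

In this toy case the matching pairs each sub-action of `video_a` with its reordered copy in `video_b`, so the distance is zero even though the sub-actions appear in a different order. With unequal set sizes or non-uniform weights, a general transportation solver would be needed instead of the permutation search.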


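Few-shot learning

Few-shot learning (FSL) addresses how to improve the performance of a neural network when only a small amount of data is available. One family of methods commonly used in FSL is meta-learning. Like ordinary neural network training, meta-learning has a training phase and a testing phase, which are called meta-training and meta-testing respectively.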
