The Tiny Actions Challenge focuses on understanding human activities in real-world surveillance. Activity recognition in this scenario faces two main difficulties. First, human activities are often recorded at a distance, so they appear at low resolution with few discriminative cues. Second, these activities naturally follow a long-tailed distribution, and the data bias caused by such heavy category imbalance is hard to alleviate. To tackle these problems, we propose a comprehensive recognition solution in this paper. First, we train video backbones with data balancing in order to alleviate overfitting on the challenge benchmark. Second, we design a dual-resolution distillation framework that effectively guides low-resolution action recognition with super-resolution knowledge. Finally, we apply model ensembling with post-processing, which further boosts performance on the long-tailed categories. Our solution ranks first on the leaderboard.
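To make the first two components more concrete, the following is a minimal sketch and not the implementation used in this work. The abstract does not specify the balancing scheme or the distillation loss, so both choices here are assumptions: inverse-frequency class weights stand in for data balancing, and a temperature-scaled KL divergence between a high-resolution (teacher) branch and a low-resolution (student) branch stands in for dual-resolution distillation. The function names and the default temperature are hypothetical.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_sum_exp for x in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Assumption: the teacher sees the super-resolved or high-resolution clip
    and the student sees the low-resolution clip. The T^2 factor is the
    usual scaling that keeps gradient magnitudes comparable across T.
    """
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2


def inverse_frequency_weights(class_counts):
    """Per-class weights proportional to 1 / frequency.

    Assumption: one simple form of data balancing. The weights are scaled
    so that a perfectly balanced dataset gives every class a weight of 1.
    """
    num_classes = len(class_counts)
    total = sum(class_counts)
    return [total / (num_classes * count) for count in class_counts]


if __name__ == "__main__":
    # Hypothetical logits for one clip at the two resolutions.
    teacher = [2.0, 0.5, -1.0]
    student = [1.0, 0.8, -0.5]
    print("distillation loss:", distillation_loss(student, teacher))
    # Hypothetical head and tail class counts.
    print("class weights:", inverse_frequency_weights([90, 10]))
```

In such a setup the distillation term would typically be added to the ordinary classification loss of the low-resolution branch, and the class weights would drive either a weighted sampler or a weighted loss.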