Video processing generally falls into two categories: processing the entire video, which typically yields the best classification results, and real-time processing, where the objective is to reach a decision as quickly as possible. The latter is often driven by the need to rapidly identify potentially critical or dangerous situations, such as machine failure, traffic accidents, heart problems, or dangerous behavior. Although models for processing entire videos are typically well defined and clearly presented in the literature, this is not the case for online processing, where a plethora of hand-devised methods exist. To address this, we present \our{}, a novel, unified, and theoretically grounded adaptation framework for the online classification of video data. We first establish a rigorous mathematical foundation for classifying sequential data when a decision may be made at an early stage. This allows us to construct a natural function that encourages the model to return an outcome much sooner. We then demonstrate a straightforward and readily implementable method for adapting offline models to online, recurrent operation. Finally, a comparison with the non-online state-of-the-art baseline shows that \our{} encourages the network to make earlier classification decisions without compromising accuracy.