Accurately detecting student behavior in classroom videos helps analyze students' classroom status and improve teaching efficiency. However, existing student classroom behavior detection methods commonly suffer from low accuracy. To address this issue, we propose a spatio-temporal attention-based method for detecting student classroom behaviors (BDSTA). First, the SlowFast network is used to generate feature maps of motion and environmental information from the video. Then, a spatio-temporal attention module, comprising information aggregation, compression, and stimulation processes, is applied to the feature maps. Next, attention maps in the time, channel, and space dimensions are obtained, and multi-label behavior classification is performed based on these attention maps. To address the long-tail data problem in student classroom behavior datasets, we use an improved focal loss function that assigns more weight to tail-class data during training. Experiments are conducted on a self-made student classroom behavior dataset named STSCB. Compared with the SlowFast model, BDSTA improves the average accuracy of student behavior classification by 8.94%.
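To illustrate how a focal loss can assign more weight to tail classes in a multi-label setting, the following is a minimal sketch. It is not the improved focal loss used in BDSTA, whose exact form is not given here. The inverse-frequency weighting in `class_weights`, the default `gamma=2.0`, and all function and parameter names are assumptions made for illustration only.

```python
import math


def class_weights(counts):
    """Per-class weights inversely proportional to class frequency.

    Assumption: inverse-frequency weighting, normalised so the weights
    average to 1. This is one common choice, not the paper's scheme.
    """
    inverse = [1.0 / c for c in counts]
    mean_inverse = sum(inverse) / len(inverse)
    return [w / mean_inverse for w in inverse]


def focal_loss(probs, targets, weights, gamma=2.0, eps=1e-7):
    """Mean multi-label focal loss for a single sample.

    probs   : per-class sigmoid outputs in (0, 1)
    targets : per-class labels, each 0 or 1
    weights : per-class weights (larger for tail classes)
    gamma   : focusing parameter; gamma=0 reduces to weighted cross-entropy
    """
    total = 0.0
    for p, y, w in zip(probs, targets, weights):
        # Probability the model assigns to the true label of this class.
        p_t = p if y == 1 else 1.0 - p
        # Clamp to avoid log(0).
        p_t = min(max(p_t, eps), 1.0 - eps)
        # Easy examples (p_t near 1) are down-weighted by (1 - p_t)^gamma.
        total += -w * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# Hypothetical class counts: one head class and one tail class.
weights = class_weights([900, 100])
loss = focal_loss([0.9, 0.3], [1, 1], weights)
```

With these hypothetical counts the tail class receives a larger weight than the head class, so a misclassified tail-class example contributes more to the loss than an equally misclassified head-class example.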