This paper presents a methodological analysis of the gesture-recognition approach proposed by Liu and Szirányi, focusing on the validity of their evaluation protocol. We show that the reported near-perfect accuracy results from a frame-level random train-test split, which places frames from the same subjects in both sets and thereby causes severe data leakage. By examining the published confusion matrix, learning curves, and dataset construction, we demonstrate that the evaluation does not measure generalization to unseen individuals. Our findings underscore the importance of subject-independent data partitioning in vision-based gesture-recognition research, especially for applications such as UAV-human interaction that require reliable recognition of gestures performed by previously unseen people.
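The difference between a frame-level random split and a subject-independent split can be illustrated with a minimal sketch. It uses a toy dataset of (subject, frame) pairs rather than the dataset analyzed in the paper. All function names, the subject count, and the split fraction are assumptions chosen for illustration, not details of Liu and Szirányi's pipeline.

```python
import random


def make_frames(n_subjects=6, frames_per_subject=100):
    """Build a toy dataset: one (subject_id, frame_index) pair per frame."""
    return [(s, f) for s in range(n_subjects) for f in range(frames_per_subject)]


def frame_level_split(frames, test_fraction=0.2, seed=0):
    """Shuffle individual frames and cut: the protocol criticized above."""
    rng = random.Random(seed)
    shuffled = frames[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def subject_level_split(frames, test_fraction=0.2, seed=0):
    """Hold out whole subjects so no person appears in both sets."""
    rng = random.Random(seed)
    subjects = sorted({s for s, _ in frames})
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train = [fr for fr in frames if fr[0] not in test_subjects]
    test = [fr for fr in frames if fr[0] in test_subjects]
    return train, test


def shared_subjects(train, test):
    """Return the subjects that occur in both the training and the test set."""
    return {s for s, _ in train} & {s for s, _ in test}


if __name__ == "__main__":
    frames = make_frames()
    for name, split in [("frame-level", frame_level_split),
                        ("subject-level", subject_level_split)]:
        train, test = split(frames)
        overlap = shared_subjects(train, test)
        print(f"{name}: {len(overlap)} subject(s) appear in both sets")
```

In this toy setup the frame-level split always leaves at least one subject on both sides, while the subject-level split never does. A model evaluated under the first protocol can therefore score well by recognizing people it has already seen. With scikit-learn, the same effect is usually obtained through a group-aware splitter such as `GroupShuffleSplit`, passing subject identifiers as the groups.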