Recent advances in artificial intelligence and computer vision have enabled gait analysis on portable devices such as cell phones. However, most state-of-the-art vision-based systems still impose numerous constraints on capturing a patient's video, such as using a static camera and maintaining a specific distance from it. While these constraints are manageable under professional supervision, they pose challenges in home settings. Another issue with most vision-based systems is their output, typically a classification label and a confidence value, whose reliability is often questioned by medical professionals. This paper addresses these challenges with a novel gait-analysis system that is robust to camera movement and provides explanations for its output. The study uses a dataset of videos of subjects wearing one of two types of Knee Ankle Foot Orthosis (KAFO) for mobility, namely "Locked Knee" and "Semi-flexion", along with metadata and ground truth for the explanations. The ground truth establishes the statistical significance of seven features, captured with motion-capture systems, in differentiating the two gaits. To handle camera movement, the proposed system applies super-resolution and pose estimation during pre-processing. It then computes the seven features from the skeletal output of pose estimation: Stride Length; Step Length and Duration of Single Support for both the orthotic and the non-orthotic leg; Cadence; and Speed. These features train a multi-layer perceptron, whose output is explained by highlighting each feature's contribution to the classification. Whereas most state-of-the-art systems struggle to process the videos or to train on the proposed dataset, our system achieves an average accuracy of 94%. The model's explanations are validated against the ground truth and can be considered reliable.
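To make the feature-extraction step concrete, the sketch below shows one plausible way to derive a subset of the gait features (cadence, step length, stride length, speed) from per-frame ankle coordinates such as those produced by a pose estimator. This is a minimal illustration, not the paper's implementation: the heel-strike heuristic (peaks of inter-ankle separation), the function name, and the use of raw x-coordinates are all assumptions, and the full system additionally computes single-support durations per leg and converts pixel distances to metric units.

```python
import numpy as np

def gait_features(left_ankle_x, right_ankle_x, fps):
    """Hypothetical sketch: estimate cadence, step length, stride length,
    and speed from per-frame ankle x-coordinates (pose-estimation output).

    Heel-strike events are approximated as local maxima of the
    inter-ankle separation; the real system would use a more robust
    event detector and calibrated, camera-motion-compensated coordinates.
    """
    sep = np.abs(np.asarray(left_ankle_x) - np.asarray(right_ankle_x))

    # Local maxima of ankle separation ~ heel-strike events.
    strikes = [i for i in range(1, len(sep) - 1)
               if sep[i] >= sep[i - 1] and sep[i] > sep[i + 1]]
    if len(strikes) < 2:
        return None  # not enough gait events to estimate anything

    step_times = np.diff(strikes) / fps        # seconds per step
    cadence = 60.0 / step_times.mean()         # steps per minute
    step_len = sep[strikes].mean()             # mean step length (pixel units here)
    stride_len = 2.0 * step_len                # one stride = two consecutive steps
    speed = step_len / step_times.mean()       # distance covered per second

    return {"cadence": cadence, "step_length": step_len,
            "stride_length": stride_len, "speed": speed}
```

On a synthetic 1 Hz oscillation of the ankles sampled at 30 fps, this recovers a cadence of about 120 steps per minute (two heel strikes per gait cycle), which is a quick sanity check that the event detection behaves as intended.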