A popular and affordable option for room-scale human behaviour tracking is to rely on commodity RGB-D sensors, such as the Kinect family of devices, which offer body-tracking capabilities at a reasonable price point. While these capabilities may suffice for applications such as entertainment systems, where a person plays in front of a television, RGB-D sensors are sensitive to occlusions from objects or other people in more complex room-scale setups. To alleviate the occlusion issue, but also to extend the tracking range and improve accuracy, it is possible to rely on multiple RGB-D sensors and perform data fusion. Unfortunately, fusing the data in a meaningful manner raises additional challenges: the sensors must be calibrated relative to each other to provide a common frame of reference, and the skeletons they report must be matched and merged when the data are actually combined. In this paper, we discuss our approach to tackling these challenges and present the results we achieved, in the form of aligned point clouds and combined skeleton lists. These results enable unobtrusive and occlusion-resilient human behaviour tracking at room scale, which may serve as input for interactive applications as well as (possibly remote) collaborative systems.
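The calibration step mentioned above amounts to estimating, for each sensor, a rigid transformation into a shared frame of reference. As an illustration only (the paper's actual calibration procedure is not detailed here), a minimal sketch of such an estimate from corresponding 3D points is the classical Kabsch/Umeyama least-squares fit; the function name and point format are assumptions:

```python
import numpy as np

def rigid_transform(src, dst):
    """Estimate rotation R and translation t such that R @ p + t maps
    points of `src` onto `dst` (both (N, 3) arrays of corresponding 3D
    points), via the Kabsch least-squares fit."""
    src_mean = src.mean(axis=0)
    dst_mean = dst.mean(axis=0)
    # Cross-covariance of the centred point sets
    H = (src - src_mean).T @ (dst - dst_mean)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (det = -1) in the optimal orthogonal fit
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t
```

Applied to correspondences observed by two sensors (e.g. a tracked marker or matched skeleton joints), the returned pair (R, t) expresses one sensor's data in the other's frame; point clouds can then be aligned by applying the same transform.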
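Once the sensors share a common frame, skeleton lists from different sensors must be matched (which detections are the same person?) and merged. A minimal sketch, assuming skeletons are (J, 4) arrays of per-joint x, y, z and confidence, greedy nearest-centroid matching, and a hypothetical 0.5 m threshold; the paper's actual matching and merging strategy may differ:

```python
import numpy as np

def match_and_merge(skels_a, skels_b, threshold=0.5):
    """Match two per-sensor skeleton lists by centroid distance (greedy,
    within `threshold` metres), then merge matched pairs by
    confidence-weighted averaging of joint positions."""
    centroids_a = [s[:, :3].mean(axis=0) for s in skels_a]
    centroids_b = [s[:, :3].mean(axis=0) for s in skels_b]
    merged, used_b = [], set()
    for i, ca in enumerate(centroids_a):
        best_j, best_d = None, threshold
        for j, cb in enumerate(centroids_b):
            d = np.linalg.norm(ca - cb)
            if j not in used_b and d < best_d:
                best_j, best_d = j, d
        if best_j is None:
            merged.append(skels_a[i])  # person seen by sensor A only
        else:
            used_b.add(best_j)
            a, b = skels_a[i], skels_b[best_j]
            w = a[:, 3:] + b[:, 3:] + 1e-9  # per-joint confidence weights
            joints = (a[:, :3] * a[:, 3:] + b[:, :3] * b[:, 3:]) / w
            conf = np.maximum(a[:, 3], b[:, 3])
            merged.append(np.column_stack([joints, conf]))
    # Append persons seen by sensor B only
    merged += [skels_b[j] for j in range(len(skels_b)) if j not in used_b]
    return merged
```

Weighting by per-joint confidence lets a sensor with a clear view dominate joints that the other sensor sees occluded, which is what makes the fused skeleton occlusion-resilient.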