In this paper, we present a novel algorithm that extracts a quaternion from a two-dimensional camera frame in order to estimate the skeletal pose of a human contained in that frame. Pose estimation is usually tackled with stereo cameras and inertial measurement units, which provide the depth and Euclidean distance needed to measure points in 3D space. However, these devices come with high signal-processing latency as well as significant monetary cost. By making use of MediaPipe, a framework for building perception pipelines for human pose estimation, the proposed algorithm extracts a quaternion from a 2D frame capturing an image of a human subject at a latency below fifty milliseconds. It can also be deployed on edge devices that have a single camera and generally limited computational resources, especially for use cases involving last-minute detection and reaction by autonomous robots. The algorithm seeks to lower the funding barrier and improve accessibility for robotics researchers involved in designing control systems.
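The abstract does not state how the quaternion is derived from the detected landmarks. As one illustrative possibility only, and not necessarily the method proposed here, the orientation of a single bone can be expressed as the shortest-arc rotation that takes a reference axis onto the bone direction. The sketch below assumes that landmark coordinates are already available as `(x, y, z)` tuples, such as those a pose estimator might return; the function names `quaternion_between` and `rotate`, the reference axis, and the landmark values are hypothetical.

```python
import math


def _normalize(v):
    """Scale a vector (of any length) to unit norm."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(c / n for c in v)


def _cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def quaternion_between(u, v, eps=1e-8):
    """Unit quaternion (w, x, y, z) for the shortest-arc rotation taking u onto v."""
    u, v = _normalize(u), _normalize(v)
    d = sum(a * b for a, b in zip(u, v))
    if d < -1.0 + eps:
        # Antiparallel vectors: rotate 180 degrees about any axis perpendicular to u.
        helper = (1.0, 0.0, 0.0) if abs(u[0]) < 0.9 else (0.0, 1.0, 0.0)
        axis = _normalize(_cross(u, helper))
        return (0.0, *axis)
    # (1 + u.v, u x v) is proportional to the half-angle quaternion.
    return _normalize((1.0 + d, *_cross(u, v)))


def rotate(q, p):
    """Rotate 3-vector p by unit quaternion q = (w, x, y, z)."""
    w, r = q[0], q[1:]
    t = _cross(r, p)
    inner = tuple(t[i] + w * p[i] for i in range(3))
    c = _cross(r, inner)
    return tuple(p[i] + 2.0 * c[i] for i in range(3))


# Hypothetical landmark coordinates (assumed values, not measured data).
shoulder = (0.42, 0.35, -0.10)
elbow = (0.55, 0.50, -0.05)
bone = tuple(e - s for e, s in zip(elbow, shoulder))

# Assumed reference axis: the image y-axis.
upper_arm_q = quaternion_between((0.0, 1.0, 0.0), bone)
```

Applying `rotate(upper_arm_q, (0.0, 1.0, 0.0))` recovers the normalized bone direction, which gives a quick sanity check on the result. A shortest-arc quaternion leaves the twist about the bone axis undetermined, so a full joint orientation would need a second reference direction.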