Accurately localizing objects in three dimensions (3D) is crucial for many computer vision applications, such as robotics, autonomous driving, and augmented reality. Sports analytics is another important application, and in this work we present a novel method for 3D basketball localization from a single calibrated image. Taking the image and the ball's location in it as inputs, our approach estimates the ball's projection onto the ground plane within the image, which yields the ball's height in pixels in image space. The 3D coordinates of the ball are then reconstructed using the known projection matrix. Extensive experiments on the public DeepSport dataset, which provides ground-truth 3D ball locations and camera calibration information for each image, demonstrate the effectiveness of our method, with substantial accuracy improvements over recent work. Our work opens up new possibilities for ball tracking and understanding, advancing computer vision in diverse domains. The source code of this work is publicly available at \url{https://github.com/gabriel-vanzandycke/deepsport}.
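The reconstruction step described above can be illustrated with a minimal sketch. It is not taken from the released code, and it rests on several assumptions: a distortion-free pinhole camera with a 3x4 projection matrix `P`, a world frame in which the ground is the plane Z = 0, and a ground-projection pixel that has already been obtained from the predicted pixel height. The function names (`look_at_camera`, `project`, `ground_point_from_pixel`, `ball_3d_from_ground_projection`) are hypothetical and serve illustration only.

```python
import numpy as np


def look_at_camera(center, target, focal=1000.0, principal=(640.0, 360.0)):
    """Build a synthetic 3x4 projection matrix (demo helper, world Z axis up)."""
    center = np.asarray(center, dtype=float)
    forward = np.asarray(target, dtype=float) - center
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, [0.0, 0.0, 1.0])
    right /= np.linalg.norm(right)
    down = np.cross(forward, right)
    R = np.stack([right, down, forward])  # rows are the camera axes x, y, z
    K = np.array([[focal, 0.0, principal[0]],
                  [0.0, focal, principal[1]],
                  [0.0, 0.0, 1.0]])
    return K @ np.hstack([R, (-R @ center)[:, None]])


def project(P, point):
    """Project a 3D world point to pixel coordinates."""
    x = P @ np.append(point, 1.0)
    return x[:2] / x[2]


def ground_point_from_pixel(P, pixel):
    """Back-project a pixel assumed to lie on the ground plane Z = 0."""
    H = P[:, [0, 1, 3]]  # homography between the plane Z = 0 and the image
    X, Y, W = np.linalg.solve(H, np.array([pixel[0], pixel[1], 1.0]))
    return np.array([X / W, Y / W])


def ball_3d_from_ground_projection(P, ball_px, ground_px):
    """Recover (X, Y, Z) from the ball pixel and its ground-projection pixel."""
    X, Y = ground_point_from_pixel(P, ground_px)
    # Points on the vertical line through (X, Y, 0) project to a + Z * b.
    a = P @ np.array([X, Y, 0.0, 1.0])
    b = P[:, 2]
    m = np.array([ball_px[0], ball_px[1], 1.0])
    # Require m to be parallel to a + Z * b: m x a + Z * (m x b) = 0,
    # solved for Z in the least-squares sense.
    ca, cb = np.cross(m, a), np.cross(m, b)
    Z = -(cb @ ca) / (cb @ cb)
    return np.array([X, Y, Z])


# Usage on synthetic data: project a known ball position, then recover it.
P = look_at_camera(center=(0.0, -10.0, 5.0), target=(0.0, 0.0, 0.0))
ball = np.array([1.5, 2.0, 3.0])
ball_px = project(P, ball)
ground_px = project(P, [ball[0], ball[1], 0.0])
recovered = ball_3d_from_ground_projection(P, ball_px, ground_px)
```

The ground pixel fixes the horizontal position through the plane-to-image homography, and the ball pixel then fixes the height along the vertical line through that point. With a noisy ground-pixel estimate the two constraints no longer agree exactly, which is why the height is solved in the least-squares sense.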