Accurate and reproducible documentation of the experimental process is crucial to the development of science. Automatically recognizing the actions in experiments from videos would help experimenters by complementing their records of experiments. Towards this goal, we propose FineBio, a new fine-grained video dataset of people performing biological experiments. The dataset consists of multi-view videos of 32 participants performing mock biological experiments, with a total duration of 14.5 hours. Each experiment forms a hierarchical structure in which a protocol consists of several steps, each further decomposed into a set of atomic operations. Biological experiments are unique in that they require strict adherence to the steps described in each protocol while leaving freedom in the order of atomic operations. We provide hierarchical annotations of protocols, steps, atomic operations, object locations, and their manipulation states, posing new challenges for structured activity understanding and hand-object interaction recognition. To identify the challenges of activity understanding in biological experiments, we introduce baseline models and report results on four tasks: (i) step segmentation, (ii) atomic operation detection, (iii) object detection, and (iv) manipulated/affected object detection. The dataset and code are available at https://github.com/aistairc/FineBio.
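As a rough illustration of the protocol, step, and atomic-operation hierarchy described in the abstract, the sketch below models it with plain Python dataclasses. All class names, field names, and the `follows_protocol` helper are hypothetical and are not taken from the FineBio annotation format or code. The sketch also assumes that "freedom in the order of atomic operations" means the same operations may occur in any order within a step.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AtomicOperation:
    # Hypothetical fields; the actual FineBio annotation schema may differ.
    verb: str          # e.g. "take", "open"
    manipulated: str   # object being manipulated
    start: float       # start time in seconds
    end: float         # end time in seconds


@dataclass
class Step:
    name: str
    operations: List[AtomicOperation] = field(default_factory=list)


@dataclass
class Protocol:
    name: str
    steps: List[Step] = field(default_factory=list)


def follows_protocol(observed: Protocol, reference: Protocol) -> bool:
    """Check strict step order while ignoring atomic-operation order.

    Assumption: within a step, the observed operations must match the
    reference operations as an unordered collection.
    """
    # Steps must appear in exactly the reference order.
    if [s.name for s in observed.steps] != [s.name for s in reference.steps]:
        return False
    # Within each step, compare operations without regard to order.
    for obs, ref in zip(observed.steps, reference.steps):
        obs_ops = sorted((o.verb, o.manipulated) for o in obs.operations)
        ref_ops = sorted((o.verb, o.manipulated) for o in ref.operations)
        if obs_ops != ref_ops:
            return False
    return True
```

Under these assumptions, a recording in which two atomic operations are swapped inside one step still matches the reference, whereas a recording with two steps swapped does not.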