The classification of microscopy videos capturing complex cellular behaviors is crucial for understanding and quantifying the dynamics of biological processes over time. However, it remains a frontier in computer vision, requiring approaches that effectively model the shape and motion of objects without rigid boundaries, extract hierarchical spatiotemporal features from entire image sequences rather than static frames, and account for multiple objects within the field of view. To this end, we organized the Cell Behavior Video Classification Challenge (CBVCC), benchmarking 35 methods based on three approaches: classification of tracking-derived features, end-to-end deep learning architectures that directly learn spatiotemporal features from the entire video sequence without explicit cell tracking, and ensembles of tracking-derived and image-derived features. We discuss the results achieved by the participants and compare the potential and limitations of each approach, providing a basis for the development of computer vision methods for studying cellular dynamics.