Determining cell identities in imaging sequences is an important yet challenging task. The conventional approach, cell tracking, is complex and can be time-consuming. In this study, we propose a machine learning approach to cell identification during early $\textit{C. elegans}$ embryogenesis. Identifying cells during $\textit{C. elegans}$ embryogenesis would provide insights into neural development, with implications for higher organisms including humans. We employed random forest, multilayer perceptron (MLP), and long short-term memory (LSTM) models and tested their cell classification accuracy on 3D time-lapse confocal datasets spanning the first 4 hours of embryogenesis. By leveraging a small number of spatio-temporal features of individual cells, including cell trajectory and cell fate information, our models achieve an accuracy of over 91%, even with limited data. We also determine which features contribute most to the predictions and interpret them in the context of biological knowledge. Our results demonstrate that cell identities in time-lapse imaging sequences can be predicted directly from simple spatio-temporal features.
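As a rough illustration of what "simple spatio-temporal features of individual cells" could look like, the following minimal sketch summarizes one cell's 3D trajectory as a small feature dictionary. The function name `trajectory_features`, the chosen features, and the toy coordinates are all assumptions made for illustration; the abstract does not specify the actual feature set, and this is not the authors' implementation. Features of this kind would then be passed to a classifier such as a random forest.

```python
import math
from statistics import mean


def trajectory_features(track):
    """Summarize one cell's trajectory as a few spatio-temporal features.

    `track` is a hypothetical input format: a time-sorted list of
    (time, x, y, z) samples for a single cell.
    """
    if len(track) < 2:
        raise ValueError("need at least two samples per cell")

    times = [sample[0] for sample in track]
    coords = [sample[1:] for sample in track]

    # Total distance travelled, summed over consecutive samples.
    path_length = sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))
    duration = times[-1] - times[0]

    return {
        "start_time": times[0],
        "duration": duration,
        "mean_x": mean(c[0] for c in coords),
        "mean_y": mean(c[1] for c in coords),
        "mean_z": mean(c[2] for c in coords),
        # Straight-line distance between first and last observed position.
        "net_displacement": math.dist(coords[0], coords[-1]),
        "mean_speed": path_length / duration if duration > 0 else 0.0,
    }


# Toy trajectory with made-up numbers, not data from the study.
toy_track = [
    (0.0, 0.0, 0.0, 0.0),
    (1.0, 3.0, 4.0, 0.0),
    (2.0, 3.0, 4.0, 12.0),
]
features = trajectory_features(toy_track)
```

Each cell yields one fixed-length feature vector regardless of how many time points were imaged, which is one plausible reason such features suit models like random forests when data are limited.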