Development and homeostasis in multicellular systems both require exquisite control over the formation and maintenance of spatial molecular patterns. Advances in spatially resolved, high-throughput molecular imaging methods such as multiplexed immunofluorescence and spatial transcriptomics (ST) provide exciting new opportunities to deepen our fundamental understanding of these processes in health and disease. The large and complex datasets resulting from these techniques, particularly ST, have led to the rapid development of innovative machine learning (ML) tools, primarily based on deep learning. These ML tools are increasingly featured in integrated experimental and computational workflows to disentangle signal from noise in complex biological systems. However, it can be difficult to understand and balance the differing implicit assumptions and methodologies within the rapidly expanding ST analysis toolbox. To address this, we summarize the major ST analysis goals that ML can help achieve, along with current analysis trends. We also describe four major data science concepts and related heuristics that can help guide practitioners in choosing the right tools for the right biological questions.