We propose a novel way of representing and analysing single-cell genomic count data: modelling the observed count matrix as a network adjacency matrix. This perspective allows theory from stochastic network modelling to be applied to such data in a principled way, providing new ways to view and analyse them and giving a first-principles theoretical justification for established, successful methods. We demonstrate the approach in three cell-biological settings drawn from the epiblast/epithelial/neural lineage. New technology has made it possible to gather genomic data from single cells at unprecedented scale, which brings new challenges, in particular much higher levels of heterogeneity between individual cells than expected. Making the most of these new types of data requires novel, tailored computational-statistical methodology, developed through collaboration between mathematical and biomedical scientists.