Estimating the structure of a graph from observed data is of growing interest for high-throughput genomic data, and for single-cell RNA sequencing in particular. These applications are challenging, however, because the data consist of high-dimensional counts with high variance and an excess of zeros. Here, we present a general framework for learning the structure of a graph from single-cell RNA-seq data, based on the zero-inflated negative binomial distribution. We demonstrate with simulations that our approach can recover the structure of a graph in a variety of settings, and we show the utility of the approach on real data.
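To make the distributional assumption concrete, the following minimal sketch evaluates a zero-inflated negative binomial probability mass function. It assumes one common parameterization (mean `mu`, inverse-dispersion `theta`, and zero-inflation probability `pi`), which may differ from the one used in the framework itself. The function names `nb_pmf` and `zinb_pmf` are hypothetical and chosen only for illustration.

```python
import math


def nb_pmf(y, mu, theta):
    """Negative binomial pmf with mean mu > 0 and inverse-dispersion theta > 0.

    Assumed parameterization: Var(Y) = mu + mu**2 / theta.
    """
    log_p = (
        math.lgamma(y + theta) - math.lgamma(y + 1) - math.lgamma(theta)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, theta, pi):
    """Zero-inflated negative binomial pmf.

    With probability pi the count is a structural zero; otherwise it is
    drawn from the negative binomial component.
    """
    point_mass = pi if y == 0 else 0.0
    return point_mass + (1.0 - pi) * nb_pmf(y, mu, theta)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from any real data set.
    mu, theta, pi = 2.0, 0.5, 0.3
    for y in range(4):
        print(y, zinb_pmf(y, mu, theta, pi))
```

Under this parameterization the mixture places extra mass at zero relative to the plain negative binomial, while the low inverse-dispersion produces the high variance typical of such counts.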