Single-cell RNA sequencing (scRNA-seq) measures RNA expression at single-cell resolution, providing a powerful tool for studying immunity, regulation, and other cellular activities. However, owing to the limitations of the sequencing technique, scRNA-seq data are sparse: they contain many missing gene expression values, recorded as zeros and known as dropouts. It is therefore necessary to impute missing values before analyzing scRNA-seq data. Existing imputation methods often either focus only on identifying technical zeros or impute all zeros based on cell similarity. This study proposes a new method, SFAG, which reconstructs the gene expression relationship matrix using graph regularization, both to preserve the high-dimensional manifold information of the data and to mine the relationships between genes and cells, and then fills in the identified technical zeros by averaging the clustering results. Experimental results show that SFAG can help improve downstream analysis and reconstruct cell trajectories.
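The final step described in the abstract, filling identified technical zeros from averaged clustering results, can be illustrated with a minimal sketch. This is not the SFAG implementation: the abstract does not specify how clusters are formed or how technical zeros are identified, so the sketch assumes both are already given. The function name `impute_cluster_mean`, the `technical_zero` mask, and the choice to average only the nonzero values of a gene within the cell's cluster are all hypothetical, chosen for illustration.

```python
from collections import defaultdict


def impute_cluster_mean(expr, labels, technical_zero):
    """Fill flagged zeros with a per-cluster, per-gene mean (illustrative only).

    expr           -- cells x genes matrix as a list of lists of floats
    labels         -- one cluster id per cell (assumed to be precomputed)
    technical_zero -- boolean matrix, True where a zero is judged technical
                      (assumed to be precomputed)
    """
    # Accumulate the sum and count of nonzero values per (cluster, gene).
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row, lab in zip(expr, labels):
        for g, value in enumerate(row):
            if value > 0:
                sums[(lab, g)] += value
                counts[(lab, g)] += 1

    # Copy the matrix so the input is left untouched.
    imputed = [list(row) for row in expr]
    for i, lab in enumerate(labels):
        for g, value in enumerate(expr[i]):
            is_target = technical_zero[i][g] and value == 0
            # Leave the zero in place if the cluster has no observed value.
            if is_target and counts[(lab, g)] > 0:
                imputed[i][g] = sums[(lab, g)] / counts[(lab, g)]
    return imputed


# Toy data: 5 cells x 2 genes, two clusters.
expr = [[4.0, 0.0], [0.0, 0.0], [2.0, 0.0], [0.0, 5.0], [0.0, 3.0]]
labels = [0, 0, 0, 1, 1]
# Only the zero of gene 0 in cell 1 is treated as a technical zero.
technical_zero = [
    [False, False],
    [True, False],
    [False, False],
    [False, False],
    [False, False],
]
result = impute_cluster_mean(expr, labels, technical_zero)
```

In this toy example only the flagged entry changes: it takes the mean of the two nonzero values of gene 0 in cluster 0, while zeros not flagged as technical are kept as they are, which reflects the distinction the abstract draws between technical zeros and all other zeros.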