In this paper, we first reviewed several biclustering methods used to identify the most significant clusters in gene expression data. We focused mainly on the sparse SVD (SSVD) method and tried a new sparsity-inducing penalty, the "Prenet penalty", which had previously been used only in factor analysis. In the simulation study, we generated datasets with different sparsity levels and dimensions and fitted first a 1-layer approximation and then k-layer approximations; the results show that the mixed Prenet penalty is very effective for non-overlapping data. Finally, we applied our methods to real gene expression data to illustrate their behavior.