In this paper, we study two well-known methods of Ising structure learning, namely the pseudolikelihood approach and the interaction screening approach, in the context of tensor recovery in $k$-spin Ising models. We show that both approaches, with proper regularization, recover the underlying hypernetwork structure using a sample size that is logarithmic in the number of network nodes and exponential in the maximum interaction strength and the maximum node degree. We also derive the exact dependence of the rate of tensor recovery on the interaction order $k$, which is allowed to grow with the number of samples and nodes, for both approaches. We then compare the performance of the two approaches in simulation studies, which also demonstrate the exponential dependence of the tensor recovery rate on the maximum coupling strength. Finally, we apply our tensor recovery methods to gene data taken from the Curated Microarray Database (CuMiDa), where we focus on identifying the important genes related to hepatocellular carcinoma.
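To make the two objectives concrete, the following is a minimal sketch of the node-wise losses that the pseudolikelihood and interaction screening approaches minimize. It is an illustration under stated assumptions, not the paper's implementation: spins take values in $\{-1,+1\}$, the conditional law of spin $i$ is assumed proportional to $\exp(x_i m_i(x))$ for a local field $m_i(x)$ that sums couplings times products of the other $k-1$ spins, and the regularizer is assumed to be an $\ell_1$ penalty. The names `local_field`, `pseudolikelihood_loss`, `interaction_screening_loss` and `l1_penalty` are hypothetical.

```python
import numpy as np
from itertools import combinations


def local_field(X, theta, i):
    """Local field m_i(x) for each sample (row of X).

    theta maps a tuple of (k-1) node indices, none equal to i, to the coupling
    of the hyperedge formed by i and those nodes (assumed parametrization).
    """
    m = np.zeros(X.shape[0])
    for others, coupling in theta.items():
        m += coupling * np.prod(X[:, list(others)], axis=1)
    return m


def pseudolikelihood_loss(X, theta, i):
    # Average negative log conditional likelihood of spin i:
    # log(2 cosh(m)) - x_i * m = log(1 + exp(-2 * x_i * m)).
    m = local_field(X, theta, i)
    return np.mean(np.logaddexp(0.0, -2.0 * X[:, i] * m))


def interaction_screening_loss(X, theta, i):
    # Interaction screening objective: empirical mean of exp(-x_i * m).
    m = local_field(X, theta, i)
    return np.mean(np.exp(-X[:, i] * m))


def l1_penalty(theta, lam):
    # Assumed regularizer: lam times the l1 norm of the node's couplings.
    return lam * sum(abs(v) for v in theta.values())


# Usage: k = 3, five nodes, node 0, all couplings set to zero.
rng = np.random.default_rng(0)
X = rng.choice([-1, 1], size=(200, 5))
theta_zero = {pair: 0.0 for pair in combinations(range(1, 5), 2)}
pl_zero = pseudolikelihood_loss(X, theta_zero, 0)
is_zero = interaction_screening_loss(X, theta_zero, 0)
print(pl_zero, is_zero)
```

With all couplings equal to zero the local field vanishes, so the pseudolikelihood loss equals $\log 2$ and the interaction screening loss equals $1$ regardless of the data. In practice each loss plus the penalty would be minimized over `theta` separately for every node, and the recovered hyperedges read off from the nonzero (or thresholded) couplings.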