The high-level structure of a graph is a crucial ingredient for the analysis and visualization of relational data. However, discovering the salient graph patterns that form this structure is notoriously difficult for two reasons. (1) Finding important patterns, such as cliques and bicliques, is computationally hard. (2) Real-world graphs contain noise, and therefore do not always exhibit patterns in their pure form. Defining meaningful noisy patterns and detecting them efficiently remains an open challenge. In this paper, we propose to use well-ordered matrices as a tool to both define and effectively detect noisy patterns. Specifically, we represent a graph as its adjacency matrix and optimally order it using Moran's $I$. Under such an ordering, standard graph patterns (cliques, bicliques, and stars) correspond to rectangular submatrices. Using Moran's $I$, we define a permitted level of noise for such patterns. A combination of exact algorithms and heuristics allows us to efficiently decompose the matrix into noisy patterns. We also introduce a novel motif simplification that visualizes noisy patterns while explicitly encoding the level of noise. We showcase our techniques on several real-world data sets.
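To illustrate why Moran's $I$ can serve as a quality measure for a matrix ordering, the following minimal sketch computes the textbook form of Moran's $I$ on the cells of an adjacency matrix. It rests on several assumptions that are not taken from the paper: cells count as neighbors when they share an edge (rook adjacency) with unit weights, the function name `morans_i` is hypothetical, and the diagonal of the example matrix is set to 1 purely to keep the blocks solid. The paper's exact formulation and its ordering algorithm may differ.

```python
import numpy as np


def morans_i(matrix):
    """Textbook Moran's I over matrix cells with rook (edge-sharing) neighbors.

    Assumption: unit weights between horizontally and vertically adjacent
    cells. This is an illustrative choice, not necessarily the paper's.
    """
    x = np.asarray(matrix, dtype=float)
    z = x - x.mean()                      # deviations from the mean
    denom = (z ** 2).sum()
    if denom == 0:
        raise ValueError("Moran's I is undefined for a constant matrix")

    # Sum of products over each unordered pair of neighboring cells.
    horizontal = (z[:, :-1] * z[:, 1:]).sum()
    vertical = (z[:-1, :] * z[1:, :]).sum()

    rows, cols = x.shape
    n_pairs = rows * (cols - 1) + (rows - 1) * cols

    # Counting ordered pairs would double both the numerator and the total
    # weight, so the factor of 2 cancels and unordered pairs suffice.
    return (x.size / n_pairs) * (horizontal + vertical) / denom


# Two disjoint 3-cliques, ordered so that each clique forms a solid block.
# Assumption: diagonal entries are 1 to keep the blocks rectangular.
ordered = np.kron(np.eye(2), np.ones((3, 3)))

# The same graph with vertices of the two cliques interleaved.
perm = [0, 3, 1, 4, 2, 5]
interleaved = ordered[np.ix_(perm, perm)]

print(morans_i(ordered))      # approximately 0.6
print(morans_i(interleaved))  # approximately -1.0
```

The block-structured ordering scores markedly higher than the interleaved one, even though both matrices describe the same graph. This is the intuition behind using Moran's $I$ to favor orderings in which patterns appear as contiguous rectangular submatrices.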