In signed social networks, balanced and unbalanced triangles are critical motifs because they form the foundation of Structural Balance Theory. Uses for these motifs have been extensively explored in networks with known edge signs; however, real-world graphs with ground-truth signs are nearly non-existent, particularly at large scale. In practice, edge signs are inferred via various techniques with differing levels of confidence, so the edge signs on these graphs should be modelled with a probability value. In this work, we adapt balanced and unbalanced triangles to a setting with uncertain edge signs and explore the problems of triangle counting and enumeration. We provide a baseline method and an improved method for fast exact counting and enumeration; the improved method leverages the information inherent in the edge probabilities to reduce the search space. We also explore approximate solutions for counting via different sampling approaches, including one that uses insights from our improved exact solution to significantly reduce the runtime of each sample, resulting in upwards of two orders of magnitude more queries executed per second. We evaluate the efficiency of all our solutions and examine the effectiveness of our sampling approaches on real-world network topologies with a variety of probability distributions.
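To make the setting concrete, the following is a minimal Python sketch of triangle balance under uncertain edge signs. It is an illustration under stated assumptions, not the method of this work: the abstract does not give the formal definitions, so the sketch assumes that each edge is positive independently with a known probability, that a triangle is balanced when it has an even number of negative edges (the usual Structural Balance convention), and that "counting" means the expected number of balanced triangles. All function names (`balance_probability`, `enumerate_triangles`, `expected_balanced_count`, `sampled_balanced_count`) are hypothetical.

```python
import random
from itertools import combinations


def balance_probability(p1, p2, p3):
    """Probability that a triangle is balanced (even number of negative edges).

    Assumes each edge is positive independently with probability p1, p2, p3.
    The product of three independent +1/-1 signs has mean
    (2*p1 - 1) * (2*p2 - 1) * (2*p3 - 1), and P(product = +1) = (1 + mean) / 2.
    """
    return 0.5 * (1.0 + (2 * p1 - 1) * (2 * p2 - 1) * (2 * p3 - 1))


def enumerate_triangles(edge_prob):
    """Yield each triangle once as (u, v, w) with u < v < w.

    edge_prob maps an edge (u, v) with u < v to its probability of being positive.
    """
    adj = {}
    for u, v in edge_prob:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    for u in sorted(adj):
        higher = sorted(n for n in adj[u] if n > u)
        for v, w in combinations(higher, 2):
            if w in adj[v]:
                yield (u, v, w)


def expected_balanced_count(edge_prob):
    """Exact expected number of balanced triangles (assumed counting target)."""
    return sum(
        balance_probability(edge_prob[(u, v)], edge_prob[(v, w)], edge_prob[(u, w)])
        for u, v, w in enumerate_triangles(edge_prob)
    )


def sampled_balanced_count(edge_prob, n_samples, seed=0):
    """Monte Carlo estimate: sample a sign for every edge, count balanced triangles."""
    rng = random.Random(seed)
    tris = list(enumerate_triangles(edge_prob))
    total = 0
    for _ in range(n_samples):
        # Draw one possible world: +1 with probability p, otherwise -1.
        sign = {e: (1 if rng.random() < p else -1) for e, p in edge_prob.items()}
        total += sum(
            1 for u, v, w in tris
            if sign[(u, v)] * sign[(v, w)] * sign[(u, w)] > 0
        )
    return total / n_samples


if __name__ == "__main__":
    # Toy graph: one triangle (0, 1, 2) plus a pendant edge (2, 3).
    g = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.7, (2, 3): 0.5}
    print(expected_balanced_count(g))
    print(sampled_balanced_count(g, n_samples=5000))
```

This naive sampler redraws every edge sign in each sample, which is the kind of per-sample cost that a faster sampling scheme would aim to avoid; the closed-form probability applies only under the independence assumption stated above.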