Community Search: A Meta-Learning Approach

Community Search (CS) is a fundamental graph analysis task and a building block of many real-world applications. Given a set of query nodes, CS aims to find cohesive subgraphs that contain them. A large number of CS algorithms have been designed in recent years. These algorithms model communities with predefined subgraph patterns, so they cannot find ground-truth communities in real-world graphs that do not follow such patterns. Machine learning (ML) and deep learning (DL) based approaches have therefore been proposed to capture flexible community structures by learning from ground-truth communities in a data-driven fashion. These approaches rely on sufficient training data for the models to generalize; however, ground-truth communities cannot be collected comprehensively beforehand. In this paper, we study ML/DL-based approaches for CS when only a small amount of training data is available. Instead of directly fitting the small data, we extract prior knowledge that is shared across multiple CS tasks by learning a meta model. Each CS task is a graph with several queries, each of which has a corresponding partial ground truth. The meta model can be swiftly adapted to a new task by feeding it a small amount of task-specific training data. We find that naively applying classical meta-learning algorithms to CS suffers from problems in prediction effectiveness, generalization capability, and efficiency. To address these problems, we propose a novel meta-learning based framework, Conditional Graph Neural Process (CGNP), to carry out the prior extraction and adaptation procedure. A meta CGNP model is a task-common node embedding function for clustering, learned by metric-based graph learning, which fully exploits the characteristics of CS. We compare CGNP with CS algorithms and ML baselines on real graphs with ground-truth communities.
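To make the task format and the metric-based idea concrete, here is a minimal sketch in Python. It is not the CGNP model: the class names `Query` and `CSTask`, the `embed` function, and the `threshold` parameter are all hypothetical, and the embedding is a fixed hand-written stand-in (a normalised closed-neighbourhood indicator) rather than a learned graph neural network. It only illustrates how a CS task can be stored as a graph plus queries with partial ground truth, and how community membership can be read off from similarity to the query node in an embedding space.

```python
import math
from dataclasses import dataclass


@dataclass
class Query:
    node: int                 # the query node
    known_members: frozenset  # partial ground truth: some nodes known to share its community


@dataclass
class CSTask:
    adj: dict                 # adjacency lists: node -> set of neighbours
    queries: list             # queries, each with partial ground truth


def embed(adj):
    """Hypothetical stand-in for a learned embedding function: the
    L2-normalised indicator vector of each node's closed neighbourhood,
    stored sparsely as a dict."""
    emb = {}
    for v, nbrs in adj.items():
        closed = set(nbrs) | {v}
        norm = math.sqrt(len(closed))
        emb[v] = {u: 1.0 / norm for u in closed}
    return emb


def similarity(x, y):
    """Inner product of two sparse vectors stored as dicts."""
    return sum(w * y[u] for u, w in x.items() if u in y)


def predict_community(emb, query_node, threshold=0.5):
    """Return the nodes whose embedding is close to the query node's
    embedding. The threshold is an assumed illustrative value."""
    q = emb[query_node]
    return {v for v, h in emb.items() if similarity(q, h) >= threshold}


# Toy task: two triangles joined by the single edge (2, 3).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
task = CSTask(adj=adj, queries=[Query(0, frozenset({1})), Query(4, frozenset({5}))])

emb = embed(task.adj)
predictions = {q.node: predict_community(emb, q.node) for q in task.queries}
```

On this toy graph each query recovers the triangle it sits in. In a meta-learning setting the embedding function would instead be trained across many such tasks and conditioned on each task's few labelled queries; that part is deliberately left out of the sketch.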

