Molecular properties are usually observed with only a limited number of labeled samples, so researchers have treated property prediction as a few-shot problem. One important fact ignored by prior work is that each molecule can be recorded with several different properties simultaneously. To effectively utilize these many-to-many correlations between molecules and properties, we propose a Graph Sampling-based Meta-learning (GS-Meta) framework for few-shot molecular property prediction. First, we construct a Molecule-Property relation Graph (MPG): molecules and properties are nodes, while property labels determine the edges. Second, to utilize the topological information of the MPG, we reformulate an episode in meta-learning as a subgraph of the MPG, containing a target property node, molecule nodes, and auxiliary property nodes. Third, as episodes in the form of subgraphs are no longer independent of each other, we propose to schedule the subgraph sampling process with a contrastive loss function, which considers the consistency and discrimination of subgraphs. Extensive experiments on five commonly used benchmarks show that GS-Meta consistently outperforms state-of-the-art methods by 5.71%-6.93% in ROC-AUC and verify the effectiveness of each proposed module. Our code is available at https://github.com/HICAI-ZJU/GS-Meta.
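The first two steps described above (building the MPG and treating an episode as a subgraph) can be illustrated with a minimal sketch. This is not the GS-Meta implementation from the repository: the function names `build_mpg` and `sample_episode`, the dictionary-based graph layout, the toy labels, and the choice to hide query-to-target edges are all assumptions made for illustration. The contrastive scheduling of subgraph sampling is not covered here.

```python
import random
from collections import defaultdict


def build_mpg(labels):
    """Build a bipartite molecule-property graph (hypothetical layout).

    labels maps (molecule_id, property_id) -> 0 or 1 for observed labels only;
    a missing pair means "no edge". The label is kept as the edge type.
    """
    mol_to_props = defaultdict(dict)
    prop_to_mols = defaultdict(dict)
    for (mol, prop), y in labels.items():
        mol_to_props[mol][prop] = y
        prop_to_mols[prop][mol] = y
    return mol_to_props, prop_to_mols


def sample_episode(mol_to_props, prop_to_mols, target, k_shot, n_query, n_aux, rng):
    """Sample one episode as a subgraph around a target property node.

    Raises ValueError if the target has fewer than k_shot positive or
    negative molecules.
    """
    pos = [m for m, y in prop_to_mols[target].items() if y == 1]
    neg = [m for m, y in prop_to_mols[target].items() if y == 0]
    support = rng.sample(pos, k_shot) + rng.sample(neg, k_shot)

    remaining = [m for m in prop_to_mols[target] if m not in support]
    query = rng.sample(remaining, min(n_query, len(remaining)))
    molecules = support + query

    # Auxiliary properties: other properties recorded for the sampled molecules.
    candidates = sorted(
        {p for m in molecules for p in mol_to_props[m] if p != target}
    )
    aux = rng.sample(candidates, min(n_aux, len(candidates)))

    # Molecule-to-auxiliary edges are visible for all sampled molecules.
    edges = [
        (m, p, mol_to_props[m][p])
        for m in molecules
        for p in aux
        if p in mol_to_props[m]
    ]
    # Assumption: only support molecules expose their edge to the target;
    # query-to-target labels are what the model would have to predict.
    edges += [(m, target, prop_to_mols[target][m]) for m in support]

    return {
        "target": target,
        "support": support,
        "query": query,
        "aux": aux,
        "edges": edges,
    }


# Toy data: four molecules, one target property, two auxiliary properties.
toy_labels = {
    ("m1", "p_target"): 1, ("m2", "p_target"): 0,
    ("m3", "p_target"): 1, ("m4", "p_target"): 0,
    ("m1", "p_a"): 1, ("m2", "p_a"): 0, ("m4", "p_a"): 1,
    ("m3", "p_b"): 1, ("m4", "p_b"): 0,
}
mol_to_props, prop_to_mols = build_mpg(toy_labels)
episode = sample_episode(
    mol_to_props, prop_to_mols, "p_target",
    k_shot=1, n_query=2, n_aux=2, rng=random.Random(0),
)
```

Because every episode is drawn from the same graph, two episodes can share molecules or auxiliary properties, which is why the abstract notes that such episodes are no longer independent of each other.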