Graph-based fraud detection on text-attributed graphs (TAGs) requires jointly modeling rich textual semantics and relational dependencies. However, existing LLM-enhanced GNN approaches are constrained by predefined prompting and decoupled training pipelines, which limit reasoning autonomy and weaken semantic-structural alignment. We propose FraudCoT, a unified framework that advances TAG-based fraud detection through autonomous, graph-aware chain-of-thought (CoT) reasoning and scalable LLM-GNN co-training. To overcome the limitations of predefined prompts, we introduce a fraud-aware selective CoT distillation mechanism that generates diverse reasoning paths and enhances semantic-structural understanding. These distilled CoTs are integrated into node texts, providing GNNs with enriched, multi-hop semantic and structural cues for fraud detection. Furthermore, we develop an efficient asymmetric co-training strategy that enables end-to-end optimization while significantly reducing the computational cost of naive joint training. Extensive experiments on public and industrial benchmarks demonstrate that FraudCoT achieves up to 8.8% AUPRC improvement over state-of-the-art methods and up to 1,066x higher training throughput, substantially advancing both detection performance and efficiency.
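The CoT-enrichment step described above can be illustrated with a minimal sketch: several candidate reasoning paths for a node are scored, the best one is kept (the "selective" part of the distillation), and its text is appended to the node's own text before GNN encoding. All function names, field names, and scores below are hypothetical illustrations, not the paper's actual implementation.

```python
# Hypothetical sketch of integrating a distilled chain-of-thought (CoT)
# into a node's text attribute, as described in the abstract.
# The candidate scores and the "[RATIONALE]" separator are illustrative
# assumptions, not details from the FraudCoT paper.

def select_cot(candidates):
    """Selective distillation: keep the highest-scoring reasoning path."""
    return max(candidates, key=lambda c: c["score"])

def enrich_node_text(node_text, cot_candidates):
    """Append the selected CoT rationale to the raw node text, so a
    downstream text encoder / GNN sees both semantic and reasoning cues."""
    best = select_cot(cot_candidates)
    return f"{node_text} [RATIONALE] {best['rationale']}"

# Toy example: two candidate reasoning paths for one node.
candidates = [
    {"rationale": "Account transacts in bursts with many new neighbors.",
     "score": 0.91},
    {"rationale": "Profile text resembles benign merchant accounts.",
     "score": 0.34},
]
enriched = enrich_node_text("User #4821: high-volume seller account.",
                            candidates)
print(enriched)
```

The enriched string would then be embedded (e.g., by the LLM's text encoder) and used as the node feature for the GNN, giving it the multi-hop reasoning cue alongside the original attributes.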