Knowledge hypergraphs surpass traditional binary knowledge graphs by encapsulating complex $n$-ary atomic facts, providing a more comprehensive paradigm for semantic representation. However, constructing high-quality hypergraphs remains challenging due to the \textit{scenario gap}: generic extractors struggle to generalize across diverse domains with domain-specific jargon, while existing methods often fail to balance structural skeletons with fine-grained details. To bridge this gap, we propose \textbf{Hyper-KGGen}, a skill-driven framework that reformulates extraction as a dynamic skill-evolving process. First, Hyper-KGGen employs a \textit{coarse-to-fine} mechanism to systematically decompose documents, ensuring full-dimensional coverage from binary links to complex hyperedges. Crucially, it incorporates an \textit{adaptive skill acquisition} module that actively distills domain expertise into a Global Skill Library. This is achieved via a stability-based feedback loop, in which extraction stability serves as a relative reward signal for inducing high-quality skills from unstable traces and missed predictions. Additionally, we present \textbf{HyperDocRED}, a rigorously annotated benchmark for document-level knowledge hypergraph extraction. Experiments demonstrate that Hyper-KGGen significantly outperforms strong baselines, validating that evolved skills provide substantially richer guidance than static few-shot examples in multi-scenario settings.
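To make the stability-based feedback loop concrete, the following is a minimal, hypothetical sketch: the \texttt{extract} and \texttt{induce\_skill} callables, the run count, and the threshold $\tau$ are illustrative assumptions rather than Hyper-KGGen's actual interface.
\begin{verbatim}
from collections import Counter
from typing import Callable, List, Set, Tuple

Hyperedge = Tuple[str, ...]  # (relation, arg1, ..., argN): an n-ary fact

def stability_feedback(
    document: str,
    extract: Callable[[str, Set[str]], List[Hyperedge]],
    induce_skill: Callable[[str, Hyperedge], str],
    skills: Set[str],
    n_runs: int = 5,
    tau: float = 0.8,
) -> Set[str]:
    """Sketch of a stability-based feedback loop: facts that appear in
    only some extraction runs are treated as unstable traces, and a
    skill is induced from each to enrich the global skill library."""
    # Run the extractor several times on the same document.
    runs = [set(extract(document, skills)) for _ in range(n_runs)]
    counts = Counter(edge for run in runs for edge in run)
    for edge, c in counts.items():
        stability = c / n_runs  # relative reward: cross-run agreement
        if stability < tau:     # unstable -> distill a corrective skill
            skills.add(induce_skill(document, edge))
    return skills
\end{verbatim}
In this reading, cross-run agreement acts as a \emph{relative} reward because it ranks each fact against the extractor's own behavior and requires no gold labels; handling genuinely missed predictions, as the abstract describes, would additionally require reference annotations.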