Accurate question answering (QA) over classical Islamic texts remains challenging due to domain-specific semantics, long-context dependencies, and concept-sensitive reasoning. We therefore propose CGRA-DeBERTa, a concept-guided residual domain-augmentation transformer framework that enhances theological QA over Hadith corpora. CGRA-DeBERTa builds on a customized DeBERTa transformer backbone with lightweight LoRA-based adaptations and a residual concept-aware gating mechanism. The customized DeBERTa embedding block learns global and positional context, while Concept-Guided Residual Blocks incorporate theological priors from a curated Islamic Concept Dictionary of 12 core terms. Moreover, the Concept Gating Mechanism selectively amplifies semantically critical tokens via importance-weighted attention, applying differential scaling factors from 1.04 to 3.00. This design preserves contextual integrity, strengthens domain-specific semantic representations, and enables accurate, efficient span extraction while maintaining computational efficiency. This paper reports the results of training CGRA-DeBERTa on a purpose-built dataset of 42,591 QA pairs drawn from the texts of Sahih al-Bukhari and Sahih Muslim. While BERT achieved an exact-match (EM) score of 75.87 and DeBERTa one of 89.77, our model scored 97.85, surpassing DeBERTa by 8.08 absolute points while adding only approximately 8% inference overhead due to the parameter-efficient gating. Qualitative evaluation showed improved span extraction, answer discrimination, and theological precision. This study thus presents a Hadith QA system that is efficient, interpretable, and accurate, and that can scale to provide educational materials with the necessary theological nuance.
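To make the gating idea concrete, the following is a minimal, illustrative sketch of token-level concept gating as described above: tokens found in a concept dictionary have their hidden representations amplified by an importance weight clamped to the stated [1.04, 3.00] range, while other tokens pass through unchanged. The dictionary entries, weights, and function names here are hypothetical placeholders, not the paper's actual implementation (which operates on learned attention-derived importance rather than a fixed lookup).

```python
# Hypothetical concept weights; the paper's dictionary holds 12 core terms.
CONCEPT_DICT = {"salah": 2.4, "zakat": 2.1, "tawhid": 3.5}
SCALE_MIN, SCALE_MAX = 1.04, 3.00  # differential scaling range from the abstract


def concept_gate(tokens, hidden, concept_dict=CONCEPT_DICT):
    """Amplify hidden vectors of concept tokens by a clamped importance weight.

    tokens: list of token strings; hidden: parallel list of vectors (lists of floats).
    """
    gated = []
    for tok, vec in zip(tokens, hidden):
        if tok.lower() in concept_dict:
            # Clamp the importance weight into the differential scaling range.
            scale = min(max(concept_dict[tok.lower()], SCALE_MIN), SCALE_MAX)
        else:
            scale = 1.0  # non-concept tokens are left untouched
        gated.append([scale * v for v in vec])
    return gated
```

For example, gating `["the", "ruling", "on", "zakat"]` scales only the "zakat" vector (by 2.1), leaving the surrounding context vectors intact, which mirrors the abstract's claim that contextual integrity is preserved while concept tokens are emphasized.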