Debt collection is a critical function within the banking, financial services, and insurance (BFSI) sector, relying heavily on large-scale human-to-human conversational interactions conducted primarily in Vietnamese contact centers. These conversations involve informal spoken language, emotional variability, and complex domain-specific reasoning, which pose significant challenges for traditional natural language processing systems. This paper introduces Credit C-GPT, a domain-specialized large language model with seven billion parameters, fine-tuned for conversational understanding in Vietnamese debt collection scenarios. The proposed model integrates multiple conversational intelligence tasks, including dialogue understanding, sentiment recognition, intent detection, call stage classification, and structured slot-value extraction, within a single reasoning-based framework. We describe the data construction process, annotation strategy, and training methodology, and evaluate the model on proprietary human-annotated datasets. Experimental results show consistent improvements over traditional pipeline-based approaches, indicating that domain-specialized conversational language models provide a scalable and privacy-aware solution for real-time assistance and post-call analytics in enterprise contact centers.