Credit risk assessment is essential in the financial sector, but it has traditionally depended on costly feature-based models that often fail to use all the information available in raw credit records. This paper introduces LendNova, the first practical automated end-to-end pipeline for credit risk assessment, which addresses this gap by applying advanced NLP techniques and language models directly to those records. LendNova operates on raw, jargon-heavy credit bureau text, using a language model that learns task-relevant representations without manual feature engineering. By automatically capturing the patterns and risk signals embedded in the text, it replaces manual preprocessing steps, reducing costs and improving scalability. Evaluation on real-world data demonstrates its strong potential for accurate and efficient risk assessment. LendNova establishes a baseline for intelligent credit risk agents and demonstrates the feasibility of language models in this domain. It lays the groundwork for future research toward foundation systems that enable more accurate, adaptable, and automated financial decision-making.