Measuring Consistency in Text-based Financial Forecasting Models

Financial forecasting has been an important and active area of machine learning research, as even the most modest advantage in predictive accuracy can be parlayed into significant financial gains. Recent advances in natural language processing (NLP) bring the opportunity to leverage textual data, such as earnings reports of publicly traded companies, to predict the return rate for an asset. However, when dealing with such a sensitive task, the consistency of models -- their invariance under meaning-preserving alterations of the input -- is a crucial property for building user trust. Despite this, current financial forecasting methods do not consider consistency. To address this problem, we propose FinTrust, an evaluation tool that assesses logical consistency in financial text. Using FinTrust, we show that the consistency of state-of-the-art NLP models for financial forecasting is poor. Our analysis of the performance degradation caused by meaning-preserving alterations suggests that current text-based methods are not suitable for robustly predicting market information. All resources are available at https://github.com/yingpengma/fintrust.
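To make the notion of consistency concrete, the following is a minimal sketch of one way to score invariance under meaning-preserving rewrites: the fraction of (original, altered) text pairs on which a model gives the same prediction. This is an illustrative assumption, not the metric implemented in FinTrust; the names `consistency_rate` and `toy_predict` and the example sentences are hypothetical, and the keyword-based predictor only stands in for a real forecasting model.

```python
from typing import Callable, Sequence, Tuple


def consistency_rate(
    predict: Callable[[str], int],
    pairs: Sequence[Tuple[str, str]],
) -> float:
    """Fraction of (original, altered) pairs with identical predictions.

    Hypothetical helper for illustration; not part of FinTrust.
    """
    if not pairs:
        raise ValueError("need at least one (original, altered) pair")
    agree = sum(
        1 for original, altered in pairs if predict(original) == predict(altered)
    )
    return agree / len(pairs)


def toy_predict(text: str) -> int:
    """Toy stand-in for a forecasting model: 1 (up) if 'rose' appears, else 0."""
    return 1 if "rose" in text.lower() else 0


# Each pair holds a sentence and a rewrite intended to preserve its meaning.
pairs = [
    ("Quarterly revenue rose 5%.", "Revenue for the quarter rose 5%."),
    ("Quarterly revenue rose 5%.", "Quarterly revenue increased 5%."),
]

rate = consistency_rate(toy_predict, pairs)
print(f"consistency rate: {rate:.2f}")
```

The toy predictor agrees on the reordered sentence but flips on the synonym swap, so only half of the pairs are consistent. A model can look accurate on original inputs and still fail this kind of check.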

