Analyzing trends across industries is critical to maintaining a healthy and stable economy. Previous research has mainly analyzed official statistics, which are more accurate but not necessarily available in real time. In this paper, we propose a method for analyzing industry trends using stock market data. The task is difficult for two reasons: the raw data is relatively noisy, which reduces the accuracy of statistical analysis, and the textual data used for industry analysis needs language models to be understood well. We therefore approach industry trend analysis from two perspectives, explicit analysis and implicit analysis. For explicit analysis, we introduce a hierarchical analysis method over two levels of data (industries and listed companies) to reduce the impact of noise. For implicit analysis, we further pre-train GPT-2 to analyze industry trends with current-affairs background as input, making full use of the knowledge learned from the pre-training corpus. We conduct experiments based on the proposed method and achieve good industry trend analysis results.
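The abstract does not specify how the hierarchical (industry and listed company) analysis is computed. As a minimal sketch of the general idea, assuming that each company has a daily return series and a known industry label, one simple way to damp single-stock noise is to take the per-day median across the companies in each industry. The function name `industry_trend`, the sample data, and the choice of the median are all illustrative assumptions, not the authors' method.

```python
from collections import defaultdict
from statistics import median


def industry_trend(company_returns, company_industry):
    """Aggregate per-company return series into one series per industry.

    Hypothetical helper: the per-day median across companies is an
    illustrative noise-reduction choice, not the method used in the paper.
    All series are assumed to cover the same days in the same order.
    """
    by_industry = defaultdict(list)
    for company, series in company_returns.items():
        by_industry[company_industry[company]].append(series)
    # zip(*series_list) walks the series day by day across companies.
    return {
        industry: [median(day) for day in zip(*series_list)]
        for industry, series_list in by_industry.items()
    }


# Made-up example data; company "C" has an outlier on the first day.
returns = {
    "A": [0.01, 0.02, -0.01],
    "B": [0.03, 0.00, -0.02],
    "C": [0.50, 0.01, -0.03],
    "D": [-0.01, -0.02, 0.00],
}
industry_of = {"A": "tech", "B": "tech", "C": "tech", "D": "energy"}

trend = industry_trend(returns, industry_of)
print(trend)
```

In this toy data the outlier from company "C" does not dominate the "tech" series, which is the kind of effect that aggregating from the company level to the industry level is meant to achieve.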