Large language models (LLMs), including ChatGPT, can extract profitable trading signals from the sentiment in news text. However, backtesting such strategies poses a challenge because LLMs are trained on many years of data, and backtesting produces biased results if the training and backtesting periods overlap. This bias can take two forms: a look-ahead bias, in which the LLM may have specific knowledge of the stock returns that followed a news article, and a distraction effect, in which general knowledge of the companies named interferes with the measurement of a text's sentiment. We investigate these sources of bias through trading strategies driven by the sentiment of financial news headlines. We compare trading performance based on the original headlines with de-biased strategies in which we remove the relevant company's identifiers from the text. In-sample (within the LLM training window), we find, surprisingly, that the anonymized headlines outperform, indicating that the distraction effect has a greater impact than look-ahead bias. This tendency is particularly strong for larger companies, about which we expect an LLM to have greater general knowledge. Out-of-sample, look-ahead bias is not a concern, but distraction remains possible. Our proposed anonymization procedure is therefore potentially useful in out-of-sample implementation, as well as for de-biased backtesting.
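The abstract does not specify how company identifiers are removed from a headline, so the following is only a minimal sketch of one plausible approach, not the authors' procedure. It assumes that a list of identifiers (company name, ticker, aliases) is already known for each headline; the function name `anonymize_headline`, the identifier list, and the placeholder string are all hypothetical.

```python
import re


def anonymize_headline(headline, identifiers, placeholder="the company"):
    """Replace each listed identifier with a neutral placeholder.

    `identifiers` is a hypothetical, caller-supplied list such as
    ["Apple Inc.", "Apple", "AAPL"]; how it is built is not covered here.
    """
    if not identifiers:
        return headline
    # Try longer identifiers first so "Apple Inc." wins over "Apple".
    ordered = sorted(identifiers, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in ordered)
    # Lookarounds act as word boundaries, so "Pineapple" is left untouched.
    regex = re.compile(rf"(?<!\w)(?:{alternatives})(?!\w)", flags=re.IGNORECASE)
    # A function replacement avoids backslash interpretation in the placeholder.
    return regex.sub(lambda match: placeholder, headline)


if __name__ == "__main__":
    original = "Apple shares jump after AAPL tops forecasts"
    print(anonymize_headline(original, ["Apple", "AAPL"]))
```

Under these assumptions, the original and anonymized versions of each headline would be passed to the same sentiment prompt, so that any difference in the resulting signal can be attributed to the presence of the company's identifiers.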