We use empirical Bayes (EB) to mine for out-of-sample returns among 73,108 long-short strategies constructed from accounting ratios, past returns, and ticker symbols. EB predicts that returns are concentrated in accounting and past-return strategies, small stocks, and pre-2004 samples. The cross-section of out-of-sample returns lines up closely with EB predictions. Data-mined portfolios have mean returns comparable to those of published portfolios, but the data-mined returns are arguably free of data-mining bias. In contrast, controlling for multiple testing following Harvey, Liu, and Zhu (2016) misses the vast majority of returns. This "high-throughput asset pricing" provides an evidence-based solution to data-mining bias.
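To make the empirical Bayes idea concrete, the sketch below shrinks in-sample t-statistics toward their cross-sectional mean. It assumes a simple normal-normal model, in which each observed t-statistic equals a latent true value plus unit-variance noise and the latent values are normally distributed. That model, the function name `eb_shrink`, and the example numbers are illustrative assumptions, not the specification or data used in the paper.

```python
from statistics import fmean, pvariance


def eb_shrink(t_stats, noise_var=1.0):
    """Shrink t-statistics toward their mean (hypothetical helper).

    Assumed model (illustrative only):
        t_i = theta_i + e_i,  e_i ~ N(0, noise_var),  theta_i ~ N(mu, tau^2)
    so the cross-sectional variance of t_i is tau^2 + noise_var.
    """
    # Estimate the prior mean from the cross-section of t-statistics.
    mu_hat = fmean(t_stats)
    # Total cross-sectional variance = signal variance + noise variance.
    total_var = pvariance(t_stats, mu=mu_hat)
    # Method-of-moments estimate of tau^2, floored at zero.
    signal_var = max(total_var - noise_var, 0.0)
    # Posterior-mean weight on the observed t-statistic.
    shrinkage = signal_var / (signal_var + noise_var)
    # Posterior mean of theta_i under the assumed model.
    shrunk = [mu_hat + shrinkage * (t - mu_hat) for t in t_stats]
    return shrunk, shrinkage


if __name__ == "__main__":
    # Made-up t-statistics for four hypothetical strategies.
    example_t_stats = [-2.0, 0.0, 2.0, 4.0]
    shrunk, weight = eb_shrink(example_t_stats)
    print("shrinkage weight:", weight)
    print("shrunk t-stats:", shrunk)
```

When the cross-sectional dispersion of t-statistics is no larger than what noise alone would produce, the weight falls to zero and every strategy is pulled all the way to the common mean. The more the dispersion exceeds the noise level, the more each strategy's own estimate is trusted.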