In high-dimensional time-series analysis, it is essential to have a set of key factors (namely, style factors) that explain changes in the observed variable. For example, volatility modeling in finance relies on a set of risk factors, and climate change studies in climatology rely on a set of causal factors. Ideal low-dimensional style factors should balance significance (high explanatory power) and stability (consistency, without significant fluctuations). However, previous supervised and unsupervised feature extraction methods struggle to address this tradeoff. In this paper, we propose Style Miner, a reinforcement learning method to generate style factors. We first formulate the problem as a Constrained Markov Decision Process with explanatory power as the return and stability as the constraint. Then, we design fine-grained immediate rewards and costs and use a Lagrangian heuristic to balance them adaptively. Experiments on real-world financial data sets show that Style Miner outperforms existing learning-based methods by a large margin and achieves a relative gain of 10% in R-squared explanatory power compared to the industry-renowned factors proposed by human experts.
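
To make the constrained formulation concrete, the following is a minimal sketch of a generic Lagrangian relaxation for a Constrained Markov Decision Process. It is not the Style Miner implementation: the abstract does not specify the heuristic, so the function names (`penalized_reward`, `update_multiplier`), the cost limit, the learning rate, and the toy cost values are all illustrative assumptions. The idea is that the per-step reward (explanatory power) is penalized by a multiplier times the cost (instability), and the multiplier grows while the stability constraint is violated and shrinks once it is satisfied.

```python
def penalized_reward(reward: float, cost: float, lam: float) -> float:
    """Combine a reward and a cost into one scalar using multiplier lam."""
    return reward - lam * cost


def update_multiplier(lam: float, mean_cost: float, cost_limit: float,
                      lr: float = 0.05) -> float:
    """One dual-ascent step on the multiplier.

    lam rises when the average cost exceeds the limit and falls otherwise;
    it is clipped at zero so the penalty never becomes a bonus.
    """
    return max(0.0, lam + lr * (mean_cost - cost_limit))


# Toy run with made-up average costs per training iteration
# (hypothetical numbers, not results from the paper).
lam = 0.0
cost_limit = 0.1
history = []
for mean_cost in [0.5, 0.4, 0.3, 0.05, 0.05]:
    lam = update_multiplier(lam, mean_cost, cost_limit)
    history.append(lam)
```

In this toy run the multiplier increases during the first three iterations, where the cost is above the limit, and decreases in the last two, which is the adaptive balancing behavior the abstract describes in general terms.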