A model among many may be best only under certain states of the world. Switching from one model to another can also be costly. Finding a procedure to choose a model dynamically in these circumstances requires solving a complex estimation problem and a dynamic programming problem. A reinforcement learning algorithm is used to approximate the optimal solution to this dynamic programming problem and to estimate it from the data. The algorithm is shown to consistently estimate the optimal policy, which may choose different models based on a set of covariates. A typical example is switching between different portfolio models under rebalancing costs, using macroeconomic information. In an empirical application to this portfolio problem, using a set of macroeconomic variables and price data, the method performs better than choosing the best portfolio model with hindsight.
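To make the structure of the dynamic programming problem concrete, the following is a minimal sketch of model choice under switching costs on a toy problem. It is not the estimator described here: it uses plain value iteration and assumes the state transition probabilities and per-period payoffs are known, whereas the reinforcement learning approach estimates the solution from data. The function name `solve_switching_policy`, the `reward` and `transition` matrices, the scalar `switch_cost`, and the discount factor are all hypothetical choices made for illustration.

```python
import numpy as np


def solve_switching_policy(reward, transition, switch_cost,
                           discount=0.9, tol=1e-10, max_iter=10_000):
    """Value iteration for choosing a model under switching costs (toy sketch).

    reward[s, a]      : assumed one-period payoff of model a in state s
    transition[s, s2] : assumed probability of moving from state s to s2
    switch_cost       : assumed fixed cost paid whenever the model changes

    Returns (value, policy), both indexed by [state, currently held model].
    """
    reward = np.asarray(reward, dtype=float)
    transition = np.asarray(transition, dtype=float)
    n_states, n_models = reward.shape

    # cost[m, a] is switch_cost when a != m and zero when the model is kept.
    cost = switch_cost * (1.0 - np.eye(n_models))

    def action_values(value):
        # continuation[s, a] = E[ V(s', a) | s ]
        continuation = transition @ value
        # q[s, m, a] = reward[s, a] - cost[m, a] + discount * continuation[s, a]
        return (reward + discount * continuation)[:, None, :] - cost[None, :, :]

    value = np.zeros((n_states, n_models))
    for _ in range(max_iter):
        new_value = action_values(value).max(axis=2)
        converged = np.max(np.abs(new_value - value)) < tol
        value = new_value
        if converged:
            break

    # Greedy policy with respect to the final value function.
    policy = action_values(value).argmax(axis=2)
    return value, policy


# Toy example: two states of the world, two models, each best in one state.
toy_reward = [[1.0, 0.0],
              [0.0, 1.0]]
toy_transition = [[0.9, 0.1],
                  [0.1, 0.9]]

_, free_policy = solve_switching_policy(toy_reward, toy_transition, switch_cost=0.0)
_, costly_policy = solve_switching_policy(toy_reward, toy_transition, switch_cost=100.0)
```

In this toy setup, with no switching cost the policy picks whichever model is best in the current state, regardless of the model currently held. With a prohibitively large switching cost it never switches, which illustrates why the optimal policy has to condition on both the covariates and the current model.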