In this paper, we tackle the dynamic mean-variance portfolio selection problem in a {\it model-free} manner, based on (generative) diffusion models. We propose using a limited-size sample from the true but unknown model $\mathcal P$ to train a generative model $\mathcal Q$, from which we can sample easily and adequately. With adaptive training and sampling methods tailor-made for time series data, we obtain quantitative bounds on the distance between $\mathcal P$ and $\mathcal Q$ in the adapted Wasserstein metric $\mathcal A W_2$. Importantly, the proposed adapted sampling method also facilitates {\it conditional sampling}. In the second part of the paper, we establish the stability of the mean-variance portfolio optimization problem with respect to $\mathcal A W_2$. Combining the error bounds with the stability result, we then propose a policy gradient algorithm based on the generative environment, in which our adapted sampling method serves as an approximate scenario generator. We illustrate the performance of the algorithm on both simulated and real data. On real data, the algorithm based on the generative environment produces portfolios that outperform several important baselines, including the Markowitz portfolio, the equal-weight (naive) portfolio, and the S\&P 500.
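For orientation, the adapted Wasserstein metric referred to above is commonly defined as follows. This is the standard definition from the literature on adapted optimal transport, stated here as an assumption; the horizon $T$, the coordinate processes $X, Y$, and the set $\Pi_{\mathrm{bc}}$ are notation introduced for this sketch, and the paper's own normalization may differ. For laws $\mathcal P, \mathcal Q$ of $\mathbb R^d$-valued paths over $T$ time steps,
\[
\mathcal A W_2(\mathcal P, \mathcal Q)^2 \;=\; \inf_{\pi \in \Pi_{\mathrm{bc}}(\mathcal P, \mathcal Q)} \mathbb E_{\pi}\Big[ \sum_{t=1}^{T} |X_t - Y_t|^2 \Big],
\]
where $\Pi_{\mathrm{bc}}(\mathcal P, \mathcal Q)$ denotes the set of bicausal couplings of $\mathcal P$ and $\mathcal Q$: couplings $\pi$ of $(X, Y)$ under which, for every $t$, $Y_{1:t}$ is conditionally independent of $X_{t+1:T}$ given $X_{1:t}$, and symmetrically with the roles of $X$ and $Y$ exchanged. Restricting to bicausal couplings is what makes the distance sensitive to the information structure of the time series, which is why it is a natural metric for stability of dynamic optimization problems.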