Several novel statistical methods have been developed to estimate large integrated volatility matrices from high-frequency financial data. Their asymptotic analysis requires a sub-Gaussian or finite high-order moment assumption on the observed log-returns, which cannot account for the heavy-tailed behavior of stock returns. Recently, a robust estimator was developed to handle heavy-tailed distributions under a bounded fourth-moment assumption. However, log-returns often have tails heavier than a finite fourth moment allows, and the degree of tail heaviness is heterogeneous across assets and over time. In this paper, to deal with heterogeneous heavy-tailed distributions, we develop an adaptive robust integrated volatility estimator that employs pre-averaging and truncation schemes based on jump-diffusion processes. We call this the adaptive robust pre-averaging realized volatility (ARP) estimator. We show that the ARP estimator has sub-Weibull tail concentration with only finite 2$\alpha$-th moments for any $\alpha>1$. In addition, we establish matching upper and lower bounds to show that the ARP estimation procedure is optimal. To estimate large integrated volatility matrices using the approximate factor model, the ARP estimator is further regularized using the principal orthogonal complement thresholding (POET) method. A numerical study is conducted to check the finite-sample performance of the ARP estimator.
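The pre-averaging and truncation steps named in the abstract can be illustrated with a minimal, univariate sketch. This is not the ARP estimator itself. The triangular weight function, the clipping of squared pre-averaged increments, the fixed user-supplied `threshold`, and all function names are assumptions made for illustration. The abstract does not specify the adaptive threshold rule or any noise bias correction, so both are omitted here.

```python
import numpy as np


def preaveraging_weights(K):
    """Weights g(l/K), l = 1..K-1, for the kernel g(x) = min(x, 1 - x) (assumed choice)."""
    grid = np.arange(1, K) / K
    return np.minimum(grid, 1.0 - grid)


def preaveraged_increments(log_prices, K):
    """Weighted sums of K-1 consecutive log-returns over each sliding window."""
    returns = np.diff(np.asarray(log_prices, dtype=float))
    weights = preaveraging_weights(K)
    # In "valid" mode, np.correlate returns sum_l weights[l] * returns[k + l] for each k.
    return np.correlate(returns, weights, mode="valid")


def truncated_preaveraged_rv(log_prices, K, threshold=np.inf):
    """Sum of squared pre-averaged increments, each clipped at `threshold`.

    Hypothetical simplification: the threshold is fixed by the caller rather than
    adapted to the tail index, and no microstructure-noise bias term is subtracted.
    """
    y_bar = preaveraged_increments(log_prices, K)
    clipped = np.minimum(y_bar ** 2, threshold)
    # Dividing by sum g(l/K)^2 rescales the sum to the integrated-variance scale.
    return clipped.sum() / np.sum(preaveraging_weights(K) ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 20_000
    # Simulated log-price path with unit integrated variance on [0, 1].
    prices = np.cumsum(rng.normal(0.0, np.sqrt(1.0 / n), size=n + 1))
    prices[n // 2:] += 5.0  # one extreme return, standing in for a heavy-tailed observation
    print(truncated_preaveraged_rv(prices, K=20))                  # no truncation
    print(truncated_preaveraged_rv(prices, K=20, threshold=1e-3))  # with truncation
```

In this toy setting the single extreme return dominates the untruncated sum, while the clipped version stays close to the integrated variance of the simulated diffusion. This is the kind of robustness that truncation is meant to provide.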