Several novel statistical methods have been developed to estimate large integrated volatility matrices based on high-frequency financial data. To establish their asymptotic behavior, these methods require a sub-Gaussian or finite high-order moment assumption on the observed log-returns, which cannot account for the heavy-tail phenomenon of stock returns. Recently, a robust estimator was developed to handle heavy-tailed distributions under a bounded fourth-moment assumption. However, we often observe that log-returns have tails heavier than a finite fourth moment allows and that the degree of tail heaviness is heterogeneous across assets and over time. In this paper, to deal with heterogeneous heavy-tailed distributions, we develop an adaptive robust integrated volatility estimator that employs pre-averaging and truncation schemes based on jump-diffusion processes. We call this the adaptive robust pre-averaging realized volatility (ARP) estimator. We show that the ARP estimator has a sub-Weibull tail concentration with only finite $2\alpha$-th moments for any $\alpha>1$. In addition, we establish matching upper and lower bounds to show that the ARP estimation procedure is optimal. To estimate large integrated volatility matrices using the approximate factor model, the ARP estimator is further regularized using the principal orthogonal complement thresholding (POET) method. A numerical study is conducted to check the finite-sample performance of the ARP estimator.
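To make the two ingredients named above concrete, the following is a minimal univariate sketch of pre-averaging followed by truncation. It is not the ARP estimator as defined in the paper: the triangular weight function, the fixed `threshold`, the omission of any noise-bias correction and jump handling, and all function names are assumptions chosen for illustration only.

```python
import math


def triangular_weights(window):
    """Assumed weights g(l / K) with g(x) = min(x, 1 - x), for l = 1..K-1."""
    return [min(l / window, 1 - l / window) for l in range(1, window)]


def preaveraged_increments(log_prices, window):
    """Weighted sums of consecutive log-returns over overlapping blocks."""
    returns = [b - a for a, b in zip(log_prices, log_prices[1:])]
    weights = triangular_weights(window)
    n_blocks = len(returns) - len(weights) + 1
    return [
        sum(w * returns[i + l] for l, w in enumerate(weights))
        for i in range(n_blocks)
    ]


def truncated_preaveraged_rv(log_prices, window, threshold):
    """Sum of squared pre-averaged increments, each capped at `threshold`.

    Capping limits the influence of extreme increments. The normalization
    by K * psi, with psi = (1 / K) * sum of g(l / K)^2, rescales the sum
    to the variance scale of the raw returns. No noise-bias correction is
    applied here (a simplifying assumption of this sketch).
    """
    weights = triangular_weights(window)
    psi = sum(w * w for w in weights) / window
    increments = preaveraged_increments(log_prices, window)
    capped = [min(x * x, threshold) for x in increments]
    return sum(capped) / (window * psi)


# With window = 2 and no truncation, the sketch reduces to the plain
# realized variance (sum of squared returns).
prices = [0.0, 1.0, 3.0]
rv_plain = truncated_preaveraged_rv(prices, window=2, threshold=math.inf)
rv_capped = truncated_preaveraged_rv(prices, window=2, threshold=0.5)
```

In this sketch the threshold is a fixed number supplied by the caller; the adaptivity described in the abstract, where the truncation responds to heterogeneous tail heaviness across assets and over time, is deliberately left out.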