Selecting the number of groups is a key problem in group panel data modeling. In this work, we develop a cross-validation (CV) method to tackle this problem. Specifically, we split the panel data along the time span into two data folds separated by a buffer zone, so that the group structure of the individuals is preserved in both folds. We first estimate the group memberships and parameters on one data fold, then plug in these estimates and use the other data fold to evaluate a designed criterion. The group number is then estimated by minimizing the criterion averaged across the data folds. The proposed CV method has two advantages over existing approaches. First, it is fully data-driven, so no additional model-specific tuning parameters are involved. Second, it can be flexibly applied to a wide range of panel data models. Theoretically, we establish the estimation consistency by taking advantage of the optimization process on the training data fold. Experiments on a variety of synthetic datasets and panel models further illustrate the advantages of the proposed method. Lastly, the CV method is used to analyze the heterogeneous patterns of stock volatilities in the Chinese stock market during the 2008 financial crisis.
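The time-split procedure described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the panel follows a simple group-mean model, group memberships are estimated with a one-dimensional k-means on each individual's time average, and the criterion is the out-of-fold mean squared error. The function names (`kmeans_1d`, `cv_criterion`, `select_group_number`) are hypothetical.

```python
import numpy as np


def kmeans_1d(x, n_groups, n_iter=50):
    """Lloyd's algorithm on scalars, initialised at evenly spaced quantiles."""
    centers = np.quantile(x, (np.arange(n_groups) + 0.5) / n_groups)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        for g in range(n_groups):
            if np.any(labels == g):  # keep the old center if a group empties
                centers[g] = x[labels == g].mean()
    labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
    return labels, centers


def cv_criterion(y, n_groups, buffer):
    """Average out-of-fold squared error for a candidate number of groups.

    y has shape (N, T): N individuals observed over T periods.
    Assumed criterion: mean squared prediction error (a stand-in for the
    paper's designed criterion).
    """
    n_periods = y.shape[1]
    half = (n_periods - buffer) // 2
    # Two folds on the time axis; the middle `buffer` periods are dropped.
    first, last = y[:, :half], y[:, n_periods - half:]
    losses = []
    for train, test in ((first, last), (last, first)):
        # Estimate memberships and group means on the training fold only.
        labels, centers = kmeans_1d(train.mean(axis=1), n_groups)
        # Plug the estimates into the held-out fold.
        fitted = centers[labels][:, None]
        losses.append(np.mean((test - fitted) ** 2))
    return float(np.mean(losses))


def select_group_number(y, candidates, buffer):
    """Return the candidate minimising the averaged criterion, plus all scores."""
    scores = {g: cv_criterion(y, g, buffer) for g in candidates}
    return min(scores, key=scores.get), scores


# Synthetic panel (assumed setup): 3 groups of 100 individuals, 60 periods.
rng = np.random.default_rng(0)
true_means = np.repeat([-2.0, 0.0, 2.0], 100)
panel = true_means[:, None] + rng.normal(size=(300, 60))
best_g, scores = select_group_number(panel, range(1, 6), buffer=4)
print(best_g, scores)
```

In this toy setting the noise is independent over time, so the buffer is not strictly needed; it matters when observations are serially dependent and the two folds would otherwise be correlated near the split point. Too few groups are penalized by bias on the held-out fold, while too many groups split a true group according to training-fold noise that does not carry over to the other fold.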