Statistical modelling strategy is key to success in data analysis, and the trade-off between flexibility and parsimony plays a vital role in it. In clustered data analysis, some flexibility is needed in the model to account for the heterogeneity between clusters, yet parsimony is also needed to guard against excessive complexity and to account for the homogeneity among clusters. In this paper, we propose a flexible and parsimonious modelling strategy for clustered data analysis. The strategy strikes a balance between flexibility and parsimony, and accounts for both the heterogeneity and the homogeneity among clusters, which often carry strong practical meaning. Its usefulness extends beyond clustered data analysis: it also sheds promising light on transfer learning. An estimation procedure is developed for the unknowns in the resulting model, and asymptotic properties of the estimators are established. Intensive simulation studies are conducted to demonstrate how well the proposed methods work, and a real data analysis is presented to illustrate how the modelling strategy and the associated estimation procedure can be applied to answer real-life problems.