Studies in circadian biology often use trigonometric regression to model phenomena over time. Ideally, protocols in these studies would collect samples at evenly distributed and equally spaced time points over a 24-hour period. This sample collection protocol is known as an equispaced design, which is considered the optimal experimental design for trigonometric regression under multiple statistical criteria. However, implementing equispaced designs in studies involving individuals is logistically challenging, and failure to employ an equispaced design could cause a loss of statistical power when performing hypothesis tests with an estimated model. This paper is motivated by that potential loss of statistical power, and considers a weighted trigonometric regression as a remedy. Specifically, the weights for this regression are the normalized reciprocals of estimates derived from a kernel density estimator for sample collection time, which inflates the weight of samples collected at underrepresented time points. A search procedure is also introduced to identify the concentration hyperparameter for kernel density estimation that maximizes the determinant of the Hessian of the weighted squared loss, which relates to both maximizing the $D$-optimality criterion from the experimental design literature and minimizing the generalized variance. Simulation studies consistently demonstrate that this weighted regression mitigates variability in inferences produced by an estimated model. Illustrations with three real circadian biology data sets further indicate that this weighted regression consistently yields larger test statistics than its unweighted counterpart for first-order trigonometric regression, or cosinor regression, which is prevalent in circadian biology studies.
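The weighting scheme described in the abstract can be illustrated with a minimal sketch for the first-order (cosinor) case. Everything below is an assumption made for illustration rather than the authors' implementation: the von Mises kernel, the grid search over the concentration parameter `kappa`, and the function names `vm_kde`, `kde_weights`, `select_kappa`, and `weighted_cosinor` are all hypothetical. The determinant of $X^\top W X$ is used in place of the Hessian determinant, since the Hessian of the weighted squared loss equals $2 X^\top W X$ and the constant factor does not change the maximizer.

```python
import numpy as np


def vm_kde(theta, kappa):
    """Von Mises kernel density estimate, evaluated at each sample's own angle.

    theta: collection times mapped to angles in radians.
    kappa: concentration hyperparameter (kappa = 0 gives a uniform density).
    """
    diffs = theta[:, None] - theta[None, :]
    kern = np.exp(kappa * np.cos(diffs)) / (2.0 * np.pi * np.i0(kappa))
    return kern.mean(axis=1)


def kde_weights(theta, kappa):
    """Normalized reciprocals of the density estimates (weights sum to 1)."""
    inv = 1.0 / vm_kde(theta, kappa)
    return inv / inv.sum()


def design(theta):
    """Cosinor design matrix with columns: intercept, cos, sin."""
    return np.column_stack([np.ones_like(theta), np.cos(theta), np.sin(theta)])


def log_det_hessian(theta, kappa):
    """log det of X^T W X, proportional to the Hessian of the weighted squared loss."""
    X = design(theta)
    w = kde_weights(theta, kappa)
    sign, logdet = np.linalg.slogdet(X.T @ (w[:, None] * X))
    return logdet if sign > 0 else -np.inf


def select_kappa(theta, grid):
    """Grid search (an assumed search strategy) for the kappa maximizing the determinant."""
    scores = [log_det_hessian(theta, k) for k in grid]
    return grid[int(np.argmax(scores))]


def weighted_cosinor(t_hours, y, kappa, period=24.0):
    """Weighted least squares fit of y = b0 + b1*cos(theta) + b2*sin(theta)."""
    theta = 2.0 * np.pi * np.asarray(t_hours, dtype=float) / period
    X = design(theta)
    w = kde_weights(theta, kappa)
    XtW = X.T * w  # scales each observation's column of X^T by its weight
    beta = np.linalg.solve(XtW @ X, XtW @ np.asarray(y, dtype=float))
    return beta, w


if __name__ == "__main__":
    # Hypothetical, unevenly spaced collection times clustered in the morning.
    rng = np.random.default_rng(0)
    t = np.array([8, 8.5, 9, 9.5, 10, 10.5, 11, 12, 13, 14, 16, 20, 2.0])
    theta = 2.0 * np.pi * t / 24.0
    y = 2.0 + 1.5 * np.cos(theta) + 0.5 * np.sin(theta) + rng.normal(0, 0.3, t.size)

    kappa = select_kappa(theta, np.linspace(0.0, 20.0, 41))
    beta, w = weighted_cosinor(t, y, kappa)
    print("selected kappa:", kappa)
    print("coefficients (b0, b1, b2):", beta)
```

Including `kappa = 0` in the grid means the unweighted fit is always among the candidates, so the selected weighting never has a smaller determinant than ordinary least squares under this sketch.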