Gene expression levels, hormone secretion, and internal body temperature each oscillate over an approximately 24-hour cycle; that is, they display circadian rhythms. Many circadian biology studies have investigated how these rhythms vary across cohorts, uncovering associations between atypical rhythms and diseases such as cancer, metabolic syndrome, and sleep disorders. A challenge in analyzing circadian biology data is that the peak and trough times of an oscillating phenomenon differ across individuals. If these individual-level differences are not accounted for in trigonometric regression, which is prevalent in circadian biology studies, then estimates of the population-level amplitude parameters can suffer from attenuation bias, which could in turn lead to inaccurate study conclusions. To address attenuation bias, we propose a refined two-stage (RTS) method for trigonometric regression given longitudinal data obtained from each individual participating in a study. In the first stage, the parameters of individual-level models are estimated. In the second stage, transformations of these individual-level estimates are aggregated to produce population-level parameter estimates for inference. Simulation studies show that, compared to the standard two-stage (STS) method, which ignores individual-level differences in peak and trough times, our RTS method mitigates bias in parameter estimation, achieves greater statistical power, and maintains appropriate type I error control. The only exception, for parameter estimation and statistical power, occurs when the oscillation amplitudes are weak relative to random variability in the data and the sample size is small. Illustrations with cortisol level data and heart rate data show that our RTS method obtains larger population-level amplitude parameter estimates and smaller $p$-values for multiple hypothesis tests than the STS method.
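The attenuation effect described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the estimator developed in this work: it assumes a single-harmonic cosinor model with a 24-hour period, normally distributed individual phase shifts, and Gaussian noise, and it uses the per-individual amplitude as a stand-in for the "transformation" applied in the second stage. The function names `fit_cosinor` and `simulate_and_compare`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_cosinor(t, y, period=24.0):
    """Least-squares fit of y = m + b1*cos(w*t) + b2*sin(w*t).

    Returns the coefficient vector (m, b1, b2).
    """
    w = 2.0 * np.pi / period
    X = np.column_stack([np.ones_like(t), np.cos(w * t), np.sin(w * t)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def simulate_and_compare(n_subjects=200, amplitude=2.0, phase_sd=1.0,
                         noise_sd=0.5, seed=0):
    """Simulate individuals with shifted peak times and compare two
    second-stage aggregations of the individual-level estimates.

    All settings are illustrative assumptions, not values from the study.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(0.0, 48.0, 2.0)  # one sample every 2 hours over 2 days
    w = 2.0 * np.pi / 24.0
    phases = rng.normal(0.0, phase_sd, n_subjects)  # phase shifts in radians

    # Stage 1: fit an individual-level model for each subject.
    coefs = []
    for phi in phases:
        y = 10.0 + amplitude * np.cos(w * t - phi) \
            + rng.normal(0.0, noise_sd, t.size)
        coefs.append(fit_cosinor(t, y))
    coefs = np.array(coefs)
    b1, b2 = coefs[:, 1], coefs[:, 2]

    # Stage 2a: average the raw coefficients, then compute the amplitude.
    # This ignores phase differences, so oscillations partially cancel.
    naive_amplitude = np.hypot(b1.mean(), b2.mean())

    # Stage 2b: transform each individual's estimates to an amplitude first,
    # then average (an assumed transformation used here for illustration).
    transformed_amplitude = np.hypot(b1, b2).mean()

    return naive_amplitude, transformed_amplitude


if __name__ == "__main__":
    naive, transformed = simulate_and_compare()
    print(f"average-then-transform amplitude: {naive:.3f}")
    print(f"transform-then-average amplitude: {transformed:.3f}")
```

Under these assumptions, averaging the raw coefficients shrinks the amplitude by a factor of roughly $\exp(-\sigma_\phi^2/2)$, where $\sigma_\phi$ is the standard deviation of the phase shifts, whereas averaging individual amplitudes stays close to the true value. The per-individual amplitude is itself biased upward when noise is large relative to the oscillation, which is consistent with the weak-amplitude, small-sample exception noted above.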