Studies of alcohol and drug use are often interested in the number of days on which people use a substance over a fixed interval, such as the 28 days before a survey date. Although count models are often used for this purpose, they are not strictly appropriate for this type of data because the response variable is bounded above by the length of the interval. Furthermore, if some people's substance use follows weekly patterns, days-of-use totals summarized over longer periods can exhibit multiple modes. Conventional parametric model families do not easily fit data with these characteristics. We propose a continuation ratio ordinal model for substance days-of-use data. Instead of grouping the possible response values into a small set of ordinal categories, we assign each possible value its own category. This allows the exact numeric distribution implied by the predicted ordinal response to be recovered. We demonstrate the proposed model using survey data on days of alcohol use over 28-day intervals. We show that the continuation ratio model captures the complexity of the drinking-days data better than binomial, hurdle negative binomial, and beta-binomial models.
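To illustrate how an exact numeric distribution can be recovered when every count from 0 to 28 is its own ordinal category, the following is a minimal sketch rather than the authors' implementation. It assumes a forward continuation ratio parameterization, in which the model yields the conditional probabilities P(Y = k | Y >= k), and a logit link; the abstract specifies neither. The function names and the cutpoint values are hypothetical and chosen only to produce a multimodal shape.

```python
from math import exp


def inv_logit(x):
    """Map a value on the logit scale to a probability."""
    return 1.0 / (1.0 + exp(-x))


def pmf_from_continuation_ratios(hazards):
    """Turn conditional probabilities into a full probability mass function.

    hazards[k] is assumed to be P(Y = k | Y >= k) for k = 0..K-1.
    The returned list has K + 1 entries: P(Y = 0), ..., P(Y = K).
    """
    pmf = []
    at_risk = 1.0  # P(Y >= k), updated as k increases
    for h in hazards:
        pmf.append(at_risk * h)   # P(Y = k) = P(Y >= k) * P(Y = k | Y >= k)
        at_risk *= 1.0 - h        # P(Y >= k + 1)
    pmf.append(at_risk)           # the top category takes the remaining mass
    return pmf


MAX_DAYS = 28  # length of the reporting interval, as in the abstract

# Hypothetical logit-scale values (NOT estimates from any data): stopping is
# made more likely at multiples of 4, mimicking "n days per week" patterns.
logits = [-1.0 if k % 4 == 0 else -2.5 for k in range(MAX_DAYS)]
hazards = [inv_logit(a) for a in logits]

pmf = pmf_from_continuation_ratios(hazards)
expected_days = sum(k * p for k, p in enumerate(pmf))
```

Because each day count has its own category, `pmf` is a distribution over the integers 0 to 28 that respects the upper bound by construction, and summaries such as `expected_days` follow directly from it. In a regression setting the logit-scale values would also depend on covariates; that step is omitted here.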