Bounded discrete proportions (counts out of known totals) present modeling challenges when the data exhibit structural zeros, overdispersion, and hierarchical clustering. We develop a Bayesian hierarchical hurdle beta-binomial model with state-varying coefficients that addresses all of these features. The framework makes three methodological contributions: (i) it studies cross-margin dependence via a cross-block covariance component and clarifies when and how this parameter is identified through the hierarchical layer rather than the conditional likelihood; (ii) it proposes a Cholesky-based sandwich variance calibration for pseudo-posterior inference under survey weights, guided by a parameter-specific design effect ratio diagnostic; and (iii) it introduces a log-scale marginal effect decomposition for hurdle models that translates regression coefficients into policy-relevant quantities. Applied to 6,785 childcare providers across 51 states from the 2019 National Survey of Early Care and Education, the model reveals a "poverty reversal": poverty reduces enrollment participation yet increases intensity among participants, with the extensive margin accounting for two-thirds of the total effect. A design-calibrated simulation shows that sandwich-corrected intervals substantially improve coverage, reaching 82% to 88.5% at the 90% nominal level for fixed effects. The R package hurdlebb implements all methods.
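To make the hurdle structure and the log-scale decomposition concrete, the following is a minimal Python sketch, not the hurdlebb implementation. It assumes a mean/concentration parameterization of the beta-binomial (`mu`, `kappa`) and a zero-truncated intensity part; both are illustrative assumptions, and all function names are hypothetical. Under these assumptions the expected proportion factors as E[Y/n] = q · E[Y/n | Y > 0], so a change in log E[Y/n] splits additively into an extensive (participation) term and an intensive term.

```python
import math

def log_beta(a, b):
    # log of the Beta function via log-gamma
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def log_choose(n, y):
    # log of the binomial coefficient C(n, y)
    return math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)

def betabinom_logpmf(y, n, mu, kappa):
    # Beta-binomial log-pmf; assumed parameterization: a = mu*kappa, b = (1-mu)*kappa
    a, b = mu * kappa, (1.0 - mu) * kappa
    return log_choose(n, y) + log_beta(y + a, n - y + b) - log_beta(a, b)

def hurdle_bb_logpmf(y, n, q, mu, kappa):
    # Hurdle model: P(Y = 0) = 1 - q; positive counts follow a
    # zero-truncated beta-binomial (assumption made for this sketch).
    if y == 0:
        return math.log1p(-q)
    log_p0 = betabinom_logpmf(0, n, mu, kappa)
    log_trunc = math.log1p(-math.exp(log_p0))  # log(1 - P_BB(0))
    return math.log(q) + betabinom_logpmf(y, n, mu, kappa) - log_trunc

def log_mean_parts(n, q, mu, kappa):
    # Returns (log q, log E[Y/n | Y > 0]); their sum is log E[Y/n].
    p0 = math.exp(betabinom_logpmf(0, n, mu, kappa))
    return math.log(q), math.log(mu / (1.0 - p0))

def decompose(n, q0, mu0, q1, mu1, kappa):
    # Finite-difference change in log E[Y/n] between two covariate settings,
    # split into (extensive, intensive) components.
    e0, i0 = log_mean_parts(n, q0, mu0, kappa)
    e1, i1 = log_mean_parts(n, q1, mu1, kappa)
    return e1 - e0, i1 - i0
```

For example, `decompose(20, 0.7, 0.4, 0.6, 0.5, 5.0)` returns one component for the drop in participation probability and one for the change in intensity among participants; the two can carry opposite signs, which is the pattern the abstract describes as a reversal.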