Optimum sample allocation in stratified sampling is one of the basic issues of survey methodology. It is the procedure of dividing the overall sample size among the strata so that, for given sampling designs within strata, the variance of the stratified $\pi$ estimator of the population total (or mean) of a given study variable is minimized. In this work, we consider optimum sample allocation under lower and upper bounds imposed jointly on the stratum sample sizes. We consider a variance function of a generic form that covers, in particular, simple random sampling without replacement within strata. The goal of this paper is twofold. First, using the Karush-Kuhn-Tucker conditions, we establish a generic form of the optimal solution, the so-called optimality conditions. Second, based on these optimality conditions, we derive an efficient recursive algorithm, named RNABOX, which solves the allocation problem under study. RNABOX can be viewed as a generalization of the classical recursive Neyman allocation algorithm, a popular tool for optimum allocation when only upper bounds are imposed on the stratum sample sizes. We implement RNABOX in R as part of our package stratallo, which is available from the Comprehensive R Archive Network (CRAN) repository.
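To make the starting point concrete, the following is a minimal Python sketch of the classical recursive Neyman allocation with upper bounds only, that is, the algorithm RNABOX is said to generalize. It is not RNABOX itself and it does not reproduce the stratallo API. The function name `rna_upper` and its arguments are hypothetical. The sketch assumes the variance has the form $\sum_h A_h^2 / x_h - A_0$, which for simple random sampling without replacement within strata holds with $A_h = N_h S_h$ and $A_0 = \sum_h N_h S_h^2$, and it returns real-valued (unrounded) allocations.

```python
def rna_upper(n, A, M):
    """Classical recursive Neyman allocation with upper bounds only (sketch).

    n : total sample size (assumed to satisfy 0 < n <= sum(M))
    A : positive constants A_h, e.g. N_h * S_h under SRSWOR (assumption)
    M : upper bounds M_h on the stratum sample sizes
    Returns a list of real-valued stratum sample sizes x_h.
    """
    if not 0 < n <= sum(M):
        raise ValueError("n must satisfy 0 < n <= sum(M)")

    x = [None] * len(A)
    active = set(range(len(A)))   # strata not yet fixed at their upper bound
    remaining = float(n)          # sample size left for the active strata

    while active:
        total_A = sum(A[h] for h in active)
        # Neyman allocation of the remaining sample among active strata.
        neyman = {h: remaining * A[h] / total_A for h in active}
        # Strata whose Neyman allocation reaches or exceeds the upper bound.
        over = {h for h in active if neyman[h] >= M[h]}
        if not over:
            for h in active:
                x[h] = neyman[h]
            break
        # Fix the violating strata at their bounds and repeat on the rest.
        for h in over:
            x[h] = float(M[h])
            remaining -= M[h]
        active -= over
    return x


if __name__ == "__main__":
    # Illustrative inputs only (not taken from the paper).
    print(rna_upper(410, A=[10000, 10000, 3000], M=[150, 400, 400]))
```

With these illustrative inputs, the unconstrained Neyman allocation of the first stratum (about 178.3) exceeds its bound of 150, so that stratum is fixed at 150 and the remaining 260 units are reallocated proportionally to $A_h$ among the other two strata, giving 200 and 60.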