The goal of survey design is often to minimize the error associated with inference, which combines bias and variance. Random surveys are common because they allow the use of theoretically unbiased estimators. In practice, however, such design-based approaches often cannot account for logistical or budgetary constraints, so they may yield samples that are logistically inefficient or infeasible to implement. Various balancing and optimal sampling techniques have been proposed to improve the statistical efficiency of such designs, but few models have attempted to explicitly incorporate logistical and financial constraints. We introduce a mixed integer linear program (MILP) for optimal sampling design that can capture a variety of constraints and accommodate a wide class of Bayesian regression models. We demonstrate the model on three spatial sampling problems of increasing complexity, including the real logistics of the US Forest Service Forest Inventory and Analysis survey of Tanana, Alaska. Our methodological contribution to survey design is significant because the proposed modeling framework makes it possible to generate high-quality sampling designs and inferences while satisfying user-defined practical constraints. The technical novelty of the method is the explicit integration of Bayesian statistical models into combinatorial optimization. This integration might allow a paradigm shift in spatial sampling under constrained budgets or logistics.