Supersaturated designs investigate more factors than there are runs, and are often constructed under a criterion measuring a design's proximity to an unattainable orthogonal design. The most popular analysis identifies active factors by inspecting the solution path of a penalized estimator, such as the lasso. Recent criteria encouraging positive correlations between factors have been shown to produce designs with more definitive solution paths, so long as the active factors have positive effects. Two open problems affecting the understanding and practicality of supersaturated designs are: (1) do optimal designs under existing criteria maximize support recovery probability across an estimator's solution path, and (2) why do designs with positively correlated columns produce more definitive solution paths when the active factors have positive effects? To answer these questions, we develop criteria maximizing the lasso's sign recovery probability. We prove that an orthogonal design is an ideal structure when the signs of the active factors are unknown, and that a design with constant, small, positive correlations is ideal when the signs are assumed known. We propose a computationally efficient design search algorithm that first filters through optimal designs under new heuristic criteria and then selects the one that maximizes the lasso sign recovery probability.
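To make the notion of lasso sign recovery probability concrete, the following is a minimal Monte Carlo sketch in plain Python. It is not the criterion or search algorithm developed in the paper: it only estimates, for one design matrix and one fixed penalty value, how often the lasso estimate has exactly the same sign pattern as the true effects. The function names (`lasso_cd`, `sign_recovery_rate`), the objective scaling (1/(2n)), the Gaussian noise model, and the use of a single penalty `lam` rather than the whole solution path are all assumptions made for illustration.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used in lasso coordinate descent."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2n))||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of n rows, each a list of p entries (e.g. +1/-1 factor levels).
    Assumes no column of X is identically zero.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - Xb; equals y because b starts at zero
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Inner product of column j with the partial residual (b_j added back).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * b[j]
            new_bj = soft_threshold(rho, lam) / col_ss[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
    return b


def sign(v, tol=1e-10):
    """Return -1, 0, or +1, treating |v| <= tol as zero."""
    return (v > tol) - (v < -tol)


def sign_recovery_rate(X, beta, lam, sigma, n_sim=200, seed=0):
    """Estimate P(sign(lasso estimate) == sign(beta)) at a fixed penalty lam.

    Responses are simulated as y = X beta + N(0, sigma^2) noise (an assumed model).
    """
    rng = random.Random(seed)
    target = [sign(bj) for bj in beta]
    hits = 0
    for _ in range(n_sim):
        y = [
            sum(xij * bj for xij, bj in zip(row, beta)) + rng.gauss(0.0, sigma)
            for row in X
        ]
        est = lasso_cd(X, y, lam)
        hits += [sign(bj) for bj in est] == target
    return hits / n_sim
```

Under these assumptions, two candidate designs could be compared by calling `sign_recovery_rate` on each with the same `beta`, `lam`, `sigma`, and `seed`. A comparison in the spirit of the abstract would repeat this over a grid of penalty values and over the sign patterns of interest.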