Reducing the computational time required to process large data sets in Data Envelopment Analysis (DEA) is the objective of many studies. Contributions include fundamentally innovative procedures, new or improved preprocessors, and hybridizations between and among these. Ultimately, new contributions succeed by reducing the number and size of the linear programs (LPs) solved. This paper provides a comprehensive analysis and comparison of two competing procedures for processing DEA data sets: BuildHull and Enhanced Hierarchical Decomposition (EHD). A common ground for comparison is established by examining their sequential implementations, applying the same preprocessors to both (when permitted) on a suite of data sets widely used in the computational DEA literature. In addition to reporting execution times, we discuss how data characteristics affect performance and introduce the number and size of the LPs solved as metrics to better understand performance and explain differences. Our experiments show that the dominance of BuildHull can be substantial on large-scale, high-density data sets. Comparing and explaining performance in terms of the number and size of LPs lays the groundwork for a comparison of the parallel implementations of BuildHull and EHD.