Edit distance is a classic problem that matters to both theoreticians and practitioners, which makes it an excellent lens for understanding how the theoretical analysis of algorithms affects practical implementations. From an applied perspective, theoretical analysis has two goals: to predict the empirical performance of an algorithm, and to serve as a yardstick for designing novel algorithms that perform well in practice. In this paper, we systematically survey the theoretical analysis techniques that have been applied to edit distance and evaluate the extent to which each one has achieved these two goals. These techniques include traditional worst-case analysis; worst-case analysis parameterized by edit distance, entropy, or compressibility; average-case analysis; semi-random models; and advice-based models. We find that the track record is mixed. On one hand, two algorithms widely used in practice were born out of theoretical analysis, and theoretical predictions capture their empirical performance well. On the other hand, none of the algorithms developed since then using theoretical analysis as a yardstick has had any practical relevance. We conclude by discussing the remaining open problems and how they can be tackled.