Validating a data-driven model means assessing its ability to generalize to new, unseen data from the population of interest. This paper proposes a set of general rules for model validation, designed to help practitioners create reliable validation plans and report their results transparently. While no validation scheme is flawless, following these rules can help practitioners ensure that their strategy is sufficient for practical use, openly discuss its limitations, and report clear, comparable performance metrics.