Experimentation is an intrinsic part of research in artificial intelligence, since it allows us to collect quantitative observations, validate hypotheses, and gather evidence for their reformulation. For that reason, experimentation must be coherent with the purposes of the research and must properly address the relevant questions in each case. Unfortunately, the literature is full of works whose experimentation is neither rigorous nor convincing, and is often designed to support prior beliefs rather than to answer the relevant research questions. In this paper, we focus on the field of metaheuristic optimization, since it is our main field of work and it is where we have observed the misconduct that motivated this letter. Although we limit the focus of this manuscript to the experimental part of the research, our main goal is to sow the seed of a sincere critical assessment of our own work, sparking a process of reflection at both the individual and the community level. Such a process is too complex and extensive to be tackled as a whole. Therefore, to keep our feet on the ground, we include in this document our reflections on the role of experimentation in our work, discussing topics such as the use of benchmark instances versus instance generators and the statistical assessment of empirical results. All the statements in this document are personal views and opinions, which others may or may not share. Indeed, differing points of view are the basis of a good discussion.