Image captioning (IC) is widely used to describe images in natural language. Several testing methods for IC systems have recently been proposed, but they have three limitations. First, they still rely on pre-annotated information and hence do not really alleviate the oracle problem in testing. Second, they artificially manipulate objects, which may produce unrealistic images as test cases and thus less meaningful testing results. Third, they impose various eligibility requirements on source test cases and hence cannot fully utilize the given images. To tackle these issues, we propose REIC, which performs metamorphic testing of IC systems using image-level reduction transformations such as cropping and stretching. Instead of relying on pre-annotated information, REIC uses a localization method to align the objects mentioned in the caption with the corresponding objects in the image, and checks whether each object is correctly described in, or deleted from, the caption after the transformation. Because the reduction transformations operate at the image level, REIC does not artificially manipulate any objects and hence avoids generating unrealistic follow-up images. It also eliminates the eligibility requirement on source test cases in the metamorphic transformation process, reduces ambiguity, and increases diversity among the follow-up test cases; as a result, testing can be performed on any test image and reveals more distinct valid violations. We apply REIC to five popular IC systems. The results show that REIC makes full use of the provided test images, generates realistic follow-up test cases, and detects a large number of distinct violations without any pre-annotated information.
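The check described in the abstract (an object that survives a crop should still be described, and an object that is cropped away should disappear from the caption) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from REIC itself: the `Box` type, the function names, the `keep` and `drop` thresholds, and the decision to skip partially visible objects as ambiguous are all hypothetical, and the bounding boxes stand in for the output of whatever localization method is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    left: float
    top: float
    right: float
    bottom: float

    def area(self) -> float:
        width = max(0.0, self.right - self.left)
        height = max(0.0, self.bottom - self.top)
        return width * height

    def intersect(self, other: "Box") -> "Box":
        # May yield an "inverted" box when there is no overlap; area() is then 0.
        return Box(max(self.left, other.left), max(self.top, other.top),
                   min(self.right, other.right), min(self.bottom, other.bottom))


def visible_fraction(obj: Box, crop: Box) -> float:
    """Fraction of the object's box that lies inside the crop window."""
    if obj.area() == 0:
        return 0.0
    return obj.intersect(crop).area() / obj.area()


def check_crop_relation(source_objects, crop, followup_objects,
                        keep=0.9, drop=0.1):
    """Return (object, reason) pairs that violate the assumed relation.

    source_objects: dict mapping each object named in the source caption to
        its localized Box (assumed to come from a localization step).
    followup_objects: set of object names found in the follow-up caption.
    keep / drop: assumed thresholds; objects whose visible fraction falls
        between them are treated as ambiguous and are not checked.
    """
    violations = []
    for name, box in source_objects.items():
        fraction = visible_fraction(box, crop)
        if fraction >= keep and name not in followup_objects:
            violations.append((name, "missing"))
        elif fraction <= drop and name in followup_objects:
            violations.append((name, "not deleted"))
    return violations


# Toy example: the crop keeps the dog and removes the frisbee entirely.
source = {"dog": Box(10, 10, 50, 50), "frisbee": Box(80, 20, 100, 40)}
crop_window = Box(0, 0, 60, 60)
print(check_crop_relation(source, crop_window, {"frisbee"}))
# prints [('dog', 'missing'), ('frisbee', 'not deleted')]
```

The thresholds are one possible way to avoid flagging objects that a crop only partly removes; a stretching transformation would instead keep every object visible, so under the same assumptions every source object would be expected in the follow-up caption.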