Image Captioning (IC) is widely used to describe images in natural language. Several methods for testing IC systems have been proposed recently, but they have three limitations. First, they still rely on pre-annotated information and hence cannot truly alleviate the oracle problem in testing. Second, they artificially manipulate objects, which may produce unrealistic images as test cases and thus less meaningful test results. Third, they impose various eligibility requirements on source test cases and hence cannot fully utilize the given images for testing. To tackle these issues, we propose REIC, which performs metamorphic testing of IC systems using image-level reduction transformations such as cropping and stretching. Instead of relying on pre-annotated information, REIC uses a localization method to align the objects mentioned in the caption with the corresponding objects in the image, and checks whether each object is correctly described in, or deleted from, the caption after the transformation. Because its transformations operate at the image level, REIC does not artificially manipulate any objects and hence avoids generating unrealistic follow-up images. It also eliminates the eligibility requirements on source test cases in the metamorphic transformation process, reduces ambiguity, and increases diversity among the follow-up test cases; as a result, testing can be performed on any test image and reveals more distinct valid violations. We use REIC to test five popular IC systems. The results demonstrate that REIC can fully leverage the provided test images to generate realistic follow-up cases and can effectively detect a large number of distinct violations without any pre-annotated information.
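The check described in the abstract (an object should still be described if it survives a reduction transformation, and should disappear from the caption if it is cut away) can be illustrated with a minimal sketch for the cropping case. This is not REIC's implementation. The `Box` type, the `check_crop_relation` function, the two visibility thresholds, and the assumption that object names have already been extracted from the captions and localized as bounding boxes are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    left: float
    top: float
    right: float
    bottom: float

    def area(self) -> float:
        # Clamp to zero so that inverted (empty) boxes have no area.
        return max(0.0, self.right - self.left) * max(0.0, self.bottom - self.top)

    def intersect(self, other: "Box") -> "Box":
        # May return an inverted box when there is no overlap; area() is then 0.
        return Box(max(self.left, other.left), max(self.top, other.top),
                   min(self.right, other.right), min(self.bottom, other.bottom))


def visible_fraction(obj: Box, crop: Box) -> float:
    """Fraction of the object's box that remains inside the crop window."""
    if obj.area() == 0:
        return 0.0
    return obj.intersect(crop).area() / obj.area()


def check_crop_relation(localized, crop, followup_objects,
                        keep_thr=0.9, drop_thr=0.1):
    """Return (object, kind) pairs that violate the assumed relation.

    localized:        dict mapping each object named in the source caption
                      to its localized Box in the source image.
    crop:             Box kept by the cropping transformation.
    followup_objects: set of object names found in the follow-up caption.
    keep_thr/drop_thr are assumed thresholds, not values from the paper.
    """
    violations = []
    for name, box in localized.items():
        frac = visible_fraction(box, crop)
        if frac >= keep_thr and name not in followup_objects:
            violations.append((name, "missing"))      # retained but no longer described
        elif frac <= drop_thr and name in followup_objects:
            violations.append((name, "not_deleted"))  # removed but still described
        # Partially visible objects are skipped as ambiguous.
    return violations


localized = {"dog": Box(10, 10, 50, 50), "frisbee": Box(150, 20, 190, 60)}
crop = Box(0, 0, 100, 100)
print(check_crop_relation(localized, crop, {"dog"}))
print(check_crop_relation(localized, crop, {"frisbee"}))
```

In this sketch the first call reports no violations, because the dog is retained and described while the frisbee is cropped away and dropped from the caption. The second call reports both objects. Skipping partially visible objects is one simple way to avoid ambiguous verdicts; how REIC itself handles such cases is not stated in the abstract.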