Generative Artificial Intelligence (AI) produces text, images, and other media content from generative models and user prompts. Between 2022 and 2023, generative AI surged in popularity, with applications ranging from AI-powered movies to chatbots. In this paper, we investigate the potential of generative AI for the World Wide Web, focusing specifically on image generation. Web developers already use generative AI to help craft text and images, while Web browsers might use it in the future to generate images locally for tasks such as repairing broken webpages, conserving bandwidth, and enhancing privacy. To explore this research area, we have developed WebDiffusion, a tool that simulates a Web powered by Stable Diffusion, a popular text-to-image model, from both a client and a server perspective. WebDiffusion further supports crowdsourcing of user opinions, which we use to evaluate the quality and accuracy of 409 AI-generated images sourced from 60 webpages. Our findings suggest that generative AI can already produce pertinent, high-quality Web images without requiring Web designers to write prompts manually, simply by leveraging contextual information available within the webpages. However, we acknowledge that direct in-browser image generation remains a challenge, as only very powerful GPUs, such as the A40 and A100, can (partially) compete with classic image downloads. Nevertheless, this approach could be valuable for a subset of images, for example when fixing broken webpages or handling highly private content.
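To make the idea of deriving prompts from webpage context concrete, the following is a minimal sketch. It is not WebDiffusion's implementation, which the abstract does not describe. It assumes, purely for illustration, that the usable context is each image's `alt` text plus the page title; the names `ImageContextParser` and `build_prompts` are hypothetical.

```python
from html.parser import HTMLParser


class ImageContextParser(HTMLParser):
    """Collect the page title and the alt text of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.alts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "img":
            # A missing or valueless alt attribute becomes an empty string.
            alt = dict(attrs).get("alt") or ""
            self.alts.append(alt.strip())

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def build_prompts(html):
    """Return one hypothetical text-to-image prompt per <img> in the page."""
    parser = ImageContextParser()
    parser.feed(html)
    title = parser.title.strip()
    prompts = []
    for alt in parser.alts:
        # Assumption: alt text comes first, the page title is fallback context.
        parts = [part for part in (alt, title) if part]
        prompts.append(", ".join(parts))
    return prompts


page = (
    "<html><head><title>Trail Guide</title></head><body>"
    '<img src="a.jpg" alt="A mountain lake at sunrise">'
    '<img src="b.jpg">'
    "</body></html>"
)
prompts = build_prompts(page)
```

Each resulting string could then be passed to a text-to-image model such as Stable Diffusion. A real system would likely draw on richer context, for example captions or surrounding text.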