URI redirections are integral to web management, supporting structural changes, search engine optimization (SEO), and security. However, their complexity affects usability, SEO performance, and digital preservation. This study analyzed 11 million unique redirecting URIs, following redirections up to 10 hops per URI, to uncover the patterns and implications of redirection practices. Our findings revealed that 50% of the URIs terminated successfully, while the other 50% resulted in errors, including 0.06% that exceeded 10 hops. Canonical redirects, such as HTTP to HTTPS transitions, were prevalent, reflecting adherence to SEO best practices. Non-canonical redirects, often involving domain or path changes, pointed to significant web migrations, rebranding, and security risks. Notable patterns included "sink" URIs, where multiple redirects converged, ranging from traffic consolidation by global websites to deliberate "Rickrolling." The study also identified 62,000 custom 404 URIs, almost half of which were soft 404s, which could compromise SEO and user experience. These findings underscore the critical role of URI redirects in shaping the web while exposing challenges such as outdated URIs, server instability, and improper error handling. This research offers a detailed analysis of URI redirection practices, providing insights into their prevalence, types, and outcomes. By examining a large dataset, we highlight inefficiencies in redirection chains and characterize patterns such as the use of "sink" URIs and custom error pages. This information can help webmasters, researchers, and digital archivists improve web usability, optimize resource allocation, and safeguard valuable online content.
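To make the hop-limited traversal concrete, the following is a minimal sketch of how a redirection chain might be followed up to 10 hops and labeled by outcome. It is not the study's actual crawler: the `fetch` callback, the outcome labels (`ok`, `error`, `too_many_hops`), the set of redirect status codes, and the in-memory `FAKE_WEB` table are all assumptions made for illustration. The helper `is_scheme_upgrade` checks only one simple case of a canonical redirect (HTTP to HTTPS on the same host and path) and is likewise a hypothetical simplification, not the paper's classification rule.

```python
from typing import Callable, List, Optional, Tuple
from urllib.parse import urljoin, urlsplit

MAX_HOPS = 10  # hop limit stated in the text
REDIRECT_CODES = {301, 302, 303, 307, 308}  # assumed set of redirect statuses

# Hypothetical fetch interface: URI -> (status code, Location header or None).
Fetch = Callable[[str], Tuple[int, Optional[str]]]


def follow_chain(uri: str, fetch: Fetch,
                 max_hops: int = MAX_HOPS) -> Tuple[List[str], str]:
    """Follow redirects from `uri`; return the visited URIs and an outcome label."""
    chain = [uri]
    while True:
        status, location = fetch(chain[-1])
        if status not in REDIRECT_CODES or location is None:
            # Terminal response: 2xx counts as success, anything else as an error.
            return chain, "ok" if 200 <= status < 300 else "error"
        if len(chain) - 1 >= max_hops:
            # Already followed max_hops redirects and the target redirects again.
            return chain, "too_many_hops"
        # Location may be relative, so resolve it against the current URI.
        chain.append(urljoin(chain[-1], location))


def is_scheme_upgrade(src: str, dst: str) -> bool:
    """True if dst is src with only the scheme changed from http to https."""
    s, d = urlsplit(src), urlsplit(dst)
    return (s.scheme == "http" and d.scheme == "https"
            and s.netloc.lower() == d.netloc.lower()
            and (s.path or "/") == (d.path or "/")
            and s.query == d.query)


# Illustrative in-memory "web" so the sketch runs without network access.
FAKE_WEB = {
    "http://example.com/a": (301, "https://example.com/a"),
    "https://example.com/a": (302, "/b"),
    "https://example.com/b": (200, None),
}


def fake_fetch(uri: str) -> Tuple[int, Optional[str]]:
    return FAKE_WEB.get(uri, (404, None))


chain, outcome = follow_chain("http://example.com/a", fake_fetch)
print(chain, outcome)
print(is_scheme_upgrade(chain[0], chain[1]))
```

Passing the fetcher in as a callback keeps the traversal logic independent of any particular HTTP client, so the same function can be exercised against a lookup table, as here, or against live requests.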