Domain squatting poses a significant threat to Internet security, and attackers employ increasingly sophisticated techniques. This study introduces DomainLynx, a compound AI system that uses Large Language Models (LLMs) to improve domain squatting detection. Unlike existing methods, which focus on predefined patterns for top-ranked domains, DomainLynx identifies novel squatting techniques and protects less prominent brands. The system's architecture integrates data processing, domain pairing, and LLM-powered threat assessment. Crucially, DomainLynx incorporates specialized components that mitigate LLM hallucinations, making detection more reliable and context-aware. This approach enables efficient analysis of large volumes of security data from diverse sources, including Certificate Transparency logs, Passive DNS records, and zone files. Evaluated on a curated dataset of 1,649 squatting domains, DomainLynx achieved 94.7% accuracy using Llama-3-70B. In a month-long real-world test, it detected 34,359 squatting domains among 2.09 million new domains, outperforming baseline methods by a factor of 2.5. This research advances Internet security by providing a versatile, accurate, and adaptable tool for combating evolving domain squatting threats. DomainLynx's approach paves the way for more robust, AI-driven cybersecurity solutions that protect a broader range of online entities and contribute to a safer digital ecosystem.