Wireless ethical hacking relies heavily on skilled practitioners who manually interpret reconnaissance results and execute complex, time-sensitive command sequences to identify vulnerable targets, capture authentication handshakes, and assess password resilience. This process is inherently labour-intensive, difficult to scale, and prone to subjective judgement and human error. To help address these limitations, we propose WiFiPenTester, an experimental, governed, and reproducible system for GenAI-enabled wireless ethical hacking. The system integrates large language models into the reconnaissance and decision-support phases of wireless security assessment, enabling intelligent target ranking, attack feasibility estimation, and strategy recommendation, while preserving strict human-in-the-loop control and budget-aware execution. We describe the system architecture, threat model, governance mechanisms, and prompt-engineering methodology, and report empirical experiments conducted across multiple wireless environments. The results demonstrate that GenAI assistance improves target selection accuracy and overall assessment efficiency while maintaining auditability and ethical safeguards. These results indicate that WiFiPenTester is a meaningful step toward practical, safe, and scalable GenAI-assisted wireless penetration testing, and they reinforce the necessity of bounded autonomy, human oversight, and rigorous governance mechanisms when deploying GenAI in ethical hacking.