Smart cities increasingly depend on dense edge, IoT, and vehicular networks to deliver critical urban services, including traffic control, connected mobility, infrastructure monitoring, and energy management. In this ecosystem, the Internet of Vehicles (IoV) is central to intelligent transportation, enabling continuous communication among vehicles, roadside infrastructure, and cloud-edge platforms. This connectivity, however, also enlarges the attack surface and exposes smart city and vehicular systems to evolving cyber threats that can compromise safety, privacy, data integrity, and service continuity. Conventional static defenses are often inadequate because they cannot autonomously adapt to changing attack behaviors or multi-stage intrusion patterns. This paper proposes QIRL, a Quantum-Inspired Reinforcement Learning framework built on a lightweight Deep Q-Network architecture for next-generation autonomous cyber defense. QIRL combines amplitude-phase quantum state encoding, rotation-gate-based exploration, and quantum interference reward augmentation within a cost-sensitive Markov Decision Process (MDP) formulation. It further addresses class imbalance by applying SMOTE balancing to the training data only and by using asymmetric cost-sensitive reward shaping, while sequential MDP modeling captures temporal dependencies in multi-stage attack campaigns. The framework is evaluated on the CICIDS2017 and UNSW-NB15 datasets. On these two datasets, respectively, QIRL achieves accuracies of 97.89% and 91.04%, F1-scores of 95.22% and 91.66%, AUC-ROC values of 0.9945 and 0.9713, and True Skill Statistic (TSS) values of 0.9443 and 0.8244. It also attains inference latencies of 32.5 and 45.7 microseconds per sample, corresponding to speedups of 67.77x and 51.77x over ensemble baselines. These results show that QIRL offers a lightweight, latency-aware, and adaptive defense for smart city and IoV infrastructures.
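To make two of the ingredients above concrete, the following is a minimal sketch of an asymmetric cost-sensitive reward and of the True Skill Statistic, which is defined as TPR + TNR - 1. The abstract does not give QIRL's actual reward values or action space, so the constants, the binary detect/allow action set, and the function names `cost_sensitive_reward` and `true_skill_statistic` are assumptions made purely for illustration.

```python
# Hypothetical reward constants: the abstract does not state the real values.
REWARD_TP = 1.0   # attack correctly flagged
REWARD_TN = 0.5   # benign traffic correctly allowed
COST_FP = -1.0    # false alarm on benign traffic
COST_FN = -5.0    # missed attack, assumed to be the most costly outcome


def cost_sensitive_reward(action: int, label: int) -> float:
    """Asymmetric reward for a binary action: 1 = flag as attack, 0 = allow."""
    if action == 1 and label == 1:
        return REWARD_TP
    if action == 0 and label == 0:
        return REWARD_TN
    if action == 1 and label == 0:
        return COST_FP
    return COST_FN  # action == 0 and label == 1


def true_skill_statistic(y_true, y_pred) -> float:
    """TSS = TPR + TNR - 1 for binary labels (1 = attack, 0 = benign)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tpr = tp / (tp + fn)  # sensitivity (assumes at least one attack sample)
    tnr = tn / (tn + fp)  # specificity (assumes at least one benign sample)
    return tpr + tnr - 1.0


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print("TSS:", true_skill_statistic(y_true, y_pred))
    print("Reward for a missed attack:", cost_sensitive_reward(0, 1))
```

Penalizing a missed attack more heavily than a false alarm is one common way to push a learning agent toward higher recall on the minority attack class. Because TSS combines sensitivity and specificity, it is not inflated by a large benign majority in the way plain accuracy can be.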