The rapid expansion of Internet of Things (IoT) devices has created a pervasive ecosystem in which encrypted wireless communication serves as the primary mechanism for privacy and security protection. While encryption effectively protects message content, contextual information in packet metadata and traffic statistics inadvertently exposes device identities. Various studies have exploited raw packet statistics and their visual representations for device fingerprinting and identification. However, these approaches remain confined to the spatial domain and offer limited feature representation. This paper therefore presents CONTEX-T, a novel framework that extracts device-level information from encrypted traffic metadata using temporal and spectral representations. CONTEX-T first transforms raw packet-length sequences into temporal and spectral representations and then uses vision transformers (ViTs) for device identification. We systematically evaluated multiple time-frequency representation techniques and transformer-based models on encrypted traffic samples from various IoT devices. CONTEX-T achieved device classification accuracy exceeding 99% while operating passively on observable contextual metadata. The experiments show that time-frequency analysis provides a new and rich feature representation, revealing a complex and expanding threat landscape that would require robust countermeasures for IoT security management. These results demonstrate that temporal and spectral signatures persist under strong encryption, highlighting a critical attack surface for IoT network security and management.
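
To make the first stage concrete, the following is a minimal sketch of how a packet-length sequence could be turned into a time-frequency image. It is an illustration under stated assumptions, not the CONTEX-T implementation: the abstract does not name the specific representations used, so a plain short-time Fourier magnitude is swapped in here, and the function name `spectrogram`, the `window` and `hop` parameters, and the sample packet lengths are all hypothetical.

```python
import cmath
import math


def spectrogram(packet_lengths, window=8, hop=4):
    """Short-time Fourier magnitudes of a packet-length sequence.

    Returns a list of frames; each frame holds window // 2 + 1 magnitudes.
    Assumption: a plain STFT stands in for the paper's representations.
    """
    if len(packet_lengths) < window:
        raise ValueError("sequence is shorter than the analysis window")
    # Hann taper to reduce spectral leakage at the frame edges.
    taper = [0.5 - 0.5 * math.cos(2 * math.pi * n / (window - 1))
             for n in range(window)]
    frames = []
    for start in range(0, len(packet_lengths) - window + 1, hop):
        segment = packet_lengths[start:start + window]
        mean = sum(segment) / window
        # Remove the mean so the DC bin does not dominate the frame.
        tapered = [(x - mean) * w for x, w in zip(segment, taper)]
        frame = []
        for k in range(window // 2 + 1):
            # Naive DFT for bin k (fine for short illustrative windows).
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / window)
                      for n, x in enumerate(tapered))
            frame.append(abs(acc))
        frames.append(frame)
    return frames


# Hypothetical trace: a device alternating small and full-size packets.
lengths = [60, 1500] * 8
spec = spectrogram(lengths)
print(len(spec), "frames x", len(spec[0]), "frequency bins")
```

The resulting two-dimensional array can be treated as a single-channel image and passed to an image classifier such as a ViT. The alternating trace concentrates its energy in the highest frequency bin, which is the kind of periodic signature that payload encryption does not hide.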