Floating-point square-root computation is a power- and delay-critical operation in edge-AI, signal-processing, and embedded systems. Conventional implementations typically rely on multipliers or iterative pipelines, which increase hardware complexity, switching activity, and energy consumption. This work presents E2AFS, a lightweight and fully multiplier-free floating-point square-root architecture optimized for energy-efficient computation. By reducing logic depth and minimizing switching activity, the proposed design improves both hardware efficiency and performance. Implemented on an Artix-7 FPGA, E2AFS achieves lower dynamic power (7.63 mW), a shorter critical-path delay (4.639 ns), and a smaller power-delay product (35.39 pJ) than the existing ESAS and CWAHA architectures. Error evaluation using multiple accuracy metrics, together with graphical analysis, shows that E2AFS closely approximates the exact square-root function with consistently low deviation. Application-level validation in Sobel edge detection and K-means color quantization further confirms its suitability for low-power, real-time edge and embedded platforms.
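To make the phrase "multiplier-free floating-point square root" concrete, the following is a minimal sketch of one well-known approximation that uses only a shift and an integer addition on the IEEE-754 single-precision bit pattern. It is not the E2AFS datapath, which the abstract does not describe; the function name `approx_sqrt` and the choice of this particular technique are assumptions made purely for illustration.

```python
import struct

# Half of the float32 bit pattern of 1.0 (0x3F800000 >> 1).
# Adding it restores the exponent bias after the right shift.
HALF_BIAS = 0x1FC00000


def approx_sqrt(x: float) -> float:
    """Approximate sqrt(x) for positive normal float32 values.

    Illustrative only: uses one right shift and one integer add,
    with no multiplier and no iteration. Zero and negative inputs
    return 0.0; subnormal inputs are not handled accurately.
    """
    if x <= 0.0:
        return 0.0
    # Reinterpret the float32 value as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Halving the biased exponent/mantissa field approximates a
    # square root in the log domain; the add corrects the bias.
    bits = (bits >> 1) + HALF_BIAS
    # Reinterpret the integer back as a float32 value.
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

With this sketch, `approx_sqrt(4.0)` returns exactly 2.0, while `approx_sqrt(2.0)` returns 1.5, an overestimate of about 6% that is the worst case for this scheme. Designs of the kind evaluated in the abstract typically add correction logic to reduce such error, which is why accuracy metrics are reported alongside power and delay.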