A Paradigm for Generalized Multi-Level Priority Encoders

Priority encoders are typically considered expensive hardware components in terms of complexity, especially at high bit precisions or large input lengths (e.g., above 512 bits). However, if that complexity can be reduced, priority encoders can feasibly accelerate a variety of key applications, such as high-precision integer arithmetic and content-addressable memory. We propose a new paradigm for constructing priority encoders by generalizing the previously proposed two-level priority encoder structure. We extend this concept to three and four levels using two techniques -- cascading and composition -- and discuss further generalization. We then analyze the complexity and delay of new and existing priority encoder designs as a function of input length, for both FPGA and ASIC implementation technologies. In particular, we compare the multi-level structure to the traditional single-level priority encoder structure, a tree-based design, a recursive design, and the two-level structure. We find that the two-level architecture provides balanced performance: it reduces complexity by around half, at the cost of a corresponding increase in delay. Additional levels yield diminishing returns, highlighting a tradeoff between complexity and delay. Meanwhile, the tree and recursive designs are generally faster but more complex than the two-level and multi-level structures. We explore several characteristics and patterns of the designs across a wide range of input lengths. We then provide recommendations on which architecture to use for a given input length and implementation technology, based on which design factors -- such as complexity or delay -- matter most to the hardware designer. With this overview and analysis of various priority encoder architectures, we provide a priority encoder toolkit to assist hardware designers in selecting an optimal design.
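
To make the two-level idea concrete, the following is a minimal behavioral sketch in Python, not the hardware structure analyzed in the paper. It assumes that index 0 has the highest priority and that the input is split into equal blocks; the function names and the `block_size` parameter are hypothetical, chosen only for illustration. A coarse encoder selects the first block containing a set bit, and a fine encoder then resolves the position within that block alone.

```python
def priority_encode(bits):
    """Single-level reference: index of the first set bit, or None if none is set.

    Assumption: index 0 is the highest-priority input.
    """
    for i, b in enumerate(bits):
        if b:
            return i
    return None


def two_level_priority_encode(bits, block_size):
    """Two-level sketch: a coarse encoder picks the block, a fine encoder picks the bit.

    block_size is a hypothetical parameter, not a value taken from the paper.
    """
    blocks = [bits[i:i + block_size] for i in range(0, len(bits), block_size)]
    # Level 1: OR-reduce each block to a single "any bit set" flag,
    # then priority-encode the (much shorter) list of flags.
    flags = [any(block) for block in blocks]
    coarse = priority_encode(flags)
    if coarse is None:
        return None  # no input bit is set
    # Level 2: priority-encode only the selected block.
    fine = priority_encode(blocks[coarse])
    return coarse * block_size + fine


if __name__ == "__main__":
    bits = [0] * 16
    bits[9] = 1
    bits[13] = 1
    print(two_level_priority_encode(bits, block_size=4))
```

In this sketch each encoder handles a much shorter input than the full vector, which loosely mirrors the complexity reduction described above, while the two stages run in sequence, which loosely mirrors the added delay.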
