Variable-length integers (varints) are ubiquitous in data storage and communication, which makes efficient decoding important. In this paper, we present SFVInt, a simple and fast approach to decoding the prevalent Little Endian Base-128 (LEB128) varints. Our approach uses the Bit Manipulation Instruction Set 2 (BMI2) available in modern Intel and AMD processors to achieve a significant performance improvement while keeping the implementation simple and avoiding overengineering. Its generic design handles both 32-bit and 64-bit unsigned integers with a single code template. We evaluate SFVInt across various datasets and scenarios and show that it decodes up to 2x faster than the varint decoding methods used in established frameworks such as Facebook Folly and Google Protobuf.
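To make the idea concrete, the following is a minimal sketch of LEB128 decoding, not the SFVInt implementation itself. It contrasts a conventional byte-at-a-time decoder with a word-at-a-time variant that gathers the 7-bit payloads using a bit-extract operation. All names here (`pext_soft`, `decode_leb128_scalar`, `decode_leb128_word`) are hypothetical. `pext_soft` is a portable stand-in for the BMI2 PEXT instruction so that the sketch compiles without special flags; on BMI2 hardware one would presumably use the `_pext_u64` intrinsic instead. The word-at-a-time variant assumes at least 8 readable bytes and only handles varints of up to 8 bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Portable emulation of PEXT: gathers the bits of `src` selected by `mask`
// into the low bits of the result. (Assumption: stands in for _pext_u64.)
inline std::uint64_t pext_soft(std::uint64_t src, std::uint64_t mask) {
    std::uint64_t out = 0;
    for (std::uint64_t bit = 1; mask != 0; mask &= mask - 1) {
        std::uint64_t lowest = mask & (~mask + 1);  // lowest set bit of mask
        if (src & lowest) out |= bit;
        bit <<= 1;
    }
    return out;
}

// Conventional byte-at-a-time LEB128 decoder for unsigned 64-bit values.
// Returns the number of bytes consumed, or 0 if the input is truncated
// or longer than 10 bytes.
inline std::size_t decode_leb128_scalar(const std::uint8_t* p, std::size_t n,
                                        std::uint64_t& value) {
    std::uint64_t result = 0;
    for (std::size_t i = 0; i < n && i < 10; ++i) {
        result |= static_cast<std::uint64_t>(p[i] & 0x7F) << (7 * i);
        if ((p[i] & 0x80) == 0) {  // continuation bit clear: last byte
            value = result;
            return i + 1;
        }
    }
    return 0;
}

// Word-at-a-time sketch: loads 8 bytes, locates the terminating byte, and
// extracts all payload bits with a single PEXT-style operation.
// Precondition (assumption): p points to at least 8 readable bytes.
// Returns the varint length in bytes, or 0 if the varint exceeds 8 bytes
// (the caller would then fall back to the scalar decoder).
inline std::size_t decode_leb128_word(const std::uint8_t* p,
                                      std::uint64_t& value) {
    std::uint64_t word = 0;
    for (int i = 0; i < 8; ++i)  // endian-independent little-endian load
        word |= static_cast<std::uint64_t>(p[i]) << (8 * i);

    // Bit 7 of each byte is set here when that byte has NO continuation bit.
    std::uint64_t stops = ~word & 0x8080808080808080ULL;
    if (stops == 0) return 0;

    std::uint64_t lowest = stops & (~stops + 1);   // terminator's bit 7
    std::uint64_t upto = lowest ^ (lowest - 1);    // all bits up to terminator
    value = pext_soft(word, upto & 0x7F7F7F7F7F7F7F7FULL);

    std::size_t len = 0;
    for (std::uint64_t m = upto; m != 0; m >>= 8) ++len;  // bytes covered
    return len;
}
```

For example, the two-byte encoding `0xAC 0x02` decodes to 300 with either function. The word-at-a-time version replaces the per-byte shift-and-or loop with one mask computation and one bit extraction, which is the kind of work BMI2 can perform in a single instruction.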