Location information is pivotal for the automation and intelligence of terminal devices and edge-cloud IoT systems, such as autonomous vehicles and augmented reality. However, achieving reliable positioning across diverse IoT applications remains challenging due to significant training costs and the need for densely collected data. To address these issues, we apply the selective state space model (SSM) to visual localization, introducing a new model named MambaLoc. The proposed model achieves exceptional training efficiency by capitalizing on the SSM's strengths in efficient feature extraction, rapid computation, and memory optimization, and its parameter sparsity further ensures robustness in sparse-data environments. Additionally, we propose the Global Information Selector (GIS), which leverages the selective SSM to implicitly realize the efficient global feature extraction of Non-local Neural Networks. This design combines the computational efficiency of the SSM with the Non-local Neural Networks' ability to capture long-range dependencies using minimal layers. Consequently, the GIS captures global information effectively while significantly accelerating convergence. Extensive experiments on public indoor and outdoor datasets first demonstrate the model's effectiveness, and then its versatility when combined with various existing localization models. Our code and models are publicly available to support further research and development in this area.
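To make the core mechanism concrete, the sketch below illustrates a generic selective SSM scan of the kind MambaLoc builds on: the state transition is made input-dependent ("selective"), so the recurrence can keep or discard information per token while running in linear time over the sequence. This is a minimal illustration under assumed shapes and projection weights (`W_delta`, `W_B`, `W_C`, `A` are placeholders), not the authors' MambaLoc or GIS implementation.

```python
import numpy as np

def selective_scan(x, W_delta, W_B, W_C, A):
    """Minimal 1-D selective SSM scan (Mamba-style), for illustration.

    x: (T, D) input sequence; A: (D, N) continuous-time state matrix.
    The per-step projections make the transition input-dependent,
    which is what "selective" refers to.
    """
    T, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))          # hidden state, one N-dim state per channel
    y = np.empty_like(x)
    for t in range(T):
        # Input-dependent step size (softplus keeps it positive) and B, C.
        delta = np.log1p(np.exp(x[t] @ W_delta))       # (D,)
        B = x[t] @ W_B                                  # (N,)
        C = x[t] @ W_C                                  # (N,)
        # Discretize A per step, then run the linear recurrence.
        A_bar = np.exp(delta[:, None] * A)              # (D, N)
        h = A_bar * h + (delta[:, None] * x[t][:, None]) * B[None, :]
        y[t] = h @ C                                    # read out: (D,)
    return y

rng = np.random.default_rng(0)
T, D, N = 8, 4, 3
x = rng.standard_normal((T, D))
W_delta = 0.1 * rng.standard_normal((D, D))
W_B = 0.1 * rng.standard_normal((D, N))
W_C = 0.1 * rng.standard_normal((D, N))
A = -np.exp(rng.standard_normal((D, N)))   # negative real part for stability
y = selective_scan(x, W_delta, W_B, W_C, A)
print(y.shape)
```

Because later outputs depend on the whole history through `h`, a stack of such layers can propagate information across an entire feature sequence, which is the property the GIS exploits to approximate the global receptive field of a Non-local block at linear cost.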