Deep Audio Analyzer is an open-source speech framework that aims to simplify the research and development of neural speech processing pipelines, allowing users to conceive, compare, and share results in a fast and reproducible way. This paper describes the core architecture, which is designed to support several tasks of common interest in the audio forensics field, and shows how new tasks can be created to customize the framework. With Deep Audio Analyzer, forensic examiners (e.g., from law enforcement agencies) and researchers will be able to visualize audio features, easily evaluate the performance of pretrained models, and create, export, and share new audio analysis workflows by combining deep neural network models in a few clicks. One advantage of the tool is that it speeds up research and practical experimentation in audio forensics analysis, while exporting and sharing pipelines also improves experimental reproducibility. All features are developed as modules that the user accesses through a graphical user interface.

Index Terms: Speech Processing, Deep Learning Audio, Deep Learning Audio Pipeline Creation, Audio Forensics.