The broadcasting industry is increasingly adopting IP techniques, revolutionising both live and pre-recorded content production, from news gathering to live music events. IP broadcasting allows audio and video signals to be transported in an easily configurable way, aligning with modern networking techniques. This shift towards an IP workflow allows for much greater flexibility, not only in routing signals but also in integrating tools built with standard web development techniques. One such tool is live audio tagging, which has a number of uses in content production, ranging from automated closed captioning to identifying unwanted sound events within a scene. In this paper, we describe the process of containerising an audio tagging model into a microservice, a small segregated code module that can be integrated into a multitude of different network setups. The goal is to develop a modular, accessible, and flexible tool capable of seamless deployment into broadcasting workflows of all sizes, from small productions to large corporations. We also discuss challenges surrounding the latency of the selected audio tagging model and its effect on the usefulness of the end product.