We present LibriWASN, a data set whose design closely follows the LibriCSS meeting recognition data set, with the marked difference that the data is recorded with devices that are randomly positioned on a meeting table and whose sampling clocks are not synchronized. Nine different devices, five smartphones with a single recording channel each and four microphone arrays, are used to record a total of 29 channels. In all other respects, the data set follows the LibriCSS design: the same LibriSpeech sentences are played back from eight loudspeakers arranged around a meeting table, and the data is organized in subsets with different percentages of speech overlap. LibriWASN is meant as a test set for clock synchronization algorithms and for meeting separation, diarization, and transcription systems on ad-hoc wireless acoustic sensor networks. Due to its similarity to LibriCSS, meeting transcription systems developed for LibriCSS can readily be tested on LibriWASN. The data set is recorded in two different rooms and is complemented with ground-truth diarization information of who speaks when.