Relational recurrent neural networks for polyphonic sound event detection

Junbo Ma, Ruili Wang, Wanting Ji, Hao Zheng, En Zhu, Jianping Yin

Research output: Journal Publication, Article, peer-review

10 Citations (Scopus)

Abstract

A smart environment is one of the application scenarios of the Internet of Things (IoT). To provide a ubiquitous smart environment for humans, a variety of technologies have been developed. In a smart environment system, sound event detection is one of the fundamental technologies: it automatically senses sound changes in the environment and detects the sound events that cause them. In this paper, we propose the use of a Relational Recurrent Neural Network (RRNN) for polyphonic sound event detection, called RRNN-SED, which utilizes the strength of RRNNs in long-term temporal context extraction and relational reasoning across a polyphonic sound signal. Unlike previous sound event detection methods, which rely heavily on convolutional neural networks or recurrent neural networks, the proposed RRNN-SED method can solve the problems of long-lasting and overlapping sound events in polyphonic sound event detection. Specifically, since the pieces of historical information memorized inside an RRNN can interact with each other across a polyphonic sound signal, the proposed RRNN-SED method is effective and efficient in extracting temporal context information and reasoning about the unique relational characteristics of the target sound events. Experimental results on two public datasets show that the proposed method achieves better sound event detection results in terms of segment-based F-score and segment-based error rate.

Original language: English
Pages (from-to): 29509-29527
Number of pages: 19
Journal: Multimedia Tools and Applications
Volume: 78
Issue number: 20
Publication status: Published - 1 Oct 2019
Externally published: Yes

Keywords

  • deep neural networks
  • Internet of Things
  • recurrent neural networks
  • smart environment
  • sound event detection

ASJC Scopus subject areas

  • Software
  • Media Technology
  • Hardware and Architecture
  • Computer Networks and Communications
