Abstract
Sound event tagging assigns text labels to sound segments based on their salient features and/or annotations. In practice, manual annotation is expensive, so tagged sound segments are scarce, whereas untagged sound segments can be obtained easily and cheaply. Semi-automatic tagging, which assigns labels to a large number of untagged sound segments from a small number of manually annotated ones, is therefore very important. Active learning is an effective technique for this problem: selected sound segments are tagged manually, while the remaining segments are tagged automatically. In this paper, an active learning method based on a learnt dictionary is proposed for environmental sound event tagging; it significantly reduces the annotation cost of semi-automatic tagging. The method builds on a learnt dictionary because dictionary learning is better suited to sound feature extraction. Tagging accuracy and annotation cost are used to measure the performance of the proposed method. Experimental results demonstrate that the proposed method achieves higher tagging accuracy at a much lower annotation cost than existing methods.
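The split between manually and automatically tagged segments can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not state how segments are selected or how labels are propagated, so the sketch assumes that the k-medoids medoids are the segments sent to the annotator and that each remaining segment inherits the label of its medoid. The function names, the farthest-point initialisation, and the 2-D toy vectors (standing in for sparse codes over a learnt dictionary) are all hypothetical.

```python
import math


def k_medoids(points, k, n_iter=20):
    """Plain k-medoids with a deterministic farthest-point initialisation (an assumption)."""
    n = len(points)
    medoids = [0]
    while len(medoids) < k:
        # Next medoid: the point farthest from every medoid chosen so far.
        nxt = max(range(n),
                  key=lambda i: min(math.dist(points[i], points[m]) for m in medoids))
        medoids.append(nxt)

    def assign_all(meds):
        # Map each point to the index of its nearest medoid.
        return [min(meds, key=lambda m: math.dist(p, points[m])) for p in points]

    for _ in range(n_iter):
        assign = assign_all(medoids)
        new_medoids = []
        for m in medoids:
            members = [i for i, a in enumerate(assign) if a == m] or [m]
            # New medoid: the member with the smallest total distance to its cluster.
            best = min(members,
                       key=lambda c: sum(math.dist(points[c], points[j]) for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign_all(medoids)


def tag_with_medoid_queries(features, k, annotate):
    """Ask `annotate` (the human oracle) about medoids only; propagate labels to the rest."""
    medoids, assign = k_medoids(features, k)
    manual = {m: annotate(m) for m in medoids}  # annotation cost: k queries
    return [manual[a] for a in assign], sorted(medoids)


# Toy usage: six segments, two sound classes, two manual annotations.
toy_features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
toy_truth = ["dog", "dog", "dog", "siren", "siren", "siren"]
toy_tags, toy_queried = tag_with_medoid_queries(toy_features, 2, lambda i: toy_truth[i])
```

In this toy setting the annotation cost is the number of oracle calls (`k`), and tagging accuracy is the fraction of entries in `toy_tags` that match `toy_truth`, mirroring the two performance measures named in the abstract.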
Original language | English |
---|---|
Pages (from-to) | 29493-29508 |
Number of pages | 16 |
Journal | Multimedia Tools and Applications |
Volume | 78 |
Issue number | 20 |
Publication status | Published - 1 Oct 2019 |
Externally published | Yes |
Keywords
- Active learning
- Dictionary learning
- Internet of things
- k-medoids clustering
- Sound event tagging
- Sparse coding
ASJC Scopus subject areas
- Software
- Media Technology
- Hardware and Architecture
- Computer Networks and Communications