MoWLD: a robust motion image descriptor for violence detection

Tao Zhang; Wenjing Jia; Baoqing Yang; Jie Yang; Xiangjian He; Zhonglong Zheng

doi:10.1007/s11042-015-3133-0

Research output: Journal Publication › Article › peer-review

82 Citations (Scopus)

Abstract

Automatic violence detection from video is a hot topic for many video surveillance applications. However, there has been little success in designing an algorithm that can detect violence in surveillance videos with high performance. Existing methods typically apply the Bag-of-Words (BoW) model on local spatiotemporal descriptors. However, traditional spatiotemporal features are not discriminative enough, and the BoW model roughly assigns each feature vector to only one visual word and therefore ignores the spatial relationships among the features. To tackle these problems, in this paper we propose a novel Motion Weber Local Descriptor (MoWLD) in the spirit of the well-known WLD and make it a powerful and robust descriptor for motion images. We extend the WLD spatial descriptions by adding a temporal component to the appearance descriptor, which implicitly captures local motion information as well as low-level image appearance information. To eliminate redundant and irrelevant features, non-parametric Kernel Density Estimation (KDE) is employed on the MoWLD descriptor. In order to obtain more discriminative features, we adopt a sparse coding and max pooling scheme to further process the selected MoWLDs. Experimental results on three benchmark datasets have demonstrated the superiority of the proposed approach over the state of the art.
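
The max pooling step named in the abstract can be illustrated with a minimal sketch. It assumes each selected descriptor has already been sparse-coded into a coefficient vector over a shared dictionary, and that pooling keeps the largest absolute coefficient per dictionary atom. The abstract gives no details of the pooling, so the function name `max_pool`, the use of absolute values, and the toy numbers are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence


def max_pool(codes: Sequence[Sequence[float]]) -> List[float]:
    """Pool per-descriptor sparse codes into a single feature vector.

    For every dictionary atom (column), keep the largest absolute
    coefficient found across all descriptors (rows). Taking absolute
    values is an assumption made for this sketch.
    """
    if not codes:
        raise ValueError("need at least one code vector")
    width = len(codes[0])
    if any(len(c) != width for c in codes):
        raise ValueError("all code vectors must have the same length")
    return [max(abs(c[k]) for c in codes) for k in range(width)]


# Toy input (hypothetical values): 3 descriptors coded over 4 atoms.
codes = [
    [0.0, 0.5, 0.0, -0.2],
    [0.9, 0.0, 0.0, 0.0],
    [0.0, -0.7, 0.0, 0.1],
]
pooled = max_pool(codes)
print(pooled)
```

Unlike a BoW histogram, which assigns each descriptor to a single visual word, a pooled vector of this kind records the strongest response of every atom, so a video clip is represented by one fixed-length vector regardless of how many descriptors it produced.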

Original language	English
Pages (from-to)	1419-1438
Number of pages	20
Journal	Multimedia Tools and Applications
Volume	76
Issue number	1
DOIs	https://doi.org/10.1007/s11042-015-3133-0
Publication status	Published - 1 Jan 2017
Externally published	Yes

Keywords

Kernel density estimation (KDE)
Max pooling
Motion Weber local descriptors (MoWLD)
Sparse coding
Surveillance systems
Violence detection

ASJC Scopus subject areas

Software
Media Technology
Hardware and Architecture
Computer Networks and Communications

Cite this

@article{152cea04eba04d6eaeb7026d9257f9fa,
title = "{MoWLD}: a robust motion image descriptor for violence detection",
abstract = "Automatic violence detection from video is a hot topic for many video surveillance applications. However, there has been little success in designing an algorithm that can detect violence in surveillance videos with high performance. Existing methods typically apply the Bag-of-Words (BoW) model on local spatiotemporal descriptors. However, traditional spatiotemporal features are not discriminative enough, and the BoW model roughly assigns each feature vector to only one visual word and therefore ignores the spatial relationships among the features. To tackle these problems, in this paper we propose a novel Motion Weber Local Descriptor (MoWLD) in the spirit of the well-known WLD and make it a powerful and robust descriptor for motion images. We extend the WLD spatial descriptions by adding a temporal component to the appearance descriptor, which implicitly captures local motion information as well as low-level image appearance information. To eliminate redundant and irrelevant features, non-parametric Kernel Density Estimation (KDE) is employed on the MoWLD descriptor. In order to obtain more discriminative features, we adopt a sparse coding and max pooling scheme to further process the selected MoWLDs. Experimental results on three benchmark datasets have demonstrated the superiority of the proposed approach over the state of the art.",
keywords = "Kernel density estimation (KDE), Max pooling, Motion Weber local descriptors (MoWLD), Sparse coding, Surveillance systems, Violence detection",
author = "Tao Zhang and Wenjing Jia and Baoqing Yang and Jie Yang and Xiangjian He and Zhonglong Zheng",
note = "Publisher Copyright: {\textcopyright} 2015, Springer Science+Business Media New York.",
year = "2017",
month = jan,
day = "1",
doi = "10.1007/s11042-015-3133-0",
language = "English",
volume = "76",
pages = "1419--1438",
journal = "Multimedia Tools and Applications",
issn = "1380-7501",
publisher = "Springer",
number = "1",
}

TY  - JOUR
T1  - MoWLD
T2  - a robust motion image descriptor for violence detection
AU  - Zhang, Tao
AU  - Jia, Wenjing
AU  - Yang, Baoqing
AU  - Yang, Jie
AU  - He, Xiangjian
AU  - Zheng, Zhonglong
PY  - 2017/1/1
Y1  - 2017/1/1
N2  - Automatic violence detection from video is a hot topic for many video surveillance applications. However, there has been little success in designing an algorithm that can detect violence in surveillance videos with high performance. Existing methods typically apply the Bag-of-Words (BoW) model on local spatiotemporal descriptors. However, traditional spatiotemporal features are not discriminative enough, and the BoW model roughly assigns each feature vector to only one visual word and therefore ignores the spatial relationships among the features. To tackle these problems, in this paper we propose a novel Motion Weber Local Descriptor (MoWLD) in the spirit of the well-known WLD and make it a powerful and robust descriptor for motion images. We extend the WLD spatial descriptions by adding a temporal component to the appearance descriptor, which implicitly captures local motion information as well as low-level image appearance information. To eliminate redundant and irrelevant features, non-parametric Kernel Density Estimation (KDE) is employed on the MoWLD descriptor. In order to obtain more discriminative features, we adopt a sparse coding and max pooling scheme to further process the selected MoWLDs. Experimental results on three benchmark datasets have demonstrated the superiority of the proposed approach over the state of the art.
AB  - Automatic violence detection from video is a hot topic for many video surveillance applications. However, there has been little success in designing an algorithm that can detect violence in surveillance videos with high performance. Existing methods typically apply the Bag-of-Words (BoW) model on local spatiotemporal descriptors. However, traditional spatiotemporal features are not discriminative enough, and the BoW model roughly assigns each feature vector to only one visual word and therefore ignores the spatial relationships among the features. To tackle these problems, in this paper we propose a novel Motion Weber Local Descriptor (MoWLD) in the spirit of the well-known WLD and make it a powerful and robust descriptor for motion images. We extend the WLD spatial descriptions by adding a temporal component to the appearance descriptor, which implicitly captures local motion information as well as low-level image appearance information. To eliminate redundant and irrelevant features, non-parametric Kernel Density Estimation (KDE) is employed on the MoWLD descriptor. In order to obtain more discriminative features, we adopt a sparse coding and max pooling scheme to further process the selected MoWLDs. Experimental results on three benchmark datasets have demonstrated the superiority of the proposed approach over the state of the art.
KW  - Kernel density estimation (KDE)
KW  - Max pooling
KW  - Motion Weber local descriptors (MoWLD)
KW  - Sparse coding
KW  - Surveillance systems
KW  - Violence detection
UR  - http://www.scopus.com/inward/record.url?scp=84949674334&partnerID=8YFLogxK
U2  - 10.1007/s11042-015-3133-0
DO  - 10.1007/s11042-015-3133-0
M3  - Article
AN  - SCOPUS:84949674334
SN  - 1380-7501
VL  - 76
SP  - 1419
EP  - 1438
JO  - Multimedia Tools and Applications
JF  - Multimedia Tools and Applications
IS  - 1
ER  - 
