Anomaly3D: Video anomaly detection based on 3D-normality clusters

Mujtaba Asad; Jie Yang; Enmei Tu; Liming Chen; Xiangjian He

doi:10.1016/j.jvcir.2021.103047

Research output: Journal Publication › Article › peer-review

30 Citations (Scopus)

Abstract

Abnormal behavior detection in surveillance videos is necessary for public monitoring and safety. In human-based surveillance systems, it requires continuous human attention and observation, which is a difficult task. The autonomous detection of such events is of essential significance. However, due to the scarcity of labeled data and the low occurrence probability of these events, abnormal event detection is a challenging vision problem. In this paper, we introduce a novel two-stage architecture for detecting anomalous behavior in videos. In the first stage, we propose a 3D Convolutional Autoencoder (3D-CAE) architecture to extract spatio-temporal features from normal event training videos. In 3D-CAE, the encoder and decoder architectures are based on 3D convolutions, which can learn both appearance and the motion features effectively in an unsupervised manner. In the second stage, we group the 3D spatio-temporal features into different normality clusters, and then remove the sparse clusters to represent a stronger pattern of normality. From these clusters, one-class SVM classifier is used to distinguish between normal and abnormal events based on the normality scores. Experimental results on four different benchmark datasets show significant performance improvement compared to state-of-the-art approaches while providing results in real-time.
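The second stage described in the abstract (cluster the features, drop sparse clusters, score new samples against what remains) can be illustrated with a minimal sketch. This is not the authors' implementation: the 2D toy points stand in for 3D-CAE latent features, plain k-means with farthest-first initialisation is an assumed clustering method, the `min_size` threshold is hypothetical, and a nearest-centroid distance is swapped in for the one-class SVM to keep the example dependency-free.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then keep adding the
    point that is farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means (assumed clustering method, not from the paper)."""
    centroids = farthest_first(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assign(points, centroids)

def normality_clusters(features, k, min_size):
    """Cluster the features and keep only centroids of dense clusters.
    min_size is a hypothetical sparsity threshold."""
    centroids, labels = kmeans(features, k)
    return [c for j, c in enumerate(centroids) if labels.count(j) >= min_size]

def normality_score(feature, kept_centroids):
    """Higher means more normal. Stand-in for the one-class SVM score."""
    return -min(math.dist(feature, c) for c in kept_centroids)

# Toy "normal" training features: two dense groups plus two stray points.
blob_a = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(4)]
blob_b = [(5 + 0.1 * i, 5 + 0.1 * j) for i in range(5) for j in range(4)]
strays = [(10.0, 0.0), (10.2, 0.1)]

kept = normality_clusters(blob_a + blob_b + strays, k=3, min_size=5)
normal_score = normality_score((0.2, 0.1), kept)
anomalous_score = normality_score((10.1, 0.0), kept)
print(len(kept), normal_score > anomalous_score)
```

Because the stray points form a cluster that is too small to keep, a test sample near them is scored against the two dense clusters only and comes out as far less normal than a sample inside a dense cluster.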

Original language	English
Article number	103047
Journal	Journal of Visual Communication and Image Representation
Volume	75
DOIs	https://doi.org/10.1016/j.jvcir.2021.103047
Publication status	Published - Feb 2021
Externally published	Yes

Keywords

3D-CAE
Anomaly detection
Autonomous video surveillance
Spatiotemporal latent features
Video analysis

ASJC Scopus subject areas

Signal Processing
Media Technology
Computer Vision and Pattern Recognition
Electrical and Electronic Engineering

Cite this

@article{7641cae0eae84848a7b104d816709873,
title = "Anomaly3D: Video anomaly detection based on 3D-normality clusters",
abstract = "Abnormal behavior detection in surveillance videos is necessary for public monitoring and safety. In human-based surveillance systems, it requires continuous human attention and observation, which is a difficult task. The autonomous detection of such events is of essential significance. However, due to the scarcity of labeled data and the low occurrence probability of these events, abnormal event detection is a challenging vision problem. In this paper, we introduce a novel two-stage architecture for detecting anomalous behavior in videos. In the first stage, we propose a 3D Convolutional Autoencoder (3D-CAE) architecture to extract spatio-temporal features from normal event training videos. In 3D-CAE, the encoder and decoder architectures are based on 3D convolutions, which can learn both appearance and the motion features effectively in an unsupervised manner. In the second stage, we group the 3D spatio-temporal features into different normality clusters, and then remove the sparse clusters to represent a stronger pattern of normality. From these clusters, one-class SVM classifier is used to distinguish between normal and abnormal events based on the normality scores. Experimental results on four different benchmark datasets show significant performance improvement compared to state-of-the-art approaches while providing results in real-time.",
keywords = "3D-CAE, Anomaly detection, Autonomous video surveillance, Spatiotemporal latent features, Video analysis",
author = "Mujtaba Asad and Jie Yang and Enmei Tu and Liming Chen and Xiangjian He",
note = "Publisher Copyright: {\textcopyright} 2021",
year = "2021",
month = feb,
doi = "10.1016/j.jvcir.2021.103047",
language = "English",
volume = "75",
journal = "Journal of Visual Communication and Image Representation",
issn = "1047-3203",
publisher = "Academic Press Inc.",
}

TY - JOUR
T1 - Anomaly3D
T2 - Video anomaly detection based on 3D-normality clusters
AU - Asad, Mujtaba
AU - Yang, Jie
AU - Tu, Enmei
AU - Chen, Liming
AU - He, Xiangjian
PY - 2021/2
Y1 - 2021/2
AB - Abnormal behavior detection in surveillance videos is necessary for public monitoring and safety. In human-based surveillance systems, it requires continuous human attention and observation, which is a difficult task. The autonomous detection of such events is of essential significance. However, due to the scarcity of labeled data and the low occurrence probability of these events, abnormal event detection is a challenging vision problem. In this paper, we introduce a novel two-stage architecture for detecting anomalous behavior in videos. In the first stage, we propose a 3D Convolutional Autoencoder (3D-CAE) architecture to extract spatio-temporal features from normal event training videos. In 3D-CAE, the encoder and decoder architectures are based on 3D convolutions, which can learn both appearance and the motion features effectively in an unsupervised manner. In the second stage, we group the 3D spatio-temporal features into different normality clusters, and then remove the sparse clusters to represent a stronger pattern of normality. From these clusters, one-class SVM classifier is used to distinguish between normal and abnormal events based on the normality scores. Experimental results on four different benchmark datasets show significant performance improvement compared to state-of-the-art approaches while providing results in real-time.
KW - 3D-CAE
KW - Anomaly detection
KW - Autonomous video surveillance
KW - Spatiotemporal latent features
KW - Video analysis
UR - http://www.scopus.com/inward/record.url?scp=85100881383&partnerID=8YFLogxK
U2 - 10.1016/j.jvcir.2021.103047
DO - 10.1016/j.jvcir.2021.103047
M3 - Article
AN - SCOPUS:85100881383
SN - 1047-3203
VL - 75
JO - Journal of Visual Communication and Image Representation
JF - Journal of Visual Communication and Image Representation
M1 - 103047
ER -
