Violence detection based on spatio-temporal feature and fisher vector

Huangkai Cai; He Jiang; Xiaolin Huang; Jie Yang; Xiangjian He

doi:10.1007/978-3-030-03398-9_16

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

A novel framework based on local spatio-temporal features and a Bag-of-Words (BoW) model is proposed for violence detection. The framework uses Dense Trajectories (DT) and the MPEG flow video descriptor (MF) as feature descriptors and employs the Fisher Vector (FV) for feature coding. DT and MF are descriptive and robust because each combines several feature descriptors, which together describe trajectory shape, appearance, motion and motion boundaries. FV transforms low-level features into high-level features. It preserves much information because it captures not only the assignment of descriptors to codebook entries but also the first- and second-order statistics used to represent videos. Several techniques, namely PCA, K-means++ and the choice of codebook size, are used to improve the final video classification performance. The proposed method is analysed with respect to accuracy, speed and application scenarios. Experimental results show that the proposed approach outperforms state-of-the-art approaches for violence detection in both crowd and non-crowd scenes.
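
To make the feature-coding step concrete, the following is a minimal sketch of Fisher Vector encoding in its commonly used "improved" form (gradients with respect to the means and standard deviations of a diagonal Gaussian mixture, followed by power and L2 normalisation). It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the exact formulation, and the function name `fisher_vector`, the toy mixture parameters and the toy descriptors are hypothetical. Feature extraction (DT, MF), PCA, codebook fitting and the linear SVM are omitted.

```python
import math


def fisher_vector(descriptors, weights, means, sigmas):
    """Encode T local descriptors as a Fisher Vector of length 2 * K * D.

    weights, means, sigmas describe an already fitted K-component GMM with
    diagonal covariance (assumed here; fitting is not shown).
    """
    T, K, D = len(descriptors), len(weights), len(means[0])
    grad_mu = [[0.0] * D for _ in range(K)]
    grad_sigma = [[0.0] * D for _ in range(K)]

    for x in descriptors:
        # Log of w_k * N(x | mu_k, diag(sigma_k^2)) for each component.
        log_p = []
        for k in range(K):
            lp = math.log(weights[k])
            for d in range(D):
                z = (x[d] - means[k][d]) / sigmas[k][d]
                lp -= 0.5 * z * z + math.log(sigmas[k][d]) + 0.5 * math.log(2 * math.pi)
            log_p.append(lp)

        # Soft assignment of x to each component (stable softmax).
        peak = max(log_p)
        exp_p = [math.exp(lp - peak) for lp in log_p]
        total = sum(exp_p)

        for k in range(K):
            gamma = exp_p[k] / total
            for d in range(D):
                z = (x[d] - means[k][d]) / sigmas[k][d]
                grad_mu[k][d] += gamma * z                # first-order statistic
                grad_sigma[k][d] += gamma * (z * z - 1.0)  # second-order statistic

    fv = []
    for k in range(K):
        fv.extend(g / (T * math.sqrt(weights[k])) for g in grad_mu[k])
    for k in range(K):
        fv.extend(g / (T * math.sqrt(2.0 * weights[k])) for g in grad_sigma[k])

    # Power normalisation (signed square root), then L2 normalisation.
    fv = [math.copysign(math.sqrt(abs(v)), v) for v in fv]
    norm = math.sqrt(sum(v * v for v in fv))
    return [v / norm for v in fv] if norm > 0 else fv


# Toy example: a 2-component mixture over 2-dimensional descriptors.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
sigmas = [[1.0, 1.0], [1.0, 1.0]]
descriptors = [[0.1, -0.2], [2.9, 3.3], [0.4, 0.1]]

fv = fisher_vector(descriptors, weights, means, sigmas)
print(len(fv))  # 8, i.e. 2 * K * D
```

The output length grows linearly with the codebook size K, which is one reason the choice of codebook size affects both accuracy and speed. In a full pipeline, one such vector per video would be passed to a classifier such as the linear support vector machine listed in the keywords.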

Original language	English
Title of host publication	Pattern Recognition and Computer Vision - First Chinese Conference, PRCV 2018, Proceedings
Editors	Jian-Huang Lai, Hongbin Zha, Jie Zhou, Cheng-Lin Liu, Tieniu Tan, Nanning Zheng, Xilin Chen
Publisher	Springer Verlag
Pages	180-190
Number of pages	11
ISBN (Print)	9783030033972
DOIs	https://doi.org/10.1007/978-3-030-03398-9_16
Publication status	Published - 2018
Externally published	Yes
Event	1st Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2018, Guangzhou, China (23 Nov 2018 → 26 Nov 2018)

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	11256 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	1st Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2018
Country/Territory	China
City	Guangzhou
Period	23/11/18 → 26/11/18

Keywords

Dense Trajectories
Fisher Vector
Linear support vector machine
MPEG flow video descriptor
Violence detection

ASJC Scopus subject areas

Theoretical Computer Science
General Computer Science

Cite this

Cai, H., Jiang, H., Huang, X., Yang, J., & He, X. (2018). Violence detection based on spatio-temporal feature and fisher vector. In J.-H. Lai, H. Zha, J. Zhou, C.-L. Liu, T. Tan, N. Zheng, & X. Chen (Eds.), Pattern Recognition and Computer Vision - First Chinese Conference, PRCV 2018, Proceedings (pp. 180-190). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11256 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-03398-9_16

@inproceedings{7629478cff9f4bd28255ac841eb7b657,

title = "Violence detection based on spatio-temporal feature and fisher vector",

keywords = "Dense Trajectories, Fisher Vector, Linear support vector machine, MPEG flow video descriptor, Violence detection",

author = "Huangkai Cai and He Jiang and Xiaolin Huang and Jie Yang and Xiangjian He",

note = "Publisher Copyright: {\textcopyright} Springer Nature Switzerland AG 2018.; 1st Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2018 ; Conference date: 23-11-2018 Through 26-11-2018",

year = "2018",

doi = "10.1007/978-3-030-03398-9\_16",

language = "English",

isbn = "9783030033972",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Verlag",

pages = "180--190",

editor = "Jian-Huang Lai and Hongbin Zha and Jie Zhou and Cheng-Lin Liu and Tieniu Tan and Nanning Zheng and Xilin Chen",

booktitle = "Pattern Recognition and Computer Vision - First Chinese Conference, PRCV 2018, Proceedings",

address = "Germany",

}

TY - GEN

T1 - Violence detection based on spatio-temporal feature and fisher vector

AU - Cai, Huangkai

AU - Jiang, He

AU - Huang, Xiaolin

AU - Yang, Jie

AU - He, Xiangjian

N1 - Publisher Copyright: © Springer Nature Switzerland AG 2018.

PY - 2018

Y1 - 2018

KW - Dense Trajectories

KW - Fisher Vector

KW - Linear support vector machine

KW - MPEG flow video descriptor

KW - Violence detection

UR - http://www.scopus.com/inward/record.url?scp=85057121608&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-03398-9_16

DO - 10.1007/978-3-030-03398-9_16

M3 - Conference contribution

AN - SCOPUS:85057121608

SN - 9783030033972

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 180

EP - 190

BT - Pattern Recognition and Computer Vision - First Chinese Conference, PRCV 2018, Proceedings

A2 - Lai, Jian-Huang

A2 - Zha, Hongbin

A2 - Zhou, Jie

A2 - Liu, Cheng-Lin

A2 - Tan, Tieniu

A2 - Zheng, Nanning

A2 - Chen, Xilin

PB - Springer Verlag

T2 - 1st Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2018

Y2 - 23 November 2018 through 26 November 2018

ER -
