Dynamic facial models for video-based dimensional affect estimation

Siyang Song; Enrique Sanchez-Lozano; Mani Kumar Tellamekala; Linlin Shen; Alan Johnston; Michel Valstar

doi:10.1109/ICCVW.2019.00200

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

20 Citations (Scopus)

Abstract

Dimensional affect estimation from a face video is a challenging task, mainly due to the large number of possible facial displays made up of a set of behaviour primitives including facial muscle actions. The displays vary not only in composition but also in temporal evolution, with each display composed of behaviour primitives that vary in their short- and long-term characteristics. Most existing work that models affect relies on complex hierarchical recurrent models unable to capture short-term dynamics well. In this paper, we propose to encode these short-term facial shape and appearance dynamics in an image, where only the semantically meaningful information is encoded into the dynamic face images. We also propose binary dynamic facial masks to remove 'stable pixels' from the dynamic images. This process allows filtering of non-dynamic information, i.e. only pixels that have changed in the sequence are retained. Then, the final proposed Dynamic Facial Model (DFM) encodes both filtered facial appearance and shape dynamics of an image sequence preceding the given frame into a three-channel raster image. A CNN-RNN architecture is tasked with modelling primarily the long-term changes. Experiments show that our dynamic face images achieved superior performance over the standard RGB face images on the dimensional affect prediction task.
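
The masking idea in the abstract (keep only pixels that changed over the sequence) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `threshold` parameter, the intensity-range test for "stable pixels", and the zero-mean linear temporal weights used as a stand-in for a dynamic image are all assumptions made for illustration. Frames are plain nested lists of grayscale floats so the example has no dependencies; a real pipeline would likely use an array library.

```python
def dynamic_mask(frames, threshold=0.05):
    """Return a binary mask that is True where a pixel changed over the sequence.

    frames: list of T grayscale frames, each a list of rows of floats.
    threshold: hypothetical minimum intensity range for a pixel to count
               as dynamic (an assumption, not a value from the paper).
    """
    height, width = len(frames[0]), len(frames[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [frame[y][x] for frame in frames]
            # A pixel is "stable" if its intensity range stays within the threshold.
            row.append(max(values) - min(values) > threshold)
        mask.append(row)
    return mask


def masked_dynamic_image(frames, threshold=0.05):
    """Collapse a frame sequence into one image and zero out stable pixels.

    The temporal summary is a weighted sum with zero-mean linear weights
    w_t = 2t - T - 1 (an illustrative choice): later frames count
    positively, earlier frames negatively, so constant pixels sum to zero.
    """
    num_frames = len(frames)
    weights = [2 * t - num_frames - 1 for t in range(1, num_frames + 1)]
    mask = dynamic_mask(frames, threshold)
    height, width = len(mask), len(mask[0])
    return [
        [
            sum(w * f[y][x] for w, f in zip(weights, frames)) if mask[y][x] else 0.0
            for x in range(width)
        ]
        for y in range(height)
    ]


# Three 2x2 frames: only the top-left pixel changes over time.
frames = [
    [[0.0, 0.3], [0.3, 0.3]],
    [[0.5, 0.3], [0.3, 0.3]],
    [[1.0, 0.3], [0.3, 0.3]],
]
print(dynamic_mask(frames))          # [[True, False], [False, False]]
print(masked_dynamic_image(frames))  # [[2.0, 0.0], [0.0, 0.0]]
```

In this toy input only the top-left pixel survives the mask, which mirrors the abstract's description of retaining only pixels that have changed in the sequence.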

Original language	English
Title of host publication	Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	1608-1617
Number of pages	10
ISBN (Electronic)	9781728150239
DOIs	https://doi.org/10.1109/ICCVW.2019.00200
Publication status	Published - Oct 2019
Externally published	Yes
Event	17th IEEE/CVF International Conference on Computer Vision Workshop, ICCVW 2019 - Seoul, Korea, Republic of; Duration: 27 Oct 2019 → 28 Oct 2019

Publication series

Name	Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019

Conference

Conference	17th IEEE/CVF International Conference on Computer Vision Workshop, ICCVW 2019
Country/Territory	Korea, Republic of
City	Seoul
Period	27/10/19 → 28/10/19

Keywords

Deep learning
Dimensional affect estimation
Facial dynamic modelling

ASJC Scopus subject areas

Computer Science Applications
Computer Vision and Pattern Recognition

Access to Document

10.1109/ICCVW.2019.00200

Cite this

Song, S., Sanchez-Lozano, E., Tellamekala, M. K., Shen, L., Johnston, A., & Valstar, M. (2019). Dynamic facial models for video-based dimensional affect estimation. In Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019 (pp. 1608-1617). Article 9022266 (Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCVW.2019.00200

Song, Siyang ; Sanchez-Lozano, Enrique ; Tellamekala, Mani Kumar et al. / Dynamic facial models for video-based dimensional affect estimation. Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019. Institute of Electrical and Electronics Engineers Inc., 2019. pp. 1608-1617 (Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019).

@inproceedings{168e671bebab4da3a5f580264c59ac9b,

title = "Dynamic facial models for video-based dimensional affect estimation",

abstract = "Dimensional affect estimation from a face video is a challenging task, mainly due to the large number of possible facial displays made up of a set of behaviour primitives including facial muscle actions. The displays vary not only in composition but also in temporal evolution, with each display composed of behaviour primitives that vary in their short- and long-term characteristics. Most existing work that models affect relies on complex hierarchical recurrent models unable to capture short-term dynamics well. In this paper, we propose to encode these short-term facial shape and appearance dynamics in an image, where only the semantically meaningful information is encoded into the dynamic face images. We also propose binary dynamic facial masks to remove 'stable pixels' from the dynamic images. This process allows filtering of non-dynamic information, i.e. only pixels that have changed in the sequence are retained. Then, the final proposed Dynamic Facial Model (DFM) encodes both filtered facial appearance and shape dynamics of an image sequence preceding the given frame into a three-channel raster image. A CNN-RNN architecture is tasked with modelling primarily the long-term changes. Experiments show that our dynamic face images achieved superior performance over the standard RGB face images on the dimensional affect prediction task.",

keywords = "Deep learning, Dimensional affect estimation, Facial dynamic modelling",

author = "Siyang Song and Enrique Sanchez-Lozano and Tellamekala, {Mani Kumar} and Linlin Shen and Alan Johnston and Michel Valstar",

note = "Publisher Copyright: {\textcopyright} 2019 IEEE.; 17th IEEE/CVF International Conference on Computer Vision Workshop, ICCVW 2019 ; Conference date: 27-10-2019 Through 28-10-2019",

year = "2019",

month = oct,

doi = "10.1109/ICCVW.2019.00200",

language = "English",

series = "Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "1608--1617",

booktitle = "Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019",

address = "United States",

}

Song, S, Sanchez-Lozano, E, Tellamekala, MK, Shen, L, Johnston, A & Valstar, M 2019, Dynamic facial models for video-based dimensional affect estimation. in Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019., 9022266, Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019, Institute of Electrical and Electronics Engineers Inc., pp. 1608-1617, 17th IEEE/CVF International Conference on Computer Vision Workshop, ICCVW 2019, Seoul, Korea, Republic of, 27/10/19. https://doi.org/10.1109/ICCVW.2019.00200

Dynamic facial models for video-based dimensional affect estimation. / Song, Siyang; Sanchez-Lozano, Enrique; Tellamekala, Mani Kumar et al.
Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019. Institute of Electrical and Electronics Engineers Inc., 2019. p. 1608-1617 9022266 (Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019).

TY - GEN

T1 - Dynamic facial models for video-based dimensional affect estimation

AU - Song, Siyang

AU - Sanchez-Lozano, Enrique

AU - Tellamekala, Mani Kumar

AU - Shen, Linlin

AU - Johnston, Alan

AU - Valstar, Michel

PY - 2019/10

Y1 - 2019/10

AB - Dimensional affect estimation from a face video is a challenging task, mainly due to the large number of possible facial displays made up of a set of behaviour primitives including facial muscle actions. The displays vary not only in composition but also in temporal evolution, with each display composed of behaviour primitives that vary in their short- and long-term characteristics. Most existing work that models affect relies on complex hierarchical recurrent models unable to capture short-term dynamics well. In this paper, we propose to encode these short-term facial shape and appearance dynamics in an image, where only the semantically meaningful information is encoded into the dynamic face images. We also propose binary dynamic facial masks to remove 'stable pixels' from the dynamic images. This process allows filtering of non-dynamic information, i.e. only pixels that have changed in the sequence are retained. Then, the final proposed Dynamic Facial Model (DFM) encodes both filtered facial appearance and shape dynamics of an image sequence preceding the given frame into a three-channel raster image. A CNN-RNN architecture is tasked with modelling primarily the long-term changes. Experiments show that our dynamic face images achieved superior performance over the standard RGB face images on the dimensional affect prediction task.

KW - Deep learning

KW - Dimensional affect estimation

KW - Facial dynamic modelling

UR - http://www.scopus.com/inward/record.url?scp=85082497480&partnerID=8YFLogxK

U2 - 10.1109/ICCVW.2019.00200

DO - 10.1109/ICCVW.2019.00200

M3 - Conference contribution

AN - SCOPUS:85082497480

T3 - Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019

SP - 1608

EP - 1617

BT - Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 17th IEEE/CVF International Conference on Computer Vision Workshop, ICCVW 2019

Y2 - 27 October 2019 through 28 October 2019

ER -

Song S, Sanchez-Lozano E, Tellamekala MK, Shen L, Johnston A, Valstar M. Dynamic facial models for video-based dimensional affect estimation. In Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019. Institute of Electrical and Electronics Engineers Inc. 2019. p. 1608-1617. 9022266. (Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019). doi: 10.1109/ICCVW.2019.00200
