TY - GEN
T1 - Augmented Feature Representation with Parallel Convolution for Cross-domain Facial Expression Recognition
AU - Yang, Fan
AU - Xie, Weicheng
AU - Zhong, Tao
AU - Hu, Jingyu
AU - Shen, Linlin
N1 - Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - Facial expression recognition (FER) has made significant progress in the past decade, but the inconsistency of distributions between different datasets greatly limits the generalization performance of a learned model on unseen datasets. Recent works align feature distributions between domains to improve cross-domain recognition performance. However, current algorithms use one output per layer for the feature representation, which cannot adequately represent the complex correlations among multi-scale features. To this end, this work proposes a parallel convolution to augment the representation ability of each layer, and introduces an orthogonal regularization so that each convolution represents independent semantics. With the assistance of a self-attention mechanism, the proposed algorithm can generate multiple combinations of multi-scale features, allowing the network to better capture the correlations among the outputs of different layers. The proposed algorithm achieves state-of-the-art (SOTA) performance in terms of average generalization performance on the task of cross-database (CD)-FER. When AFED or RAF-DB is used for training and the other four databases, i.e. JAFFE, SFEW, FER2013 and EXPW, are used for testing, the proposed algorithm outperforms the baselines by margins of 5.93% and 2.24%, respectively, in terms of average accuracy.
AB - Facial expression recognition (FER) has made significant progress in the past decade, but the inconsistency of distributions between different datasets greatly limits the generalization performance of a learned model on unseen datasets. Recent works align feature distributions between domains to improve cross-domain recognition performance. However, current algorithms use one output per layer for the feature representation, which cannot adequately represent the complex correlations among multi-scale features. To this end, this work proposes a parallel convolution to augment the representation ability of each layer, and introduces an orthogonal regularization so that each convolution represents independent semantics. With the assistance of a self-attention mechanism, the proposed algorithm can generate multiple combinations of multi-scale features, allowing the network to better capture the correlations among the outputs of different layers. The proposed algorithm achieves state-of-the-art (SOTA) performance in terms of average generalization performance on the task of cross-database (CD)-FER. When AFED or RAF-DB is used for training and the other four databases, i.e. JAFFE, SFEW, FER2013 and EXPW, are used for testing, the proposed algorithm outperforms the baselines by margins of 5.93% and 2.24%, respectively, in terms of average accuracy.
KW - Domain generalization
KW - Facial expression recognition
KW - Parallel convolution
KW - Self-attention
UR - http://www.scopus.com/inward/record.url?scp=85144582448&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-20233-9_30
DO - 10.1007/978-3-031-20233-9_30
M3 - Conference contribution
AN - SCOPUS:85144582448
SN - 9783031202322
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 297
EP - 306
BT - Biometric Recognition - 16th Chinese Conference, CCBR 2022, Proceedings
A2 - Deng, Weihong
A2 - Feng, Jianjiang
A2 - Zheng, Fang
A2 - Huang, Di
A2 - Kan, Meina
A2 - Sun, Zhenan
A2 - He, Zhaofeng
A2 - Wang, Wenfeng
PB - Springer Science and Business Media Deutschland GmbH
T2 - 16th Chinese Conference on Biometric Recognition, CCBR 2022
Y2 - 11 November 2022 through 13 November 2022
ER -