TY - GEN
T1 - Innovate Spatial-Temporal Attention Network (STAN) for Accurate 3D Mice Pose Estimation with a Single Monocular RGB Camera
AU - Gong, Liyun
AU - Yu, Miao
AU - Kashyap, Gautam Siddharth
AU - McCall, Sheldon
AU - Thota, Mamatha
AU - Ardakani, Saeid Pourroostaei
N1 - Publisher Copyright: © 2024 European Signal Processing Conference, EUSIPCO. All rights reserved.
PY - 2024
Y1 - 2024
AB - Precise 3D pose estimation of mice holds crucial importance across various scientific domains. In this research, we introduce an innovative model named the Spatial-Temporal Attention Network (STAN), specifically designed for accurate 3D pose estimation of mice using a single monocular camera. The STAN model leverages a sequence of extracted 2D skeletons to predict the 3D pose of a mouse. Through the incorporation of spatial and temporal attention modules, our STAN methodology adeptly captures intricate spatial and temporal relationships among key points, thereby enabling a comprehensive representation of the dynamic movements inherent in a mouse's behavior for precise 3D pose estimation. To assess the effectiveness of our proposed method, extensive experimental evaluations were undertaken. The results show the superior performance of the STAN model when compared to other state-of-the-art approaches within the realm of 3D mouse pose estimation.
KW - computer vision
KW - deep learning
KW - mice 3D pose estimation
KW - multi-head attention
KW - temporal/spatial information
KW - transformer
UR - http://www.scopus.com/inward/record.url?scp=85208435143&partnerID=8YFLogxK
U2 - 10.23919/eusipco63174.2024.10715128
DO - 10.23919/eusipco63174.2024.10715128
M3 - Conference contribution
AN - SCOPUS:85208435143
T3 - European Signal Processing Conference
SP - 616
EP - 620
BT - 32nd European Signal Processing Conference, EUSIPCO 2024 - Proceedings
PB - European Signal Processing Conference, EUSIPCO
T2 - 32nd European Signal Processing Conference, EUSIPCO 2024
Y2 - 26 August 2024 through 30 August 2024
ER -