TY - JOUR
T1 - Mde-EvoNAS
T2 - Automatic network architecture design for monocular depth estimation via evolutionary neural architecture search
AU - Yu, Zhihao
AU - Zhang, Haoyu
AU - Liu, Ruyu
AU - Dai, Sheng
AU - Chen, Xinan
AU - Sheng, Weiguo
AU - Jin, Yaochu
N1 - Publisher Copyright: © 2025 Elsevier B.V.
PY - 2025/3
Y1 - 2025/3
AB - The performance of monocular depth estimation models relies heavily on the features extracted by their encoder networks. Most previous methods reuse encoder architectures designed for image classification, which may be sub-optimal because of the gap between monocular depth estimation and image classification. However, manually designing task-specific encoders is difficult and tedious for users who lack extensive experience in deep learning. To address this problem, we propose a computationally efficient evolutionary neural architecture search framework that automatically designs encoders tailored to specific monocular depth estimation tasks. To improve the search efficiency of evolutionary optimization, we construct the search space as a supernet using one-shot NAS. In each generation, the supernet is stochastically trained based on the parent population, and each offspring individual inherits weights from the supernet for direct fitness evaluation. In addition, we introduce a multiscale spatial feature awareness module into the monocular depth estimation model to leverage channel-wise relationships and positional information derived from multiscale feature maps, enhancing the feature representations of objects across various scales. Experimental results demonstrate the superior performance of our method on both depth benchmark datasets (i.e., KITTI and NYU-Depth v2) and intestinal tissue depth estimation.
KW - Evolutionary algorithm
KW - Intestinal tissue depth estimation
KW - Monocular depth estimation
KW - Multiscale channel-spatial feature awareness module
KW - Neural architecture search
KW - SuperNet
UR - http://www.scopus.com/inward/record.url?scp=85214303300&partnerID=8YFLogxK
U2 - 10.1016/j.swevo.2024.101837
DO - 10.1016/j.swevo.2024.101837
M3 - Article
AN - SCOPUS:85214303300
SN - 2210-6502
VL - 93
JO - Swarm and Evolutionary Computation
JF - Swarm and Evolutionary Computation
M1 - 101837
ER -