TY - JOUR
T1 - TFDet: Target-aware fusion for RGB-T pedestrian detection
AU - Zhang, Xue
AU - Zhang, Xiaohan
AU - Wang, Jiangtao
AU - Ying, Jiacheng
AU - Sheng, Zehua
AU - Yu, Heng
AU - Li, Chunguang
AU - Shen, Hui Liang
PY - 2024/8/23
Y1 - 2024/8/23
N2 - Pedestrian detection plays a critical role in computer vision as it contributes to ensuring traffic safety. Existing methods that rely solely on RGB images suffer from performance degradation under low-light conditions due to the lack of useful information. To address this issue, recent multispectral detection approaches have combined thermal images to provide complementary information and have obtained enhanced performances. Nevertheless, few approaches focus on the negative effects of false positives (FPs) caused by noisy fused feature maps. Different from them, we comprehensively analyze the impacts of FPs on detection performance and find that enhancing feature contrast can significantly reduce these FPs. In this article, we propose a novel target-aware fusion strategy for multispectral pedestrian detection, named TFDet. The target-aware fusion strategy employs a fusion-refinement paradigm. In the fusion phase, we reveal the parallel- and cross-channel similarities in RGB and thermal features and learn an adaptive receptive field to collect useful information from both features. In the refinement phase, we use a segmentation branch to discriminate the pedestrian features from the background features. We propose a correlation-maximum loss function to enhance the contrast between the pedestrian features and background features. As a result, our fusion strategy highlights pedestrian-related features and suppresses unrelated ones, generating more discriminative fused features. TFDet achieves state-of-the-art performance on two multispectral pedestrian benchmarks, KAIST and LLVIP, with absolute gains of 0.65% and 4.1% over the previous best approaches, respectively. TFDet can easily extend to multiclass object detection scenarios. It outperforms the previous best approaches on two multispectral object detection benchmarks, FLIR and M3FD, with absolute gains of 2.2% and 1.9%, respectively. Importantly, TFDet has comparable inference efficiency to the previous approaches and has remarkably good detection performance even under low-light conditions, which is a significant advancement for ensuring road safety.
KW - Feature enhancement
KW - multispectral feature fusion
KW - RGB-T object detection
KW - RGB-T pedestrian detection
U2 - 10.1109/TNNLS.2024.3443455
DO - 10.1109/TNNLS.2024.3443455
M3 - Article
SN - 2162-237X
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
ER -