TY - GEN
T1 - HPMatte
T2 - 8th IEEE International Conference on Vision, Image and Signal Processing, ICVISP 2024
AU - Guan, Shouqin
AU - Lu, Yifan
AU - Fu, Yukun
AU - Mareta, Sannia
AU - Zhang, Zhiwang
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Image matting is a critical yet complex task in computer vision. Recent advancements in deep learning have significantly improved performance in this domain. However, the majority of these approaches rely on a trimap as an auxiliary input, which restricts their usability in practical applications. While some methods have attempted to remove the dependency on trimaps, their matting quality generally lags behind that of trimap-assisted models. The absence of trimap guidance often leads to foreground-background ambiguities and imprecise details in transition areas. To address these limitations, we introduce HPMatte, a novel matting framework combining Transformer and CNN (Convolutional Neural Network) architectures. This hybrid model achieves high-quality matting results without the requirement of a trimap input. The key component of our model is the Pyramid Semantic Block (PSB), which extracts features at different scales, fuses information from different resolutions, and preserves fine details of the foreground. This enables high-precision matting of natural images. Additionally, a background dataset called GBG-10k is introduced, which enhances the diversity of existing matting datasets. The approach is evaluated using two well-known benchmark datasets, i.e., AM-2k and P3M-10k. The experimental outcomes highlight the advantages of HPMatte compared to existing approaches.
AB - Image matting is a critical yet complex task in computer vision. Recent advancements in deep learning have significantly improved performance in this domain. However, the majority of these approaches rely on a trimap as an auxiliary input, which restricts their usability in practical applications. While some methods have attempted to remove the dependency on trimaps, their matting quality generally lags behind that of trimap-assisted models. The absence of trimap guidance often leads to foreground-background ambiguities and imprecise details in transition areas. To address these limitations, we introduce HPMatte, a novel matting framework combining Transformer and CNN (Convolutional Neural Network) architectures. This hybrid model achieves high-quality matting results without the requirement of a trimap input. The key component of our model is the Pyramid Semantic Block (PSB), which extracts features at different scales, fuses information from different resolutions, and preserves fine details of the foreground. This enables high-precision matting of natural images. Additionally, a background dataset called GBG-10k is introduced, which enhances the diversity of existing matting datasets. The approach is evaluated using two well-known benchmark datasets, i.e., AM-2k and P3M-10k. The experimental outcomes highlight the advantages of HPMatte compared to existing approaches.
KW - CNN
KW - deep learning
KW - natural image matting
KW - transformer
KW - trimap-free
UR - https://www.scopus.com/pages/publications/105004644330
U2 - 10.1109/ICVISP64524.2024.10959701
DO - 10.1109/ICVISP64524.2024.10959701
M3 - Conference contribution
AN - SCOPUS:105004644330
T3 - 2024 IEEE 8th International Conference on Vision, Image and Signal Processing, ICVISP 2024
BT - 2024 IEEE 8th International Conference on Vision, Image and Signal Processing, ICVISP 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 27 December 2024 through 29 December 2024
ER -