Abstract
3D generative adversarial network (GAN) inversion converts an image into a 3D representation to attain high-fidelity reconstruction and facilitate realistic image manipulation within the 3D latent space. However, previous approaches face a trade-off between reconstruction ability and editability. That is, inverting a real-world image to a low-dimensional latent code inevitably leads to information loss, while achieving near-perfect reconstruction with a high-rate triplane representation often limits the ability to manipulate the image freely in the latent space. To address these issues, we propose a novel latent-conditioning, encoder-based framework that aligns the low-dimensional latent with the high-dimensional triplane. A non-semantic guided editing strategy bridges the intrinsic relation between the latent condition and triplane generation, making it possible to edit the high-dimensional representation through latent manipulation. As a result, our method achieves high-fidelity reconstruction and editing simultaneously by directly controlling the latent code. Experimental results demonstrate that our approach outperforms previous 3D inversion methods in both reconstruction and editing quality. Furthermore, our method can edit real faces with large poses as well as out-of-domain cases.
| Original language | English |
|---|---|
| Journal | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
| Publication status | Published - 2025 |
| Externally published | Yes |
| Event | 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Hyderabad, India. Duration: 6 Apr 2025 → 11 Apr 2025 |
Keywords
- 3D GAN inversion
- Image manipulation
- Portrait editing
ASJC Scopus subject areas
- Software
- Signal Processing
- Electrical and Electronic Engineering