TextFace: Text-to-Style Mapping Based Face Generation and Manipulation

Xianxu Hou, Xiaokang Zhang, Yudong Li, Linlin Shen

Research output: Journal Publication › Article › peer-review

3 Citations (Scopus)


As a subtopic of text-to-image synthesis, text-to-face generation has great potential in face-related applications. In this paper, we propose a generic text-to-face framework, namely, TextFace, to achieve diverse and high-quality face image generation from text descriptions. We introduce text-to-style mapping, a novel method where the text description can be directly encoded into the latent space of a pretrained StyleGAN. Guided by our text-image similarity matching and face captioning-based text alignment, the textual latent code can be fed into the generator of a well-trained StyleGAN to produce diverse face images with high resolution (1024×1024). Furthermore, our model inherently supports semantic face editing using text descriptions. Finally, experimental results quantitatively and qualitatively demonstrate the superior performance of our model.
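The paper does not include reference code here, but the core idea of text-to-style mapping — a learned network that projects a text embedding directly into the latent space of a pretrained StyleGAN — can be sketched roughly. The sketch below is hypothetical: the two-layer MLP, the 512-dimensional text embedding (e.g. from a CLIP-style encoder, not shown), and all names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

class TextToStyleMapper:
    """Hypothetical mapper from a text embedding to a StyleGAN W latent.

    The 512-dim W space matches StyleGAN's convention; the text encoder
    that produces `text_emb` is assumed and omitted.
    """
    def __init__(self, dim_text=512, dim_w=512, hidden=512):
        # Small random init; in practice these weights would be trained
        # with text-image similarity and captioning-based alignment losses.
        self.W1 = rng.standard_normal((dim_text, hidden)) * 0.02
        self.b1 = np.zeros(hidden)
        self.W2 = rng.standard_normal((hidden, dim_w)) * 0.02
        self.b2 = np.zeros(dim_w)

    def __call__(self, text_emb):
        h = np.maximum(text_emb @ self.W1 + self.b1, 0.0)  # ReLU hidden layer
        return h @ self.W2 + self.b2                       # latent code w

mapper = TextToStyleMapper()
text_emb = rng.standard_normal(512)  # stand-in for an encoded caption
w = mapper(text_emb)                 # in the paper, w is fed to the frozen
print(w.shape)                       # StyleGAN generator -> (512,)
```

In the actual framework, the resulting latent code is passed to a frozen, well-trained StyleGAN generator to synthesize a 1024×1024 face; the training losses and the pretrained generator are omitted from this sketch.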

Original language: English
Pages (from-to): 3409-3419
Number of pages: 11
Journal: IEEE Transactions on Multimedia
Publication status: Published - 2023
Externally published: Yes


Keywords

  • GANs
  • cross modal
  • text-guided semantic face manipulation
  • text-to-face generation
  • text-to-image generation

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Media Technology
  • Computer Science Applications


