TextFace: Text-to-Style Mapping based Face Generation and Manipulation

Xianxu Hou, Xiaokang Zhang, Yudong Li, Linlin Shen

Research output: Journal Publication › Article › peer-review


As a sub-topic of Text-to-Image synthesis, Text-to-Face generation has great potential in face-related applications. In this paper, we propose a generic Text-to-Face framework, namely TextFace, to achieve diverse and high-quality face image generation from text descriptions. We introduce a novel method called Text-to-Style mapping, where the text description can be directly encoded into the latent space of a pretrained StyleGAN. Guided by our text-image similarity matching and face captioning-based text alignment, the textual latent code can be fed into a well-trained StyleGAN generator to produce diverse, high-resolution (1024×1024) face images. Furthermore, our model inherently supports semantic face editing using text descriptions. Finally, experimental results quantitatively and qualitatively demonstrate the superior performance of our model.
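The Text-to-Style idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the per-layer random linear maps stand in for a trained mapping network, `TEXT_DIM` and all function names are hypothetical, and the 18×512 latent layout is the extended latent space commonly used with 1024×1024 StyleGAN generators, assumed here rather than taken from the paper. The cosine-based loss is likewise only one plausible form of text-image similarity matching.

```python
import math
import random

NUM_LAYERS = 18   # assumed: style vectors consumed by a 1024x1024 StyleGAN generator
STYLE_DIM = 512   # assumed: dimensionality of each style vector
TEXT_DIM = 32     # hypothetical text-embedding size, chosen only for this sketch


def make_mapper(seed=0):
    """Build one random linear map per generator layer (stand-in for a trained network)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(TEXT_DIM)
    return [
        [[rng.gauss(0.0, scale) for _ in range(TEXT_DIM)] for _ in range(STYLE_DIM)]
        for _ in range(NUM_LAYERS)
    ]


def text_to_style(text_embedding, mapper):
    """Map a text embedding to a latent code: NUM_LAYERS vectors of STYLE_DIM floats."""
    return [
        [sum(w * x for w, x in zip(row, text_embedding)) for row in layer]
        for layer in mapper
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def matching_loss(text_embedding, image_embedding):
    """Hypothetical similarity loss: 0 when the embeddings align, 2 when opposite."""
    return 1.0 - cosine_similarity(text_embedding, image_embedding)
```

In a full system the latent code returned by `text_to_style` would be passed to the pretrained generator, and the loss would compare embeddings of the text and of the generated image produced by a joint text-image encoder; both components are omitted here.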

Original language: English
Journal: IEEE Transactions on Multimedia
Publication status: Accepted/In press - 2022
Externally published: Yes


Keywords

  • Codes
  • cross modal
  • Faces
  • GANs
  • Generative adversarial networks
  • Generators
  • Image synthesis
  • Semantics
  • text-guided semantic face manipulation
  • text-to-face generation
  • text-to-image generation
  • Training

ASJC Scopus subject areas

  • Signal Processing
  • Media Technology
  • Computer Science Applications
  • Electrical and Electronic Engineering

