LinkGAN: Linking GAN Latents to Pixels
for Controllable Image Synthesis

1HKUST    2Shanghai AI Laboratory    3Ant Group    4Alibaba Group

Precise local control achieved by LinkGAN.

Abstract

This work presents an easy-to-use regularizer for GAN training, which explicitly links some axes of the latent space to an image region or a semantic category (e.g., sky) in the synthesized output. Establishing such a connection enables convenient local control of GAN generation: users can alter image content within a spatial area simply by partially resampling the latent codes. Experimental results confirm four appealing properties of our regularizer, which we call LinkGAN.

(1) Any image region can be linked to the latent space, even if the region is pre-selected before training and fixed for all instances. (2) Two or more regions can be independently linked to different latent axes, surprisingly enabling tokenized control of synthesized images. (3) Our regularizer improves the spatial controllability of both 2D and 3D GAN models while barely sacrificing synthesis performance. (4) Models trained with our regularizer are compatible with GAN inversion techniques and maintain editability on real images.
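To make the local-control workflow concrete, below is a minimal PyTorch sketch of partial latent resampling. It assumes a LinkGAN-style generator whose training has linked a known subset of latent axes to a target region; the names generator, latent_dim, and linked_axes are illustrative placeholders, not part of any released API.

import torch

# Hypothetical setup: a 512-dimensional latent space in which the first 64
# axes were linked to a chosen region (e.g., the sky) during training.
latent_dim = 512
linked_axes = list(range(64))

def resample_region(z: torch.Tensor, axes) -> torch.Tensor:
    """Return a copy of z with only the linked axes redrawn from N(0, I)."""
    z_new = z.clone()
    z_new[:, axes] = torch.randn(z.shape[0], len(axes), device=z.device)
    return z_new

# Usage with a hypothetical pretrained generator:
# z = torch.randn(1, latent_dim)
# img_original = generator(z)                              # full synthesis
# img_edited = generator(resample_region(z, linked_axes))  # local edit
# Only the linked region should change; the rest of the image stays fixed.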

Results

Single Arbitrary Region on FFHQ

Single Arbitrary Region on AFHQ

Single Arbitrary Region on Church

Multiple Regions on FFHQ

Semantic Region on Church/Car

Arbitrary Region on EG3D

BibTeX

@inproceedings{zhu2022linkgan,
    title     = {LinkGAN: Linking {GAN} Latents to Pixels for Controllable Image Synthesis},
    author    = {Zhu, Jiapeng and Yang, Ceyuan and Shen, Yujun and Shi, Zifan and Dai, Bo and Zhao, Deli and Chen, Qifeng},
    booktitle = {International Conference on Computer Vision (ICCV)},
    year      = {2023}
}

The website template was adapted from Nerfies. We thank the authors for sharing the template.