High resolution face completion with multiple controllable attributes via fully end-to-end progressive generative adversarial networks

Journal article

Z. Chen, S. Nie, T. Wu, C. G. Healey
arXiv 1801.07632, 2018

Cite

APA
Chen, Z., Nie, S., Wu, T., & Healey, C. G. (2018). High resolution face completion with multiple controllable attributes via fully end-to-end progressive generative adversarial networks. ArXiv 1801.07632.

Chicago/Turabian
Chen, Z., S. Nie, T. Wu, and C. G. Healey. “High Resolution Face Completion with Multiple Controllable Attributes via Fully End-to-End Progressive Generative Adversarial Networks.” arXiv 1801.07632 (2018).

MLA
Chen, Z., et al. “High Resolution Face Completion with Multiple Controllable Attributes via Fully End-to-End Progressive Generative Adversarial Networks.” ArXiv 1801.07632, https://arxiv.org/abs/1801.07632, 2018.

BibTeX

@article{z2018a,
  title = {High resolution face completion with multiple controllable attributes via fully end-to-end progressive generative adversarial networks},
  year = {2018},
  journal = {arXiv 1801.07632},
  author = {Chen, Z. and Nie, S. and Wu, T. and Healey, C. G.},
  howpublished = {https://arxiv.org/abs/1801.07632}
}

Abstract

We present a deep learning approach for high resolution face completion with multiple controllable attributes (e.g., male and smiling) under arbitrary masks. Face completion entails understanding both structural meaningfulness and appearance consistency locally and globally to fill in "holes" whose content does not appear elsewhere in an input image. It is a challenging task, with the difficulty level increasing significantly with respect to high resolution, the complexity of "holes" and the controllable attributes of filled-in fragments. Our system addresses the challenges by learning a fully end-to-end framework that trains generative adversarial networks (GANs) progressively from low resolution to high resolution with conditional vectors encoding controllable attributes. We design novel network architectures to exploit information across multiple scales effectively and efficiently. We introduce new loss functions encouraging sharp completion. We show that our system can complete faces with large structural and appearance variations using a single feed-forward pass of computation with a mean inference time of 0.007 seconds for images at 1024 x 1024 resolution. We also perform a pilot human study that shows our approach outperforms state-of-the-art face completion methods in terms of rank analysis. The code will be released upon publication.
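
To make three ideas from the abstract concrete (conditional vectors encoding attributes, filling "holes" under a mask, and training from low to high resolution), here is a minimal pure-Python sketch. It is not the authors' code: the function names, the binary attribute encoding, the mask convention (1 marks a hole), the compositing rule, and the doubling schedule with a starting size of 4 are all illustrative assumptions.

```python
from typing import List

# Assumption: a grayscale image is a list of rows of floats in [0, 1].
Image = List[List[float]]


def attribute_vector(active: List[str], names: List[str]) -> List[float]:
    """Encode requested attributes as a binary conditional vector (hypothetical encoding)."""
    return [1.0 if name in active else 0.0 for name in names]


def composite(original: Image, generated: Image, mask: Image) -> Image:
    """Keep known pixels where mask == 0 and use generated pixels where mask == 1."""
    return [
        [m * g + (1.0 - m) * o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


def progressive_resolutions(start: int, final: int) -> List[int]:
    """Resolution schedule that doubles from `start` until it reaches `final`."""
    sizes = [start]
    while sizes[-1] < final:
        sizes.append(sizes[-1] * 2)
    return sizes


if __name__ == "__main__":
    names = ["male", "smiling", "eyeglasses"]  # hypothetical attribute list
    print(attribute_vector(["male", "smiling"], names))
    print(composite([[0.2, 0.4]], [[0.9, 0.8]], [[0.0, 1.0]]))
    print(progressive_resolutions(4, 1024))
```

In this sketch the mask only decides which pixels come from the generator output, so the known region of the input is passed through unchanged; whether the paper composites outputs this way is not stated in the abstract.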