ManiGAN: Text-Guided Image Manipulation

Li, Bowen; Qi, Xiaojuan; Lukasiewicz, Thomas; Torr, Philip H. S.

arXiv:1912.06203 (cs)

[Submitted on 12 Dec 2019 (v1), last revised 30 Mar 2020 (this version, v2)]

Abstract: The goal of our paper is to semantically edit parts of an image so that they match a given text describing the desired attributes (e.g., texture, colour, and background), while preserving other contents that are irrelevant to the text. To achieve this, we propose a novel generative adversarial network (ManiGAN), which contains two key components: a text-image affine combination module (ACM) and a detail correction module (DCM). The ACM selects image regions relevant to the given text and then correlates those regions with the corresponding semantic words for effective manipulation. Meanwhile, it encodes original image features to help reconstruct text-irrelevant contents. The DCM rectifies mismatched attributes and completes missing contents of the synthetic image. Finally, we suggest a new metric for evaluating image manipulation results, in terms of both the generation of new attributes and the reconstruction of text-irrelevant contents. Extensive experiments on the CUB and COCO datasets demonstrate the superior performance of the proposed method. Code is available at this https URL.
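
The abstract names an "affine combination" of text and image features but does not give its formula. As a rough illustration only, the sketch below assumes one plausible reading: text-derived hidden features are scaled and shifted element-wise by values computed from the image features. The function name affine_combine and the toy scale and shift functions are hypothetical stand-ins for learned layers; they are not taken from the authors' code.

```python
from typing import Callable, List

Vector = List[float]


def affine_combine(
    hidden: Vector,
    image_feat: Vector,
    scale_fn: Callable[[Vector], Vector],
    shift_fn: Callable[[Vector], Vector],
) -> Vector:
    """Return hidden * scale_fn(image_feat) + shift_fn(image_feat), element-wise.

    Assumption: this is only one reading of "affine combination"; the
    abstract itself does not state the exact formula.
    """
    scale = scale_fn(image_feat)
    shift = shift_fn(image_feat)
    if not (len(hidden) == len(scale) == len(shift)):
        raise ValueError("hidden, scale, and shift must have the same length")
    return [h * w + b for h, w, b in zip(hidden, scale, shift)]


# Toy stand-ins for learned layers (hypothetical, for illustration only).
def toy_scale(v: Vector) -> Vector:
    return [2.0 * x for x in v]


def toy_shift(v: Vector) -> Vector:
    return [x + 1.0 for x in v]


hidden = [1.0, 0.5, -1.0]      # text-derived hidden features (made-up values)
image_feat = [0.0, 1.0, 2.0]   # image features (made-up values)
combined = affine_combine(hidden, image_feat, toy_scale, toy_shift)
print(combined)  # [1.0, 3.0, -1.0]
```

In this toy setup, an image feature of zero yields a scale of zero, so the text feature at that position is suppressed and only the image-derived shift remains. This loosely mirrors the abstract's point that the module both selects text-relevant regions and carries original image information forward.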

Comments:	CVPR 2020
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:1912.06203 [cs.CV]
	(or arXiv:1912.06203v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1912.06203

Submission history

From: Bowen Li
[v1] Thu, 12 Dec 2019 20:48:52 UTC (9,535 KB)
[v2] Mon, 30 Mar 2020 19:42:35 UTC (17,838 KB)
