Advancements in Communication and Systems

Text-to-Image Generation using Generative Adversarial Network

Authors: Chitvan Jamdagni, Vasu Sharma, Jatin Goyal, Divyansh, Baddam Nikhil Kumar Reddy and Payal Thakur


Publishing Date: 27-02-2024

ISBN: 978-81-955020-7-3

DOI: https://doi.org/10.56155/978-81-955020-7-3-19

Abstract

Text-to-Image Generation with Generative Adversarial Networks (GANs) is a deep learning approach that generates images from text descriptions. It has a significant impact on a wide range of applications, including photo searching, photo editing, art creation, computer-aided design, image reconstruction, captioning, and portrait drawing. The most challenging task is to consistently produce realistic images that match the given conditions, and existing text-to-image generation algorithms often produce images that do not accurately reflect the text. The proposed model was trained on the Caltech-UCSD Birds-200-2011 dataset, and its performance was assessed using the Inception Score and PSNR. The proposed StackGAN architecture consists of two stages. Stage-I GAN sketches the basic shape and colors of the object from the input text description and produces a low-resolution image. Stage-II GAN takes the Stage-I result and the text description as inputs, corrects defects and adds fine details, and produces a high-resolution, photo-realistic image.
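As a concrete illustration of the two-stage pipeline described above, the sketch below shows a minimal PyTorch version of a StackGAN-style generator pair: Stage-I maps a text embedding and a noise vector to a coarse 64x64 image, and Stage-II refines it, again conditioned on the text, into a 256x256 image. All layer sizes, module names (StageIGenerator, StageIIGenerator), and the dimensions TEXT_DIM and Z_DIM are illustrative assumptions, not the exact configuration used in the paper.

```python
# Minimal sketch of a two-stage, text-conditioned generator pipeline
# (StackGAN-style). Layer sizes and names are assumptions for illustration.
import torch
import torch.nn as nn

TEXT_DIM = 256   # dimensionality of the text embedding (assumed)
Z_DIM = 100      # dimensionality of the noise vector (assumed)


class StageIGenerator(nn.Module):
    """Roughs out a low-resolution (64x64) image from text embedding + noise."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(TEXT_DIM + Z_DIM, 128 * 8 * 8)
        self.upsample = nn.Sequential(
            nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),   # 8 -> 16
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),    # 16 -> 32
            nn.BatchNorm2d(32), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),     # 32 -> 64
            nn.Tanh(),
        )

    def forward(self, text_emb, noise):
        x = self.fc(torch.cat([text_emb, noise], dim=1))
        x = x.view(-1, 128, 8, 8)
        return self.upsample(x)                                    # (N, 3, 64, 64)


class StageIIGenerator(nn.Module):
    """Refines the Stage-I image, again conditioned on the text, to 256x256."""

    def __init__(self):
        super().__init__()
        self.encode = nn.Sequential(                               # 64 -> 16
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.fuse = nn.Conv2d(128 + TEXT_DIM, 128, 3, padding=1)
        self.decode = nn.Sequential(
            nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),   # 16 -> 32
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),    # 32 -> 64
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1),    # 64 -> 128
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1),     # 128 -> 256
            nn.Tanh(),
        )

    def forward(self, low_res_img, text_emb):
        feat = self.encode(low_res_img)                            # (N, 128, 16, 16)
        # Spatially replicate the text embedding and fuse it with image features,
        # so Stage-II can add details that the caption describes.
        txt = text_emb[:, :, None, None].expand(-1, -1, feat.size(2), feat.size(3))
        feat = self.fuse(torch.cat([feat, txt], dim=1))
        return self.decode(feat)                                   # (N, 3, 256, 256)


if __name__ == "__main__":
    text_emb = torch.randn(4, TEXT_DIM)   # stand-in for an encoded caption
    noise = torch.randn(4, Z_DIM)
    low_res = StageIGenerator()(text_emb, noise)
    high_res = StageIIGenerator()(low_res, text_emb)
    print(low_res.shape, high_res.shape)  # (4, 3, 64, 64) and (4, 3, 256, 256)
```

In a full training setup, each generator would be paired with its own discriminator and trained adversarially; the sketch only traces the forward pass from caption embedding to high-resolution image.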

Keywords

GAN, Generative Adversarial Networks, Deep Learning, Text-to-Image Generation (T2I).

Cite as

Chitvan Jamdagni, Vasu Sharma, Jatin Goyal, Divyansh, Baddam Nikhil Kumar Reddy and Payal Thakur, "Text-to-Image Generation using Generative Adversarial Network", In: Ashish Kumar Tripathi and Vivek Shrivastava (eds), Advancements in Communication and Systems, SCRS, India, 2024, pp. 209-217. https://doi.org/10.56155/978-81-955020-7-3-19
