New Tools for Text-to-Image Generation

Generating an image from a line of text entirely by means of computer algorithms has been possible for the last few years. Newly invented tools are yielding results that keep getting more interesting.

The images can be hauntingly surrealistic, such as this one, which was generated by the phrase “when the wind blows.” 

Image courtesy The Big Sleep (source: @advadnoun on Twitter)

It's a little blurry and out of focus, with tendrils of downy fluff waving in dim light. It seems more like a photograph than a painting, but really it's a new category of image, made by computer software drawing from big data sets. 

Lately people's imaginations have been captured by tools such as VQ-GAN and CLIP.



Prompt: “a face like an M.C. Escher drawing” from The Big Sleep (source: @advadnoun on Twitter)

Some of the results are compelling and intriguing, seemingly intelligent in a weird non-human way, as if you're looking into an alien's mind. Is that a face on its side, an eye, a nose, a mouth? Are those textures fingerprints? 

Prompt: “The Yellow Smoke That Rubs Its Muzzle On The Window-Panes” 
from VQ-GAN+CLIP (source: @RiversHaveWings on Twitter)

Each solution has a visual logic of theme and variation that's carried throughout the image. It's certainly not random. 

Prompt: “A Series Of Tubes” from VQ-GAN+CLIP (source: @RiversHaveWings on Twitter)

Many of the images from this system have a surrealistic patchwork appearance resembling Cubism, where extracted fragments are juxtaposed across the picture plane, but the 3D space doesn't make sense as a real scene.


(source: @ak92501 on Twitter)

Some of the creativity of this enterprise derives from the odd juxtapositions of the words in the prompts. The results are often effective with long prompts. The phrase for the image above is “a small hut in a blizzard near the top of a mountain with one light turn on at dusk trending on artstation | unreal engine”

In recent weeks, people writing prompts realized you can get the system to yield a more detailed style if you say "trending on artstation."  

Prompt: "matte painting of someone reading papers and burning the midnight oil | trending on artstation" 
by Twitter user @ak92501
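
For readers curious how prompts like these are put together, here is a minimal Python sketch. It only assembles the text string and does not call VQ-GAN or CLIP. The function name `build_prompt` is hypothetical, and joining the parts with " | " is an assumption modeled on the prompts quoted above, not a documented rule of any particular tool.

```python
def build_prompt(subject, modifiers=()):
    """Join a subject phrase and optional style modifiers with ' | '.

    Hypothetical helper for illustration only; the ' | ' separator
    mirrors the prompts quoted in this post.
    """
    parts = [subject.strip()]
    # Skip empty modifiers so the result has no dangling separators.
    parts.extend(m.strip() for m in modifiers if m.strip())
    return " | ".join(parts)


prompt = build_prompt(
    "matte painting of someone reading papers and burning the midnight oil",
    ["trending on artstation"],
)
print(prompt)
```

Swapping the modifier list (for example, adding "unreal engine") is an easy way to compare how different style phrases change the result for the same subject.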

I expect that with time the results will be accepted alongside human efforts, beginning perhaps with categories like motel art, Twitter avatars, and corporate clip art. They will take their place on Instagram alongside the work of painters and photographers. Many of the innovators in this field write their own code and come up with remarkably creative prompts, so it makes sense to think of them as artists.

As a viewer, I'm not quite sure how to respond emotionally to something that looks like art, but which didn't pass through a human consciousness.

As an artist, I'm not worried about my job. Maybe it's a vain hope, but I feel like people will always want to see images made by a human hand and filtered through a human brain rather than ones made by an unfeeling machine. The question is whether eventually we'll be able to tell the difference.
--
Thanks, Chris!

Resources to learn more:
• UC Berkeley blog post, which is a good overview of techniques: Alien Dreams: An Emerging Art Scene
• Twitter account "Images.AI", which plays with these natural language prompts and some of the same tools.