Innovation

A Pixel of Truth? AI Art and Unlimited Possibilities

“A picture is worth a thousand words.” —Fred R. Barnard

Barnard’s adage is still true, but it deserves a coda: Today, a picture can be made with only five words.

Enter DALL-E 2.

A portmanteau of the artist Salvador Dalí and the Pixar character WALL-E, DALL-E 2 is OpenAI’s powerful text-to-image generator. Type anything you want to see, say, “An astronaut poolside in the style of vector art,” or “A military airplane flying over water during sunset with a ship behind, polaroid,” and within seconds you have four versions of the thing you are trying to visualize. You can then download any of the four images or ask for variations on a specific one, and within seconds more pictures arrive.

“A military airplane flying over water during sunset with a ship behind, polaroid,” created by DALL-E 2.

And let’s be clear, it’s not searching the internet for your description. This isn’t Google on steroids. No. The program is making an image of something or someone that doesn’t exist, and it does so in seconds. DALL-E 2 has been trained on hundreds of millions of images and artistic styles, and when prompted, it weaves together pixels to create a new picture.

After signing up on the waiting list a few weeks ago, I recently gained access to DALL-E 2. And like thousands of other users, I’m amazed at its capability.

It’s also scary.

Text-to-image generators can create violent images, propaganda, sexually explicit content, and other not-so-nice things. DALL-E 2, however, has filters that stop you from seeing “A photo of a president ice skating in their underwear.” No. That won’t fly. As OpenAI says, your prompts must be “G Rated.”

But DALL-E 2 is not the only text-to-image generator. There are other options that are either in beta or coming to market. Some, like Midjourney, will compete with DALL-E 2 in output quality, while others produce less-than-desired results. Their terms of service might also differ, with some text-to-image generators allowing prompts for realistic images of real-world people or for mature material.

I’ll walk through some prompts to show you the capability of the program. For this audience, the pictures I created used prompts that have a naval theme. Still, the prompts you write and the images the program returns are limited only by your imagination and the program’s terms of service. And at the end, I’ll finish with some thoughts on the good, the bad, and how we might think about this capability going forward.

AI Art, Styles, and Prompts

For a text-to-image AI, a simple request for a picture doesn’t always produce the best results. In fact, it rarely does. The key to getting what you want out of these programs is understanding how to ask, that is, how to write prompts for the images you want to see. Thankfully, OpenAI published a prompt book that shows the kinds of prompts you can write. The list is huge: you can ask for pictures in the style of oil paintings, watercolor, vector art, stained glass, cartoons, line art, pixel art, photo realism, charcoal, abstract art, and much more. It can copy the style of a particular artist. It can even emulate the color and look of a particular movie.

DALL-E 2 shines when asked to create pictures in the style of oil paintings. Whaling villages, aircraft carriers, portraits, war at sea: I tried them all, in the styles of Van Gogh, Rembrandt, Dalí, Velázquez, da Vinci, Klimt, and Picasso. Admirers of these artists will spot the differences between the artists and the AI in the brush strokes, lighting, and composition. DALL-E 2 isn’t perfect; these pictures approximate the style of the artist. Still, the effects are impressive.

Let’s begin.

An oil portrait of a sailor in the style of Rembrandt:

 

An oil painting of an aircraft carrier in the style of da Vinci:

 

What about an aircraft carrier in the style of da Vinci’s notebooks?

Or, “Admiral Horatio Nelson as a muppet, in the style of Pablo Picasso, Blue Period, cubism”:

“An oil painting of a whaling village, piers, ships, sailors walking around, in the style of Rembrandt, wide landscape”:

Now, what about an oil painting of a naval war in the style of Gustav Klimt? (You could have told me it was a painting of the night action in the Solomon Islands campaign and I would have believed you. Here it’s Klimt’s color palette that DALL-E 2 captures.)

DALL-E 2 can also easily create pictures showing facial expressions. Here’s “an oil portrait of a young sailor with a grin and looking off to the right, amused, in the style of Salvador Dali.”

And with a little Photoshop knowledge, you can download the original image, shrink it to create blank canvas space around it, upload it to DALL-E 2, and then prompt the AI to expand the image with a focus on the background, turning the portrait into a complete piece of artwork:

“An oil portrait of a young sailor with a grin and looking off to the right, amused, with an ocean and ship in the background, in the style of Salvador Dali,” created by DALL-E 2.
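
For readers who want to see the arithmetic behind the shrink-and-expand step, here is a minimal Python sketch. It only computes where a shrunken copy of the picture would sit on a canvas of the original size; the actual resizing still happens in an image editor. The function name `shrink_placement`, the centered placement, the 1024 x 1024 canvas, and the 50 percent scale are all assumptions made for illustration, not settings taken from the workflow described above.

```python
def shrink_placement(canvas_w, canvas_h, scale):
    """Return (new_w, new_h, left, top) for a shrunken copy centered on the canvas.

    Hypothetical helper: centering is an assumption; the copy could just as
    well be placed off-center to leave more blank space on one side.
    """
    if not 0 < scale <= 1:
        raise ValueError("scale must be greater than 0 and at most 1")
    new_w = round(canvas_w * scale)
    new_h = round(canvas_h * scale)
    # Split the leftover space evenly so the blank border surrounds the image.
    left = (canvas_w - new_w) // 2
    top = (canvas_h - new_h) // 2
    return new_w, new_h, left, top


# Assumed example values: a 1024 x 1024 canvas and a copy shrunk to half size.
placement = shrink_placement(1024, 1024, 0.5)
print(placement)
```

Everything outside the placed copy is the blank canvas space that the AI is then asked to fill in.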

Other styles, stained glass and watercolor, for instance, also did well. (Indeed, the stained glass image of the two sailors fighting in the foreground with the ship’s masts in the background is phenomenal.)

An F-18 taking off in the style of watercolor:

It can easily handle a touch of silliness. In this case, let’s create a painting of a sailor playing poker with a rabbit.

 

Or what about some “sailors on a dance floor at a hooka party in the style of VanGogh.”

Photo Realism

Let’s turn to photo realism.

DALL-E 2 can create photo realistic images of varying quality. The key is for the user to specify elements of the picture, including the photo’s camera settings (e.g., focal length, aperture, and even lens maker) and the overall feel of the image you are trying to create (e.g., moody, sunrise, dramatic light). These details will help you create some of the best photo realistic images.
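
To make that structure concrete, here is a minimal Python sketch that assembles a photo-style prompt from a subject plus optional framing, lighting, lens, aperture, and mood terms. The function `build_photo_prompt` and its parameter names are hypothetical, invented for this example; they are not part of DALL-E 2 or any OpenAI tool, which simply receives the finished sentence.

```python
def build_photo_prompt(subject, framing=None, lighting=None,
                       lens=None, aperture=None, mood=None):
    """Join a subject and optional photographic details into one prompt string.

    Hypothetical helper for illustration only; the order of the details is
    an assumption, not a documented requirement of any image generator.
    """
    parts = [subject, framing, lighting, lens, aperture, mood]
    # Skip any detail left unspecified so the prompt has no empty slots.
    return ", ".join(part for part in parts if part) + "."


prompt = build_photo_prompt(
    "An F-18 taking off from a runway",
    framing="wide angle",
    lighting="dramatic back lighting",
    lens="Sigma 85mm",
    aperture="f2.8",
    mood="sunrise",
)
print(prompt)
```

Leaving a detail out just shortens the prompt, which makes it easy to test one change at a time, such as swapping the mood or the lens.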

An F-18 taking off at sunrise? Sure.

 

 

“An F-18 taking off from a runway, wide angle, dramatic back lighting, Sigma 85mm, f2.8, sunrise.”

What about the inside of a warship? Let’s add some smoke and computer screens.

 

“An image of the inside of a warship, full of smoke, computer screens, moody, Sigma 18mm, f8.0.”

OK, what about black and white? Easy.

 

“An image of the inside of a warship, full of smoke, computer screens, black and white, Sigma 18mm, f8.0,” created by DALL-E 2.

Now, let’s take out the smoke and add a water bottle on a table:

“An image of the inside of a warship, computer screens, moody, water bottle on a table, Sigma 18mm, f8.0.”

 

Not All Perfect

DALL-E 2 struggles in a few areas. It can’t spell. It can’t make accurate maps. And while it’s great at making logos, it can’t create the text that accompanies those graphics. It also has trouble making a “clean” picture of a human. By that I mean the composition of pixels often creates asymmetric features. If you look closely at the three daguerreotype pictures that were used as the cover photo for this post, you can see there’s something wrong with their eyes. Still, I decided to tone down the prompt asking for a “big grin” to see if that would return an image that looked better. It did. The style of daguerreotype (slower shutter speeds with objects out of focus, the sword in this case) could easily lead someone to believe that these two men existed once upon a time. Their eyes are in focus just enough, symmetrical just enough, to lead you to believe that any imperfections are simply the accumulation of time and wear on that image in someone’s box of keepsakes.

 

I did try one last thing. I tried to put some missiles on the littoral combat ship. Seriously. I did. But, sadly, that didn’t work.

(I don’t know what this is, but it doesn’t have any missiles.) Created by DALL-E 2.

 

Final Thoughts

Text-to-image generators will get better. I suspect we’ll see publications, artists, and others use them for many things: thumbnails for articles, concept art to inspire others, and, sadly but predictably, nefarious purposes ranging from propaganda to online bullying.

Some pictures are meant to inform, some persuade, some entertain, while others help us remember the past or tell a story. The ability to do any of these things within seconds and without the help of a professional graphic artist will change our world. How big or small that change will be remains to be seen.

This summer, the publishing world hit a few firsts with text-to-image AIs creating the covers of The Economist and Cosmopolitan magazine (the Cosmo cover doesn’t seem like their brand, but OK, whatever). And The Atlantic used a text-to-image generator to create an illustration for one of its pieces on the Alex Jones trial. So it’s here. To close, can you imagine a “synthwave style of a warship with sun reflection on the sea with a combat ship on the water, digital art” as a cover photo for Proceedings? I can.
