One of the criticisms of AI art, or generative art, is that you type in a few words and the machine spits out a beautiful image stolen from other artists. I’m working on another post about the technology behind AI art. But, in this post, I want to examine AI art as a craft, as a skill.
Of course, the process, craft, and skills are not the same as those of traditional art. AI art does not take years of practice. But that’s also the nature of artificial intelligence. And I will have a lot more to say about how AI will change everyone’s lives and work in the 21st century. AI is not going away. In fact, AI is on an exponential growth curve. Everyone should seriously consider how AI will impact their work. Big changes are coming. Those changes are not the topic of this essay, but I feel the need to say that for those who hate what is happening in AI art.
Let’s look at AI art as an iterative process by talking through how I ended up with this specific image:
The inspiration came from reading Dante’s Divine Comedy. In canto 14 of the Inferno there’s the parable of the Old Man of Crete and the line “who looks at Rome as it were his mirror.” I wanted to make a variation on that scene. I started with the prompt:
a withered mountain hiding a tearful, cracked old man looking at Rome in a mirror --v 4
That prompt resulted in this initial set of images:
This first grid of images was not particularly good. The first image in the grid, in the upper left, has possibilities, but none of these appealed to me. So much for just typing in a few words and getting a spectacular image.
I rerolled to get a new grid of 4 images. (Rerolling is the process of generating a new set of images that often go in a different style):
I didn’t find any of these very satisfying, either.
I modified the prompt to emphasize certain words; I was still in my early days of working with Midjourney and had not yet figured out how best to manage prompts:
a withered mountain::4 hiding a tearful, cracked old man::2 looking at Rome in a mirror::1 --v 4
This prompt resulted in some nice images of mountains, which makes sense: the 4 after the double colon gives additional weight to the phrase before it, “a withered mountain”:
Good concept art but not what I was looking to accomplish.
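To see why the mountain took over, it helps to look at the weights as shares of a whole. The sketch below is my own illustration, not Midjourney's parser: the function names `parse_weighted_prompt` and `relative_shares` are made up for this post, and it assumes that each number after `::` applies to the text before it, that a missing number counts as 1, and that weights are positive and only matter relative to each other.

```python
import re

# A number directly after "::" is read as the weight of the preceding text.
WEIGHT = re.compile(r"-?\d+(?:\.\d+)?")


def parse_weighted_prompt(prompt):
    """Split a '::'-weighted prompt into (text, weight) pairs and trailing parameters."""
    # Assumption: everything from the first " --" onward is parameters such as "--v 4".
    body, sep, params = prompt.partition(" --")
    pieces = body.split("::")
    segments = []
    text = pieces[0].strip()
    for piece in pieces[1:]:
        match = WEIGHT.match(piece)
        # Assumption: a "::" with no number behaves like a weight of 1.
        weight = float(match.group()) if match else 1.0
        if text:
            segments.append((text, weight))
        # Whatever follows the number is the start of the next segment.
        text = piece[match.end():].strip() if match else piece.strip()
    if text:
        # Trailing text with no "::" after it also defaults to a weight of 1.
        segments.append((text, 1.0))
    return segments, (sep.strip() + params) if sep else ""


def relative_shares(segments):
    """Return each segment's weight as a fraction of the total (positive weights assumed)."""
    total = sum(weight for _, weight in segments)
    return [(text, weight / total) for text, weight in segments]


prompt = ("a withered mountain::4 hiding a tearful, cracked old man::2 "
          "looking at Rome in a mirror::1 --v 4")
segments, params = parse_weighted_prompt(prompt)
for text, share in relative_shares(segments):
    print(f"{share:5.1%}  {text}")
```

Under those assumptions, the mountain phrase carries 4 of the 7 total weight units, a little over half the prompt, while the mirror gets only 1 of 7. That lines up with a grid full of mountains and no old man.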
I modified the prompt a bit more in an attempt to give more emphasis to the words “man” and “mirror”:
a withered mountain::4 hiding a tearful, cracked old man::3 looking at Rome in a mirror::2 --v 4
And I got a new grid of 4 images that gave me 4 more mountains. Again, nice concept art but not what I was looking for in this scene:
I do like all these images as landscapes.
I rerolled to get another grid of mountains:
Again, fantastic concept art of landscapes but what happened to the old man?
I thought about that and changed the prompt to take the emphasis off of mountain:
a withered mountain hiding a tearful, cracked old man::3 looking at Rome in a mirror::2 --v 4
Now, this is when I got excited.
Number 3, in the lower left, is absolutely spectacular.
I love that image.
But I decided to reroll the whole set to see what else would come out:
Okay, this is going somewhere that I really like. Let’s look more closely at the 4th image, lower right corner:
This image is superb. I love the texture on the side of the face. But I rerolled the set again to get another grid:
That’s it. As soon as I saw the 4th image (lower right corner) in the grid, I knew I had the one I wanted. But rather than stopping and upscaling that image, I chose to see what variations came up for it.
After some deliberation, I chose to upscale the 2nd image (upper right) in the grid.
And then I did a beta upscale redo on this image:
But I wasn’t done yet. I then chose to do the detailed upscale redo. And that gave me the final image.
The process of developing this image took 12 steps. That’s 12 steps of making decisions: writing the prompt, modifying the prompt, rerolling the image grid again and again, selecting a candidate, upscaling it, and then running the beta upscale and the detailed upscale.
The upscaling is not always a solution. Sometimes upscaling loses a lot of the details that you really loved about an image. Sometimes upscaling enhances an image wonderfully.
As I review my process for this image, I realize that I did not go one step further. I did not try the remaster option in my original process. I’m going to try that and see what happens.
I don’t care for the remaster versions. I could try remastering again and see where it goes. But you have to know when to stop. You can keep rerolling and remastering for hours, especially if you are coming up with some very good image variations. Once you hit on a specific technique that is working, the process becomes addictive.
Generating art through artificial intelligence algorithms is a medium. You have to learn the techniques, you have to practice, you have to make decisions, editorial and artistic choices. If you stop too soon, you may not reach that amazing work that’s hidden among the algorithms. If you go too far, you may destroy the very thing you love. Fortunately, you can always step back to previous steps.
Enjoy the process! Art is all about process. If you enjoy working with AI art, then that’s all that matters.