GIANT AI Guide for Educators: How Image Generative AI Works

In the “past,” machines used to come with a manual. Designers of these machines knew exactly what the machine was capable of, how it worked, what it could do, and what it could not. So designers would write manuals explaining exactly how “you,” the human user, could use the machine efficiently and effectively.

Well… that’s not the case anymore with AI-powered tools! There is no manual because the designers of these machines don’t actually know what “exactly” the machine has learned and how “exactly” it does things. Sure, we may know an AI model is good at painting (aka generating images), but we don’t exactly know how it does it. If you want to “collaborate” with these machines to create things (like an image of an avatar you have in mind), you need to work with these tools to get to know their capabilities, test them out, and learn the best ways to prompt them — similar to how you would collaborate with a real person. Getting to know your team members, their capabilities, and their communication preferences is the key to a successful collaboration. It is important to note that even though AI-powered tools are designed so that their users “communicate” with them using natural human language (instead of pressing buttons or using code), they are simply tools with their capabilities and limitations.

We do have some background information about AI tools that are trained to generate images. Here is what we know: AI image generators built on diffusion models, such as Stable Diffusion, start with a very noisy image. What does that mean?

Here, to the left, is an image of a cat: a “clean” one with lots of clear details. To the right is a noisy image; you can see the “noise” has obscured part of the cat image. But you can still see it as a cat. It’s like trying to see a cat in a super foggy or smoky room. You can “almost” see a cat, but your vision is “noisy” and obscured.

Here is a noisier image of our cat! Things are becoming “noisy,” but you can still see the cat.

How about now?

This one is a very noisy image that doesn’t really resemble anything. Surprisingly, this is what AI image generators start with when generating a new image. It’s drastically different from how you may start a new painting! Humans typically start with a blank, clean canvas, and AI tools start with a noisy canvas. Interesting!
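For readers who like to tinker, the “adding noise” idea above can be sketched in a few lines of Python. This is only a toy illustration, not how any particular tool is built: the tiny 4x4 grid `cat` is a made-up stand-in for a real picture, and `add_noise` and `noise_level` are hypothetical names chosen for this example.

```python
import math
import random


def add_noise(image, noise_level, seed=0):
    """Blend a grayscale image with random Gaussian noise.

    image: rows of pixel values (floats).
    noise_level: 0.0 keeps the image as is; 1.0 replaces it with pure noise.
    """
    rng = random.Random(seed)
    keep = math.sqrt(1.0 - noise_level)  # how much of the picture survives
    mix = math.sqrt(noise_level)         # how much noise is blended in
    return [
        [keep * pixel + mix * rng.gauss(0.0, 1.0) for pixel in row]
        for row in image
    ]


# A made-up 4x4 "picture" standing in for the cat photo (1.0 = bright, 0.0 = dark).
cat = [
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
]

slightly_noisy = add_noise(cat, 0.2)  # the cat is still easy to see
very_noisy = add_noise(cat, 0.8)      # the cat is mostly hidden
pure_noise = add_noise(cat, 1.0)      # nothing of the cat is left
```

At `noise_level=1.0` the result no longer depends on the picture at all, which matches the “noisy canvas” that image generators start from.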

Ok, now to “clean up” the noisy image, AI tools pay attention to your “prompts” with many “attention heads.” We don’t know exactly how the tool makes sense of your instructions (remember, this machine doesn’t come with a manual), but we know it does have the capability to “understand” your prompts. Your prompt (aka instructions) is first translated into numbers, the only language machines work with. Each “attention head” then focuses on part of your prompt, and the tool uses what the heads pick up as clues to clean up the noisy image.

AI tools do the cleaning in multiple steps! At each step, the tool looks at the current image, estimates which parts are noise, and removes a little of it, using your prompt as a guide. It then repeats this on the slightly cleaner image. The tool doesn’t judge for itself when the image is “good enough”; it typically runs for a set number of steps (often a few dozen) and hands you whatever it has at the end. This is an iterative process and, depending on the model you use, may take from seconds to several minutes.
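The step-by-step cleaning can be sketched as a simple loop. This is a toy illustration under loud assumptions: a real image generator uses a trained neural network to estimate the noise from your prompt, while here the hypothetical `guess` grid stands in for “what the model thinks you asked for,” and the loop runs for a fixed number of steps. The names `toy_denoise_step` and `generate` are made up for this example.

```python
import random


def toy_denoise_step(image, guess, step, total_steps):
    """Nudge every pixel part of the way toward the model's guess."""
    fraction = 1.0 / (total_steps - step)  # later steps close more of the gap
    return [
        [pixel + fraction * (target - pixel) for pixel, target in zip(row, guess_row)]
        for row, guess_row in zip(image, guess)
    ]


def generate(guess, total_steps=10, seed=0):
    """Start from pure noise and clean it up over a fixed number of steps."""
    rng = random.Random(seed)
    # The "noisy canvas": random values with the same shape as the picture.
    image = [[rng.gauss(0.0, 1.0) for _ in row] for row in guess]
    for step in range(total_steps):
        image = toy_denoise_step(image, guess, step, total_steps)
    return image


# Hypothetical stand-in for the picture the prompt describes.
guess = [
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
]

result = generate(guess, total_steps=10)
```

In this toy version the loop always lands on `guess`. A real tool has no ready-made answer to copy, which is why the same prompt can produce a different image each time.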

Now, you can use AI tools to generate an image of a “cat.” That’s a simple prompt, but what AI generates as an image of a cat may not end up being close to what you had in mind! You see, cats come in so many different breeds, shapes, sizes, colors, postures, and attitudes! Also, you can draw a cat in hundreds of different styles. So when you ask AI to generate an image of a “cat,” the chances are the cat it generates out of all that noise won’t be similar to what you had in mind. Unless the tool you are using knows you very well and can guess what you have in mind.

Well, one thing’s for sure: AI tools have studied more than 650 million images and their descriptions to learn how to generate images you may like! So you never know! Sometimes a very vague prompt like “a cat” may result in a cat image you LOVE. So sometimes it’s good to be vague with your AI collaborator.

But if you have something very specific in mind, here are our top 5 tips to better communicate with your AI tools using natural language: