As of June 2024, if you are a paid ChatGPT subscriber, you have access to OpenAI’s image creation tool, DALL·E.
The question that comes to mind is: “How different is image prompting from text prompting?”
The answer is a resounding “a lot”, but don’t get intimidated.
As we argued in our blog post “AI prompting: Hands-On Experimentation vs Structured Frameworks”, the best way to master image prompting is to dive in and start experimenting.
You can create quality pictures even with broad prompts, such as “ChatGPT, create a picture to accompany my blog post on AI Ethics.”
ChatGPT will create an image, deciding on all the details, large and small, by itself. Below is the image I received:
This is where the learning starts, because ChatGPT lets you see the exact image prompt it used. In this case, the prompt was:
A thought-provoking image depicting the concept of AI ethics. The scene shows a humanoid robot holding a scale in one hand and a glowing, transparent brain in the other. The background includes a blend of futuristic cityscape and natural elements, symbolizing the balance between technology and nature. The robot appears thoughtful, emphasizing the moral considerations in AI development. Subtle binary code and circuit patterns are integrated into the background, adding a sense of digital complexity. The overall atmosphere is balanced and reflective, highlighting the importance of ethical decision-making in AI.
The level of detail may initially blow your mind, but don’t overthink it. Every time ChatGPT generates an image, spend five seconds reviewing the image prompt it used. Without noticing, you’ll start writing more robust prompts right from the get-go.
If you are an experienced photographer, you can create more realistic images of people or scenery by specifying particular details such as the shot type, the lighting, and the camera settings.
Example prompt:
Capture a panoramic shot of a group of hikers at the peak of a snowy mountain at dawn. The subjects should be dressed in colorful winter gear, expressing triumph and camaraderie. The scene should include a clear, expansive view of the mountain range in the background. Use golden hour lighting to enhance the warmth and glow of the early morning sun. Camera settings should include a wide-angle lens, low ISO for crisp detail, and a slow shutter speed to capture the serene atmosphere and subtle movements of the hikers.
In my initial attempt, the above prompt resulted in the following image:
In the image above, there is a flying camera and a floating stick next to the central hiker’s left arm. ChatGPT offers an Edit option that lets you select the area to change and describe the edit you want. The result of the edit is shown below.
Quite impressive, don’t you think?
While spontaneous experimentation is invaluable, having a few structured approaches can guide your initial attempts and enhance your results. Here are some frameworks designed specifically for image prompting:
The INSPIRE framework helps in crafting detailed and precise prompts that lead to high-quality images.
• Idea: Define the core concept or theme of your image.
• Nuance: Specify subtle details and elements.
• Style: Describe the visual style (e.g., realism, abstraction).
• Parameters: Set technical specifications like resolution and dimensions.
• Intent: Convey the purpose or mood you want the image to evoke.
• Refinement: Include any fine-tuning details or adjustments.
• Execution: Provide instructions on how the AI should generate the image.
Example INSPIRE Prompt:
“Create a surreal landscape of floating islands in the sky. The islands should have lush greenery and waterfalls cascading into the clouds. The style should be whimsical and dreamlike, with vibrant colors. Ensure the resolution is high for printing purposes. The image should evoke a sense of wonder and fantasy.”
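If you like to keep your prompts reusable, the INSPIRE checklist can be sketched as a small Python helper that joins one sentence per component into a single prompt. This is only an illustration: the class name InspirePrompt and its render method are hypothetical names made up for this post, not part of ChatGPT or DALL·E, and the example prompt above happens to fill only five of the seven components.

```python
from dataclasses import dataclass, fields


@dataclass
class InspirePrompt:
    """Hypothetical helper: one field per INSPIRE component, in order."""
    idea: str
    nuance: str = ""
    style: str = ""
    parameters: str = ""
    intent: str = ""
    refinement: str = ""
    execution: str = ""

    def render(self) -> str:
        # Join the non-empty components in INSPIRE order, separated by spaces.
        parts = [getattr(self, f.name).strip() for f in fields(self)]
        return " ".join(p for p in parts if p)


# The example INSPIRE prompt, split by component.
islands = InspirePrompt(
    idea="Create a surreal landscape of floating islands in the sky.",
    nuance="The islands should have lush greenery and waterfalls cascading into the clouds.",
    style="The style should be whimsical and dreamlike, with vibrant colors.",
    parameters="Ensure the resolution is high for printing purposes.",
    intent="The image should evoke a sense of wonder and fantasy.",
)

print(islands.render())
```

Leaving a component empty simply drops it from the prompt, so you can start with just the idea and add components as you refine.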
The VISUAL framework is designed to guide the AI in creating images that are visually coherent and aligned with your creative vision.
• Vision: Define the overall vision or theme.
• Input: Provide specific elements and details to include.
• Style: Describe the desired artistic style.
• Utility: Indicate the image’s intended use or function.
• Aesthetic: Outline the aesthetic qualities and mood.
• Layout: Specify the arrangement and composition of elements.
Example VISUAL Prompt:
“Design a concept art piece for a futuristic cityscape at night. Include towering skyscrapers with neon lights, flying vehicles, and bustling streets. The style should be cyberpunk, with a focus on dark, moody colors and high contrast. The image will be used as a background for a sci-fi video game, so it needs to convey a sense of energy and intrigue.”
The CREATE framework focuses on ensuring that every aspect of the image is well-defined and cohesive.
• Concept: Outline the main idea or theme.
• Requirements: Specify any necessary technical details.
• Elements: List key components to be included.
• Atmosphere: Describe the mood or tone.
• Technique: Define the artistic approach or style.
• Execution: Provide detailed instructions for generating the image.
Example CREATE Prompt:
“Generate a detailed illustration of an enchanted forest. The forest should have glowing plants, mythical creatures, and a misty atmosphere. The style should be a mix of realism and fantasy, with intricate details. The image needs to be suitable for a book cover, with a sense of mystery and magic.”
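The three frameworks can also double as checklists. The following is a minimal sketch that reports which components a draft has not covered yet. The dictionary FRAMEWORKS and the function missing_components are hypothetical names invented for this illustration, and the draft is a hand-made breakdown of the enchanted forest idea.

```python
# Component names copied from the three framework lists in this post.
FRAMEWORKS = {
    "INSPIRE": ["Idea", "Nuance", "Style", "Parameters", "Intent", "Refinement", "Execution"],
    "VISUAL": ["Vision", "Input", "Style", "Utility", "Aesthetic", "Layout"],
    "CREATE": ["Concept", "Requirements", "Elements", "Atmosphere", "Technique", "Execution"],
}


def missing_components(framework, draft):
    """Return the framework components that the draft omits or leaves blank."""
    return [name for name in FRAMEWORKS[framework] if not draft.get(name, "").strip()]


# A partial CREATE draft (illustrative breakdown, not an official format).
draft = {
    "Concept": "An enchanted forest",
    "Elements": "Glowing plants and mythical creatures",
    "Atmosphere": "Misty, mysterious, and magical",
    "Technique": "A mix of realism and fantasy with intricate details",
}

print(missing_components("CREATE", draft))  # ['Requirements', 'Execution']
```

A missing component is not an error. As the broad AI Ethics prompt showed, ChatGPT fills in whatever you leave out, so treat the list as a reminder of what you could still specify.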
Don’t let initial uncertainties hold you back. Start with simple ideas, be clear and specific, and refine your prompts as you go. The more you interact with ChatGPT, the more skilled you’ll become at utilizing its features to bring your artistic visions to life.
Happy creating!