FILE: POST_0085.SYS

Dall-E 101

Wall-E's long-lost cousin

AUTHOR: Dukeroo

DATE: January 12, 2026


DALL·E: Image Generation Guide for Beginners

Estimated time needed: 30 minutes

In this lab, you will learn how to use the DALL·E series of models to generate images from text.

NOTE: Due to environment limitations, currently only the prompt can be modified; edit and variation features are not available at this time.

Table of Contents

  1. Introduction
  2. What does this guided project do?
  3. Objectives
  4. Background
    1. What is a large language model (LLM)?
    2. What is multimodal?
    3. What is Dall-E 2?
    4. What is Dall-E 3?
  5. Setup
    1. Installing required libraries
  6. Image generation
    1. Which model should I use?
    2. Generations
    3. Edits (Dall-E 2 only)
    4. Variations (Dall-E 2 only)
  7. Practice
    1. Use Dall-E 2 to generate an image of a cat
    2. Use Dall-E 3 to generate an image of a cat
  8. Compare the two images
  9. Exercises
    1. Exercise 1: Generate another image using Dall-E 2
    2. Exercise 2: Generate another image using Dall-E 3
  10. Authors
  11. Contributors

Introduction

Have you ever wanted to create stunning images from just a text description? With AI image generation, this is now possible. In this project, we'll explore the DALL·E series, OpenAI's text-to-image models that create realistic images and art from natural language descriptions.

What does this guided project do?

This project demonstrates how to use the DALL·E series to generate images by:

  1. Crafting effective text prompts that describe the images you want to create
  2. Using the OpenAI API to generate images from these prompts
  3. Exploring different parameters to control the image generation process

For example, you could input a prompt like "a serene landscape with mountains reflected in a lake at sunset" and DALL·E will create a beautiful image matching your description. This technology can be used for creating illustrations, concept art, design mockups, or simply exploring your creative ideas in visual form.
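
One way to keep prompts consistent is to assemble them from a few named parts. The sketch below is only an illustration: `build_prompt` and its `subject`, `setting`, and `style` arguments are hypothetical names, not part of the OpenAI API, and the API itself simply receives the final string.

```python
def build_prompt(subject, setting="", style=""):
    """Join the non-empty parts of a description into one prompt string."""
    parts = [subject.strip()]
    if setting.strip():
        parts.append(setting.strip())
    prompt = " ".join(parts)
    # An optional style hint is appended after a comma.
    if style.strip():
        prompt = f"{prompt}, {style.strip()}"
    return prompt


prompt = build_prompt(
    subject="a serene landscape with mountains reflected in a lake",
    setting="at sunset",
)
print(prompt)
# a serene landscape with mountains reflected in a lake at sunset
```

Splitting the description this way makes it easy to vary one part, such as the style, while keeping the rest of the prompt fixed.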

Objectives

After completing this lab you will be able to:

  • Craft effective prompts for DALL·E image generation
  • Use the OpenAI API to generate images from text descriptions
  • Understand the parameters that control image generation
  • Save and use the generated images in your projects

Background

What is a large language model (LLM)?

Large language models are a category of foundation models trained on immense amounts of data, which makes them capable of understanding and generating natural language and other types of content to perform a wide range of tasks.

What is multimodal?

Multimodal refers to the capability of a model to process and understand multiple types of data simultaneously. In the context of AI and machine learning, multimodal models can handle and integrate information from various modalities, such as:

  • Text to image: Generating images based on textual descriptions, as seen in models like DALL·E.
  • Text to audio: Converting written text into spoken words or sounds.
  • Image to text: Analyzing images to produce descriptive text or captions.
  • Audio to text: Transcribing spoken language into written text.
  • Video analysis: Understanding and interpreting video content by integrating visual and audio data.

This capability allows for a more comprehensive and nuanced understanding and generation of content. For example, a multimodal AI system can take a text description and generate a corresponding image or analyze an image and generate descriptive text. This integration of different types of data enables more sophisticated applications and interactions, such as creating detailed visual content from textual descriptions or providing richer context in conversational AI systems.

What is Dall-E 2?

DALL·E 2 is an AI system developed by OpenAI that can create realistic images and art from text descriptions. Released in 2022, it's the successor to the original DALL·E model. Key features include:

  • Text-to-image generation: Creates images from natural language descriptions
  • Image editing: Allows for modifications to existing images
  • Variations: Can generate multiple variations of an image
  • Resolution control: Creates images at different resolutions
  • Proprietary technology: Unlike open-source models, DALL·E 2 is a commercial product from OpenAI

What is Dall-E 3?

DALL·E 3, released by OpenAI in 2023, is the more advanced of the two models covered in this lab. It represents a significant improvement over DALL·E 2 with the following key features:

  • Higher quality images: Produces more detailed, accurate, and visually stunning images
  • Better text understanding: More accurately interprets complex prompts and follows specific instructions
  • Text rendering: Significantly improved ability to generate readable text within images
  • Artistic styles: Better at capturing specific artistic styles and visual aesthetics
  • Safety features: Enhanced content filtering and safety measures
  • Integration with ChatGPT: Can be accessed directly through ChatGPT to refine prompts interactively

DALL·E 3 can generate images at higher resolutions and with greater fidelity to the user's intent, making it particularly valuable for professional creative work and detailed visualizations.

Setup

For this lab, you will be using the following libraries:

  • openai: the official Python library for working with the OpenAI API.

Installing required libraries

 


%pip install openai==1.64.0 | tail -n 1
Successfully installed jiter-0.12.0 openai-1.64.0
Note: you may need to restart the kernel to use updated packages.
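
After installing (and restarting the kernel if prompted), a quick sanity check can confirm what the notebook will actually import. This is a minimal sketch: `installed_version` is a hypothetical helper name, and it assumes the client is configured through the `OPENAI_API_KEY` environment variable, which is what `OpenAI()` reads by default. A hosted lab environment may supply credentials differently.

```python
import os
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("openai version:", installed_version("openai"))
print("OPENAI_API_KEY set:", "OPENAI_API_KEY" in os.environ)
```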

Image generation

The Images API has three endpoints with different abilities:

  • Generations: Images from scratch, based on a text prompt
  • Edits: Edited versions of images, where the model replaces some areas of a pre-existing image, based on a new text prompt
  • Variations: Variations of an existing image

Which model should I use?

DALL·E 2 and DALL·E 3 have different options for generating images.

Model    | Available endpoints            | Best for
DALL·E 2 | Generations, edits, variations | More options (edits and variations), more control in prompting, more requests at once
DALL·E 3 | Only image generations         | Higher quality, larger sizes for generated images
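
The endpoint column of this comparison can be written down as a small lookup. The sketch below only restates the table; `ENDPOINTS_BY_MODEL` and `models_supporting` are hypothetical names for illustration, not part of the `openai` library.

```python
# Endpoint availability per model, copied from the comparison table.
ENDPOINTS_BY_MODEL = {
    "dall-e-2": {"generations", "edits", "variations"},
    "dall-e-3": {"generations"},
}


def models_supporting(endpoint):
    """Return the model names that offer the given endpoint, sorted by name."""
    return sorted(
        model for model, endpoints in ENDPOINTS_BY_MODEL.items()
        if endpoint in endpoints
    )


print(models_supporting("generations"))  # ['dall-e-2', 'dall-e-3']
print(models_supporting("edits"))        # ['dall-e-2']
```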

Generations

The image generations endpoint allows you to create an original image with a text prompt. Each image can be returned either as a URL or Base64 data, using the response_format parameter. The default output is URL, and each URL expires after an hour.
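
Because the URL is short-lived, requesting Base64 data is one way to keep an image. The sketch below shows only the decoding step and uses stand-in bytes rather than a real response; `save_b64_image` is a hypothetical helper name. With a real call, the assumption is that you pass `response_format="b64_json"` to `client.images.generate` and read the string from `response.data[0].b64_json`.

```python
import base64
import tempfile
from pathlib import Path


def save_b64_image(b64_data, path):
    """Decode a Base64 string and write the raw bytes to `path`."""
    out = Path(path)
    out.write_bytes(base64.b64decode(b64_data))
    return out


# Round-trip demo with stand-in bytes instead of a real API response.
fake_b64 = base64.b64encode(b"not a real image").decode("ascii")
demo_path = save_b64_image(fake_b64, Path(tempfile.mkdtemp()) / "demo.png")
print(demo_path.read_bytes())  # b'not a real image'

# With a real response (requires API access):
# response = client.images.generate(
#     model="dall-e-2",
#     prompt="a white siamese cat",
#     response_format="b64_json",
# )
# save_b64_image(response.data[0].b64_json, "cat.png")
```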

Size and quality options

Square, standard quality images are the fastest to generate. The default size of generated images is 1024x1024 pixels, but each model has different options:

Model    | Size options (pixels)           | Quality options                                             | Requests you can make
DALL·E 2 | 256x256, 512x512, 1024x1024     | Only standard                                               | Up to 10 images at a time, with the n parameter
DALL·E 3 | 1024x1024, 1024x1792, 1792x1024 | Defaults to standard; set quality: "hd" for enhanced detail | Only 1 at a time, but can request more by making parallel requests
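
The limits in this table can be checked before any request is sent. The sketch below is a hypothetical helper (`plan_requests` is not an OpenAI function) that validates the size and quality for a model and splits the number of images into request-sized batches: one request of up to 10 for DALL·E 2, and one request per image for DALL·E 3. It only builds keyword arguments; sending them, in parallel or otherwise, is left to the caller.

```python
# Limits copied from the table above.
SIZES = {
    "dall-e-2": {"256x256", "512x512", "1024x1024"},
    "dall-e-3": {"1024x1024", "1024x1792", "1792x1024"},
}
QUALITIES = {"dall-e-2": {"standard"}, "dall-e-3": {"standard", "hd"}}
MAX_N = {"dall-e-2": 10, "dall-e-3": 1}


def plan_requests(model, prompt, size="1024x1024", quality="standard", n=1):
    """Return a list of keyword-argument dicts, one per API request."""
    if model not in SIZES:
        raise ValueError(f"unknown model: {model}")
    if size not in SIZES[model]:
        raise ValueError(f"{model} does not offer size {size}")
    if quality not in QUALITIES[model]:
        raise ValueError(f"{model} does not offer quality {quality}")
    if n < 1:
        raise ValueError("n must be at least 1")

    base = {"model": model, "prompt": prompt, "size": size}
    # Only DALL·E 3 has a quality choice, so only send it for that model.
    if model == "dall-e-3":
        base["quality"] = quality

    requests = []
    remaining = n
    while remaining > 0:
        batch = min(remaining, MAX_N[model])
        requests.append({**base, "n": batch})
        remaining -= batch
    return requests


for kwargs in plan_requests("dall-e-3", "a white siamese cat", quality="hd", n=3):
    print(kwargs)
```

Each dict can then be passed as `client.images.generate(**kwargs)`.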

Edits (Dall-E 2 only)

The image edits endpoint lets you edit or extend an image by uploading an image and mask indicating which areas should be replaced. This process is also known as inpainting.

The transparent areas of the mask indicate where the image should be edited, and the prompt should describe the full new image, not just the erased area.

[Figure: example input image, mask, and edited output]

Prompt: a sunlit indoor lounge area with a pool containing a flamingo

The uploaded image and mask must both be square PNG images, less than 4 MB in size, and have the same dimensions as each other. The non-transparent areas of the mask aren't used to generate the output, so they don't need to match the original image as they do in our example.
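
These input rules can be checked locally before uploading anything. The sketch below reads the width and height straight from the PNG header using only the standard library. `png_dimensions` and `check_edit_inputs` are hypothetical helper names, and treating "4 MB" as 4 × 1024 × 1024 bytes is an assumption. It does not inspect the mask's transparency.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_BYTES = 4 * 1024 * 1024  # assumption: "4 MB" means 4 MiB


def png_dimensions(data):
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian 32-bit integers right after "IHDR".
    return struct.unpack(">II", data[16:24])


def check_edit_inputs(image_bytes, mask_bytes):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    sizes = {}
    for name, data in (("image", image_bytes), ("mask", mask_bytes)):
        if len(data) >= MAX_BYTES:
            problems.append(f"{name} is not under 4 MB")
        width, height = png_dimensions(data)
        if width != height:
            problems.append(f"{name} is not square")
        sizes[name] = (width, height)
    if sizes["image"] != sizes["mask"]:
        problems.append("image and mask dimensions differ")
    return problems


# Usage with files on disk:
# with open("image.png", "rb") as f_img, open("mask.png", "rb") as f_mask:
#     print(check_edit_inputs(f_img.read(), f_mask.read()))
```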

Variations (Dall-E 2 only)

The image variations endpoint allows you to generate a variation of a given image.

[Figure: example input image and generated variation]

Similar to the edits endpoint, the input image must be a square PNG image less than 4MB in size.
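
For reference, a variation request with the `openai` Python library might look like the sketch below. It wraps `client.images.create_variation` in a hypothetical helper, `request_variations`, and the file name in the usage comment is a placeholder. As noted at the top of this lab, variations are not available in the lab environment, so treat this as something to try with your own API access.

```python
def request_variations(client, image_path, n=1, size="1024x1024"):
    """Ask DALL·E 2 for `n` variations of the square PNG at `image_path`."""
    with open(image_path, "rb") as image_file:
        return client.images.create_variation(
            model="dall-e-2",
            image=image_file,
            n=n,
            size=size,
        )


# Usage (requires API access):
# from openai import OpenAI
# response = request_variations(OpenAI(), "cat.png", n=2)
# print(response.data[0].url)
```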

Practice

Use Dall-E 2 to generate an image of a cat

Please use the following prompt: "a white siamese cat"

from openai import OpenAI
from IPython import display

client = OpenAI()

response = client.images.generate(
    model="dall-e-2",
    prompt="a white siamese cat",
    size="1024x1024",
    # quality="standard",
    n=1,
)

url = response.data[0].url
display.Image(url=url, width=512)
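
Since each returned URL expires after an hour, it can be worth saving the image right after it is generated. The sketch below uses only the standard library; `download_image` is a hypothetical helper name, and the output file name in the usage comment is a placeholder.

```python
import shutil
import urllib.request
from pathlib import Path


def download_image(url, path):
    """Stream the content at `url` into a local file and return its path."""
    out = Path(path)
    with urllib.request.urlopen(url) as response, out.open("wb") as f:
        shutil.copyfileobj(response, f)
    return out


# Usage, with `url` taken from the response above:
# download_image(url, "cat_dalle2.png")
```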

Use Dall-E 3 to generate an image of a cat

Please use the same prompt: "a white siamese cat"

from openai import OpenAI
from IPython import display

client = OpenAI()

response = client.images.generate(
    model="dall-e-3",
    prompt="a white siamese cat",
    size="1024x1024",
    quality="standard",
    n=1,
)

url = response.data[0].url
display.Image(url=url, width=512)

Compare the two images

In this comparison, the DALL·E 2 image looks more realistic.

Exercises

Exercise 1: Generate another image using Dall-E 2

Please generate another image using DALL·E 2.

Please use the following prompt: "a beautiful lake with a sunset"

# Your code here
response = client.images.generate(
    model="dall-e-2",
    prompt="a beautiful lake with a sunset",
    size="1024x1024",
    # quality="standard",
    n=1,
)

url = response.data[0].url
display.Image(url=url, width=512)

Solution:
from openai import OpenAI
from IPython import display

client = OpenAI()

response = client.images.generate(
    model="dall-e-2",
    prompt="a beautiful lake with a sunset",
    size="1024x1024",
    n=1,
)

url = response.data[0].url
display.Image(url=url, width=512)

Exercise 2: Generate another image using Dall-E 3

Please generate another image using DALL·E 3.

Please use the following prompt: "a beautiful lake with a sunset"

# Your code here
response = client.images.generate(
    model="dall-e-3",
    prompt="a beautiful lake with a sunset",
    size="1024x1024",
    # quality="standard",
    n=1,
)

url = response.data[0].url
display.Image(url=url, width=512)

Solution:
from openai import OpenAI
from IPython import display

client = OpenAI()

response = client.images.generate(
    model="dall-e-3",
    prompt="a beautiful lake with a sunset",
    size="1024x1024",
    quality="standard",
    n=1,
)

url = response.data[0].url
display.Image(url=url, width=512)

Authors

Ricky Shi
Hailey Quach

 
