Beginner

Project Setup

Before writing any code, you need to understand the architecture of an AI image generation app, decide between cloud APIs and local models, and set up your development environment. This lesson covers all three.

What You Will Build

By the end of this project, you will have a fully working web application that:

  • Accepts text prompts from users and generates images using Stable Diffusion
  • Enhances prompts automatically using an LLM for better results
  • Displays generated images in a responsive gallery with download support
  • Supports advanced features like image-to-image, inpainting, and upscaling
  • Deploys to production with Docker, rate limiting, and cost controls

Architecture Overview

The application follows a clean three-tier architecture:

+------------------+     +------------------+     +-------------------+
|   Frontend       |     |   Backend        |     |   AI Services     |
|   (HTML/CSS/JS)  |---->|   (FastAPI)      |---->|   (Stability AI / |
|                  |     |                  |     |    Replicate)     |
|  - Prompt input  |     |  - API routes    |     |  - Text-to-image  |
|  - Gallery view  |     |  - Prompt enhance|     |  - Img-to-img     |
|  - History panel |     |  - Image storage |     |  - Inpainting     |
|  - Downloads     |     |  - Rate limiting |     |  - Upscaling      |
+------------------+     +------------------+     +-------------------+
                                |
                          +-----v------+
                          |  Storage   |
                          |  (Local /  |
                          |   S3/CDN)  |
                          +------------+

The frontend sends prompts to the FastAPI backend. The backend enhances the prompt if requested, forwards it to the image generation API, saves the result, and returns the image URL to the frontend.
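That flow can be sketched with stubbed services. Everything below is an illustrative placeholder (the function names, fake image bytes, and hard-coded URL are not the real implementation, which comes in later lessons):

```python
# Sketch of the backend request flow with stubbed services (no real API calls).

def enhance_prompt(prompt: str) -> str:
    # Placeholder for the LLM enhancement step (built in a later lesson).
    return prompt + ", highly detailed, dramatic lighting"

def generate_image(prompt: str) -> bytes:
    # Placeholder for the Stability AI / Replicate call.
    return b"fake-image-bytes"

def save_image(data: bytes) -> str:
    # Placeholder for storage; returns the URL the frontend will display.
    return "/static/images/abc123.png"

def handle_generate(prompt: str, enhance: bool = True) -> dict:
    # The full pipeline: optionally enhance, then generate, store, and respond.
    if enhance:
        prompt = enhance_prompt(prompt)
    image = generate_image(prompt)
    url = save_image(image)
    return {"prompt": prompt, "url": url}
```

The real version wraps this same sequence in an async FastAPI route; the shape of the pipeline stays the same.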

Stable Diffusion API vs Local Models

You have two main options for running image generation. Here is how they compare:

Factor              Cloud API (Stability AI / Replicate)   Local (ComfyUI / Diffusers)
Setup time          5 minutes                              1-2 hours
GPU required        No                                     Yes (8GB+ VRAM)
Cost per image      $0.002-$0.01                           Free (after hardware)
Generation speed    2-5 seconds                            5-30 seconds (depends on GPU)
Model variety       Latest models available                Any model you download
Scalability         Scales automatically                   Limited to your hardware
Best for            Production apps, teams                 Experimentation, privacy
💡
This project uses cloud APIs (Stability AI and Replicate) so you can follow along without a GPU. The code is structured so you can swap in a local model later if you prefer.
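One way to keep that swap easy is to code against a small provider interface rather than a specific SDK. The sketch below is a hypothetical shape for that abstraction (these class names are not part of the project code):

```python
# Illustrative provider interface: routes depend on ImageProvider, so a local
# Diffusers backend can replace a cloud API without touching route code.
from typing import Protocol

class ImageProvider(Protocol):
    def generate(self, prompt: str) -> bytes: ...

class CloudProvider:
    """Would wrap the Stability AI or Replicate client."""
    def generate(self, prompt: str) -> bytes:
        return b"cloud-image"  # stub

class LocalProvider:
    """Would wrap a local diffusers pipeline running on your own GPU."""
    def generate(self, prompt: str) -> bytes:
        return b"local-image"  # stub

def make_provider(kind: str) -> ImageProvider:
    # A config value (e.g. from .env) decides which backend to use.
    return CloudProvider() if kind == "cloud" else LocalProvider()
```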

Tech Stack

Here is every tool you will use in this project:

  • Python 3.10+ — Primary backend language
  • FastAPI — High-performance async web framework for the API
  • Stability AI SDK / Replicate Python client — Image generation APIs
  • OpenAI Python client — For LLM-powered prompt enhancement
  • Pillow (PIL) — Image processing, resizing, format conversion
  • HTML / CSS / JavaScript — Frontend (no framework needed)
  • Docker — Containerization for deployment
  • SQLite — Lightweight database for prompt history and metadata

Setting Up Your Environment

Step 1: Create the Project Directory

mkdir ai-image-generator
cd ai-image-generator
mkdir -p static/images templates

Step 2: Set Up a Virtual Environment

python -m venv venv

# On macOS/Linux:
source venv/bin/activate

# On Windows:
venv\Scripts\activate

Step 3: Install Dependencies

Create a requirements.txt file:

# requirements.txt
fastapi==0.110.0
uvicorn[standard]==0.27.1
python-multipart==0.0.9
stability-sdk==0.8.5
replicate==0.25.1
openai==1.14.0
Pillow==10.2.0
python-dotenv==1.0.1
aiofiles==23.2.1
jinja2==3.1.3

Install everything:

pip install -r requirements.txt

Step 4: Configure API Keys

Create a .env file in the project root. You will need at least one image generation API key:

# .env
STABILITY_API_KEY=your-stability-ai-key-here
REPLICATE_API_TOKEN=your-replicate-token-here
OPENAI_API_KEY=your-openai-key-here
⚠️
Never commit your .env file. Add it to your .gitignore immediately. API keys in version control are a security risk and can result in unexpected charges.
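A minimal .gitignore covering this layout might look like the following (the exact patterns are a suggestion; ignoring generated images keeps the repository small):

```
# .gitignore
.env
venv/
__pycache__/
*.pyc
static/images/*
```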

Get your API keys from:

  • Stability AI: https://platform.stability.ai/account/keys — Free tier includes 25 credits
  • Replicate: https://replicate.com/account/api-tokens — Pay per prediction, starts free
  • OpenAI: https://platform.openai.com/api-keys — For prompt enhancement (GPT-3.5 is cheap)
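Behind the scenes, load_dotenv simply reads KEY=VALUE lines from .env into the process environment so os.getenv can see them. A stdlib-only sketch of the idea (load_env_file is an illustrative helper, not part of python-dotenv or the project):

```python
# Simplified illustration of what load_dotenv does: parse KEY=VALUE lines
# and put them into os.environ without overwriting existing variables.
import os

def load_env_file(text: str) -> None:
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip())

load_env_file("STABILITY_API_KEY=sk-test\n# a comment\n")
```

The real library also handles quoting, variable expansion, and file discovery, which is why the project uses python-dotenv rather than a hand-rolled parser.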

Step 5: Create the Project Skeleton

Create a minimal main.py to verify everything works:

# main.py
from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles
from fastapi.templating import Jinja2Templates
from dotenv import load_dotenv
import os

load_dotenv()

app = FastAPI(title="AI Image Generator")
app.mount("/static", StaticFiles(directory="static"), name="static")
templates = Jinja2Templates(directory="templates")


@app.get("/health")
async def health_check():
    return {
        "status": "ok",
        "stability_key": bool(os.getenv("STABILITY_API_KEY")),
        "replicate_key": bool(os.getenv("REPLICATE_API_TOKEN")),
        "openai_key": bool(os.getenv("OPENAI_API_KEY")),
    }

Step 6: Test the Setup

uvicorn main:app --reload

Visit http://localhost:8000/health in your browser. You should see a JSON response confirming your API keys are loaded:

{
  "status": "ok",
  "stability_key": true,
  "replicate_key": true,
  "openai_key": true
}

Project File Structure

By the end of this project, your directory will look like this:

ai-image-generator/
├── main.py                 # FastAPI application entry point
├── routers/
│   ├── generate.py         # Image generation endpoints
│   └── gallery.py          # Gallery and history endpoints
├── services/
│   ├── image_service.py    # Image generation logic
│   ├── prompt_service.py   # Prompt enhancement logic
│   └── storage_service.py  # Image storage and retrieval
├── templates/
│   └── index.html          # Main web interface
├── static/
│   ├── css/
│   │   └── style.css       # Application styles
│   ├── js/
│   │   └── app.js          # Frontend JavaScript
│   └── images/             # Generated images stored here
├── requirements.txt        # Python dependencies
├── Dockerfile              # Container configuration
├── docker-compose.yml      # Multi-service setup
├── .env                    # API keys (not committed)
└── .gitignore              # Git ignore rules
📌
Checkpoint: You should now have a running FastAPI server at localhost:8000 with your API keys loaded. The /health endpoint confirms everything is connected. In the next lesson, you will build the core image generation API.