Stable Diffusion

  • Deep learning text-to-image model: generates an image from a text prompt
  • Released in 2022
  • Based on diffusion techniques
  • Frozen CLIP ViT-L/14 text encoder

Installation

# macOS dependencies
brew install cmake protobuf rust python@3.10 git wget

# Clone repo
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui

# Install and run
./stable-diffusion-webui/webui.sh
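
Before running webui.sh, it can help to confirm that the Homebrew packages above are actually on the PATH. The following is a minimal sketch, not part of the webui repo: missing_tools is a hypothetical helper, and the binary names (protoc for protobuf, cargo for rust, python3.10 for the Python formula) are assumed to be what those formulae install.

```shell
# Print the name of every command in the argument list that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage (prints nothing when everything is installed):
# missing_tools cmake protoc cargo python3.10 git wget
```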

Configuration

  • Launch configuration file: webui-user.sh
./webui.sh \
  --share \
  --disable-nan-check \
  --no-half \
  --api \
  --vae-path=<path> \
  --no-half-vae
# Alternative: put the same flags in COMMANDLINE_ARGS (in webui-user.sh) instead of passing them on the command line
export COMMANDLINE_ARGS=""
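
As an illustration, a filled-in webui-user.sh might look like the sketch below. The flag descriptions reflect my understanding of the webui command-line options, and the VAE path is a hypothetical example rather than a value from these notes. --share is left out on purpose, since it publishes the UI on a public URL.

```shell
# webui-user.sh (sketch): webui.sh sources this file at startup.
COMMANDLINE_ARGS="--api"                                   # enable the HTTP API
COMMANDLINE_ARGS="$COMMANDLINE_ARGS --no-half"             # keep the model in 32-bit floats
COMMANDLINE_ARGS="$COMMANDLINE_ARGS --no-half-vae"         # keep the VAE in 32-bit floats
COMMANDLINE_ARGS="$COMMANDLINE_ARGS --disable-nan-check"   # skip the NaN check on generated output
# Hypothetical VAE location; replace it with your own file.
COMMANDLINE_ARGS="$COMMANDLINE_ARGS --vae-path=models/VAE/Counterfeit-V2.5.vae.pt"
export COMMANDLINE_ARGS
```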

Models

Checkpoints

  • Stable Diffusion models are known as checkpoint models
  • Usually distributed as .safetensors files
# Copy your own checkpoint models
cp "Counterfeit-V2.5.safetensors" ./models/Stable-diffusion

VAE

  • A VAE (Variational Autoencoder) is an additional model that improves the quality of the final image. It ships as a .vae.pt file
# Copy VAE
cp "Counterfeit-V2.5.vae.pt" ./models/VAE
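
The two copy commands above follow a simple rule: the file extension decides the target directory. A small sketch of that rule, assuming only the two extensions mentioned in these notes; model_dest and install_model are hypothetical helper names:

```shell
# Print the webui subdirectory a model file belongs in, based on its extension.
model_dest() {
  case "$1" in
    *.vae.pt)      printf '%s\n' "models/VAE" ;;
    *.safetensors) printf '%s\n' "models/Stable-diffusion" ;;
    *)             printf 'unknown model type: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Copy a model file into the matching directory.
# $1: model file, $2: webui root (defaults to the current directory)
install_model() {
  local dest
  dest="$(model_dest "$1")" || return 1
  cp "$1" "${2:-.}/$dest/"
}

# Usage:
# install_model "Counterfeit-V2.5.vae.pt" ./stable-diffusion-webui
```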

Embeddings

Parameters

txt2img

  • Prompt
      • Tags and common keywords for building prompts: https://safebooru.org/
  • Negative prompt
      • Keywords for things you want to avoid in the image
  • Sampling method
      • The sampler algorithm used to produce the image
  • Batch count
      • Number of batches to generate, one after another (no impact on performance or memory use)
  • Batch size
      • Number of images generated in a single batch (faster overall, but uses more VRAM)
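
The same parameters can be sent through the API described in the API section. The sketch below builds a request body from them. The field names sampler_name, n_iter (batch count) and batch_size are my understanding of the txt2img schema and should be checked against the /docs page the server exposes when started with --api. txt2img_payload and total_images are hypothetical helpers.

```shell
# Build a txt2img JSON payload from the parameters described above.
# $1 prompt, $2 negative prompt, $3 sampler, $4 batch count, $5 batch size
# Values are inserted verbatim, so they must not contain double quotes
# or backslashes.
txt2img_payload() {
  printf '{"prompt": "%s", "negative_prompt": "%s", "sampler_name": "%s", "n_iter": %d, "batch_size": %d}\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Total images produced by one request: batch count x batch size.
total_images() {
  echo $(( $1 * $2 ))
}

# Usage: 2 batches of 4 images each yields 8 images.
# txt2img_payload "puppy dog" "sad" "Euler a" 2 4
# total_images 2 4
```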

img2img

  • Upload a picture to be used as a base
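
Outside the UI, the base picture can be passed to the img2img API endpoint as a base64 string. A rough sketch, assuming the init_images and denoising_strength field names from the webui API schema; img2img_payload is a hypothetical helper and 0.75 is only an example default:

```shell
# Build an img2img JSON payload with the base picture embedded as base64.
# $1 image file, $2 prompt (no double quotes or backslashes),
# $3 denoising strength (optional, defaults to 0.75)
img2img_payload() {
  local image_b64
  # GNU base64 wraps its output, so strip newlines to get a single string.
  image_b64="$(base64 < "$1" | tr -d '\n')"
  printf '{"init_images": ["%s"], "prompt": "%s", "denoising_strength": %s}\n' \
    "$image_b64" "$2" "${3:-0.75}"
}

# Usage: pipe the payload to curl so large images do not hit argument limits.
# img2img_payload base.png "puppy dog" |
#   curl -X POST http://127.0.0.1:7860/sdapi/v1/img2img \
#     -H 'Content-Type: application/json' -d @-
```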

Extensions

LoRA

ControlNet

API

  • Requires the --api flag
# txt2img
curl -X POST http://127.0.0.1:7860/sdapi/v1/txt2img \
  -H 'Content-Type: application/json' \
  -d '{
        "prompt": "puppy dog",
        "negative_prompt": "sad",
        "steps": 5
      }'
  • The response contains the generated images as a list of base64-encoded strings
// response
{
  "images": ["...", "..."]
}
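
To turn the response into image files, each entry of "images" has to be base64-decoded. A minimal sketch, assuming jq is installed, that the response was saved to a file, and that the server returns PNG data (which I believe is the default output format); decode_image and save_images are hypothetical helpers:

```shell
# Decode one base64 string (read from stdin) into a file.
# $1: output path
decode_image() {
  base64 --decode > "$1"
}

# Save every image in a response file as image-0.png, image-1.png, ...
# $1: file holding the JSON response
save_images() {
  local i=0 image
  jq -r '.images[]' "$1" | while IFS= read -r image; do
    printf '%s' "$image" | decode_image "image-$i.png"
    i=$((i + 1))
  done
}

# Usage:
# curl ... > response.json
# save_images response.json
```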