Optimizing diffusion models
A list of papers on optimizing text-to-image (T2I) diffusion models: fewer sampling timesteps, architectural compression, and more.
Progressive Distillation for Fast Sampling of Diffusion Models
Paper • 2202.00512 • Published • 1
Note: Introduces progressive distillation, in which a diffusion model is repeatedly distilled into a student that needs fewer timesteps to sample from.
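The "fewer timesteps" idea can be sketched as a halving schedule: each distillation round trains a student that samples in half as many steps as its teacher, and the student then becomes the next teacher. The function name and the step counts below are illustrative assumptions, not values taken from the paper.

```python
def halving_schedule(teacher_steps: int, target_steps: int) -> list[int]:
    """Return the sampling-step count of each successive distilled student.

    Hypothetical helper: it only models the step-count bookkeeping,
    not the distillation training itself.
    """
    if target_steps < 1 or teacher_steps < target_steps:
        raise ValueError("need teacher_steps >= target_steps >= 1")
    schedule = []
    steps = teacher_steps
    while steps > target_steps:
        # Each round halves the number of sampling steps (never below the target).
        steps = max(steps // 2, target_steps)
        schedule.append(steps)
    return schedule


print(halving_schedule(1024, 4))  # [512, 256, 128, 64, 32, 16, 8, 4]
```

Going from 1024 to 4 steps takes eight rounds in this sketch, which is why the number of distillation rounds grows only logarithmically with the speed-up.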
On Distillation of Guided Diffusion Models
Paper • 2210.03142 • Published
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation
Paper • 2309.06380 • Published • 32
Note: Combines the benefits of timestep distillation and faster ODE solver-based sampling. Custom diffusers pipeline: https://github.com/huggingface/diffusers/blob/main/examples/community/instaflow_one_step.py.
Consistency Models
Paper • 2303.01469 • Published • 8
Note: Introduces a new framework for distillation by learning to map any point on a probability flow ordinary differential equation (ODE) trajectory to that trajectory's origin. CMs also allow for 1-4 step sampling. Play with them: https://huggingface.co/docs/diffusers/api/pipelines/consistency_models. Unconditional consistency model training: https://github.com/huggingface/diffusers/tree/main/examples/research_projects/consistency_training.
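The "map any point to its origin" property can be checked on a toy case with a closed-form solution. The sketch below assumes 1-D data drawn from a zero-mean Gaussian and a noise level equal to the time `t`, so the probability flow ODE is `dx/dt = t * x / (SIGMA_DATA**2 + t**2)`. `SIGMA_DATA`, `EPS`, and all function names are assumptions made for illustration. A real consistency model learns this map with a neural network instead of using a formula.

```python
import math

SIGMA_DATA = 0.5  # assumed std of the toy Gaussian data distribution
EPS = 0.002       # small time treated as the trajectory origin


def pf_ode_velocity(x: float, t: float) -> float:
    """dx/dt of the toy probability flow ODE for N(0, SIGMA_DATA^2) data."""
    return t * x / (SIGMA_DATA**2 + t**2)


def consistency_fn(x: float, t: float) -> float:
    """Closed-form map from a point (x, t) to its trajectory origin at EPS."""
    return x * math.sqrt((SIGMA_DATA**2 + EPS**2) / (SIGMA_DATA**2 + t**2))


def integrate_forward(x_origin: float, t_end: float, n_steps: int = 20000) -> float:
    """Follow the ODE from time EPS to t_end with plain Euler steps."""
    x, t = x_origin, EPS
    dt = (t_end - EPS) / n_steps
    for _ in range(n_steps):
        x += pf_ode_velocity(x, t) * dt
        t += dt
    return x


x_origin = 0.3
for t_end in (1.0, 5.0, 10.0):
    x_t = integrate_forward(x_origin, t_end)
    # Every point on the same trajectory should map back close to x_origin.
    print(t_end, consistency_fn(x_t, t_end))
```

Each printed value should be close to 0.3 (up to Euler integration error), no matter how far along the trajectory the point was taken. That is the self-consistency property that makes one-step sampling possible.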
Improved Techniques for Training Consistency Models
Paper • 2310.14189 • Published
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
Paper • 2310.04378 • Published • 19
Note: Extends Consistency Models by operating on the latent space. Also introduces Latent Consistency Fine-tuning for training on custom datasets. Play with them: https://huggingface.co/docs/diffusers/main/en/api/pipelines/latent_consistency_models. Train your own: https://github.com/huggingface/diffusers/tree/main/examples/consistency_distillation.
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module
Paper • 2311.05556 • Published • 82
Note: Builds on top of LCM and acts as a plugin for pre-trained Stable Diffusion models to enable faster few-step inference. Play with them: https://huggingface.co/docs/diffusers/main/en/using-diffusers/inference_with_lcm_lora. Train your own: https://github.com/huggingface/diffusers/tree/main/examples/consistency_distillation.
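The "plugin" behavior comes from the general LoRA mechanism: a low-rank update is added to a frozen base weight, so the same update can be attached to different compatible checkpoints. Below is a minimal sketch of that arithmetic on plain Python lists. The function names, the tiny matrices, and the `scale` value are hypothetical, and this is not the `diffusers` loading API.

```python
def matmul(a: list[list[float]], b: list[list[float]]) -> list[list[float]]:
    """Multiply two small matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight, lora_up, lora_down, scale=1.0):
    """Return weight + scale * (lora_up @ lora_down) without mutating inputs."""
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


base_weight = [[1.0, 0.0], [0.0, 1.0]]  # frozen base-model weight (toy 2x2)
lora_up = [[1.0], [2.0]]                # rank-1 factor, shape 2x1
lora_down = [[3.0, 4.0]]                # rank-1 factor, shape 1x2

print(merge_lora(base_weight, lora_up, lora_down, scale=0.5))
# [[2.5, 2.0], [3.0, 5.0]]
```

The base weight is left untouched, so the update can be removed or rescaled later. That is what makes the adapter behave like a detachable module.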
UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs
Paper • 2311.09257 • Published • 45
Note: Enables one-step sampling through a GAN forward pass without running the reverse process. Community implementation: https://github.com/huggingface/diffusers/pull/6133.
Adversarial Diffusion Distillation
Paper • 2311.17042 • Published • 3
Note: Introduces a way to do GAN-style training to enable few-step inference. Combining GANs and diffusion isn't new; see the paper for references to earlier work. This paper made SD-Turbo and SDXL-Turbo possible. Play with them: https://huggingface.co/docs/diffusers/using-diffusers/sdxl_turbo. A training script is also being added here: https://github.com/huggingface/diffusers/pull/6303.
Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation
Paper • 2403.12015 • Published • 65
Note: Successor of ADD (Adversarial Diffusion Distillation).
On Architectural Compression of Text-to-Image Diffusion Models
Paper • 2305.15798 • Published • 4
Note: Popularly known as "BK-SDM". The entries above mainly focus on distillation to enable few-step inference. Those techniques don't necessarily address architectural compression. BK-SDM models (compatible with diffusers): https://huggingface.co/nota-ai
Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss
Paper • 2401.02677 • Published • 22
Note: Builds on top of BK-SDM. This report gives an overview of what made the SSD-1B and Vega families of architecturally compressed models work so well. Find the models: https://huggingface.co/Segmind
SDXL-Lightning: Progressive Adversarial Diffusion Distillation
Paper • 2402.13929 • Published • 28
Note: Combines progressive and adversarial distillation to achieve a balance between quality and mode coverage. Available checkpoints: https://huggingface.co/ByteDance/SDXL-Lightning (compatible with `diffusers`).
Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis
Paper • 2404.13686 • Published • 28
Note: `diffusers`-compatible implementation: https://hyper-sd.github.io/.
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
Paper • 2404.14507 • Published • 22
Note: `diffusers`-integrated implementation: https://research.nvidia.com/labs/toronto-ai/AlignYourSteps/.
SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions
Paper • 2403.16627 • Published • 20
Token Merging for Fast Stable Diffusion
Paper • 2303.17604 • Published
Note: Gradually merges redundant tokens, thereby accelerating inference. It supports diffusers: https://huggingface.co/docs/diffusers/main/en/optimization/tome.
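To make "merging redundant tokens" concrete, here is a heavily simplified sketch. Tokens are split into two alternating sets, each token in the first set is matched to its most similar token in the second set, and the `r` most similar pairs are averaged together. The function names and toy vectors are assumptions. The sketch also leaves out the unmerging step that restores the original token count after a block, which image generation needs.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def merge_tokens(tokens: list[list[float]], r: int) -> list[list[float]]:
    """Merge the r most redundant tokens; returns a shorter token list."""
    if len(tokens) < 2 or r <= 0:
        return [list(t) for t in tokens]
    src = tokens[0::2]                 # candidates to be merged away
    dst_orig = tokens[1::2]            # tokens that absorb the merged ones
    dst = [list(t) for t in dst_orig]
    counts = [1] * len(dst)            # how many tokens each dst averages
    matches = []
    for i, s in enumerate(src):
        sims = [cosine(s, d) for d in dst_orig]
        j = max(range(len(sims)), key=sims.__getitem__)
        matches.append((sims[j], i, j))
    matches.sort(reverse=True)         # most similar pairs first
    merged = set()
    for _, i, j in matches[:r]:
        counts[j] += 1
        # Running average of all tokens merged into dst[j].
        dst[j] = [d + (s - d) / counts[j] for d, s in zip(dst[j], src[i])]
        merged.add(i)
    kept = [list(s) for i, s in enumerate(src) if i not in merged]
    return kept + dst


tokens = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.7, 0.7]]
print(len(merge_tokens(tokens, r=1)))  # 3
```

With `r=1`, the two near-duplicate tokens `[1.0, 0.0]` and `[1.0, 0.01]` are averaged into one, so later layers process three tokens instead of four.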
DeepCache: Accelerating Diffusion Models for Free
Paper • 2312.00858 • Published • 21
Note: Shows how the UNet features computed in earlier reverse diffusion steps can be largely reused during later steps. This saves computation, thereby accelerating inference. It supports diffusers: https://huggingface.co/docs/diffusers/main/en/optimization/deepcache.
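The feature-reuse idea can be sketched with a toy two-branch "UNet". The cheap shallow branch runs at every step, while the expensive deep branch is recomputed only every `cache_interval` steps and its cached output is reused in between. The class, the arithmetic inside each branch, and the parameter name are illustrative assumptions, not the DeepCache library API.

```python
class ToyUNet:
    """Stand-in for a UNet with a cheap shallow path and a costly deep path."""

    def __init__(self) -> None:
        self.deep_calls = 0

    def shallow(self, x: float) -> float:
        return 0.9 * x  # cheap, always recomputed

    def deep(self, h: float) -> float:
        self.deep_calls += 1  # count how often the expensive path runs
        return 0.1 * h


def sample(x: float, num_steps: int, cache_interval: int) -> tuple[float, int]:
    """Run a toy denoising loop, reusing deep features between refreshes."""
    if cache_interval < 1:
        raise ValueError("cache_interval must be >= 1")
    unet = ToyUNet()
    cached_deep = 0.0
    for step in range(num_steps):
        h = unet.shallow(x)
        if step % cache_interval == 0:
            cached_deep = unet.deep(h)  # refresh the cache on this step
        x = h + cached_deep             # otherwise reuse the stale deep features
    return x, unet.deep_calls


_, calls_cached = sample(1.0, num_steps=50, cache_interval=5)
_, calls_full = sample(1.0, num_steps=50, cache_interval=1)
print(calls_cached, calls_full)  # 10 50
```

In this sketch the deep branch runs 10 times instead of 50. The price is that most steps use slightly stale deep features, which is the quality and speed trade-off that the cache interval controls.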
thibaud/sdxl_dpo_turbo
Text-to-Image • Updated • 487 • 83
Note: What if you could combine two models from the same family? In this example, the SDXL Turbo and SDXL DPO parameters are averaged to obtain a new model, which gets the alignment benefits of DPO and the few-step sampling of Turbo.
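Parameter averaging of two same-architecture checkpoints can be sketched as below. The checkpoints are modeled as plain dictionaries of float lists, and the equal 0.5/0.5 weighting is an assumption, since the note does not state the ratio used. The function and key names are hypothetical.

```python
def average_state_dicts(
    params_a: dict[str, list[float]],
    params_b: dict[str, list[float]],
    weight_a: float = 0.5,
) -> dict[str, list[float]]:
    """Interpolate two checkpoints that share the same parameter names and shapes."""
    if params_a.keys() != params_b.keys():
        raise ValueError("checkpoints must have identical parameter names")
    merged = {}
    for name in params_a:
        a, b = params_a[name], params_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for {name}")
        # Element-wise weighted average of the two parameter tensors.
        merged[name] = [weight_a * x + (1.0 - weight_a) * y for x, y in zip(a, b)]
    return merged


turbo = {"unet.w": [1.0, 3.0]}  # toy stand-in for few-step (Turbo) weights
dpo = {"unet.w": [3.0, 5.0]}    # toy stand-in for preference-tuned (DPO) weights
print(average_state_dicts(turbo, dpo))  # {'unet.w': [2.0, 4.0]}
```

This only makes sense when both checkpoints were fine-tuned from the same base model, so that matching parameter names refer to comparable weights.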
Deci-early-access/DeciDiffusion-v2-0
Text-to-Image • Updated • 5 • 3
Note: DeciDiffusion-v2 is faster than Stable Diffusion v1.5 while producing images of similar quality. It benefits from an improved training recipe. Refer to the model card to learn more.
jasperai/flash-sd3
Text-to-Image • Updated • 672 • 109