Text-to-image models

21 models · ranked by HuggingFace downloads

stable-diffusion-v1-5

stable-diffusion-v1-5 converts natural language descriptions into images. The model samples from a learned distribution, with output quality influenced by sampling steps and guidance scale.

1,764,267 ↓ · 1,154 ♡
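
The guidance scale mentioned in the stable-diffusion-v1-5 description can be illustrated with classifier-free guidance, the mechanism Stable Diffusion pipelines commonly use to weight the prompt. The sketch below is a minimal pure-Python illustration, not the model's actual implementation; the function name `apply_guidance` and the toy vectors are assumptions made for the example.

```python
from typing import Sequence


def apply_guidance(
    uncond: Sequence[float],
    cond: Sequence[float],
    guidance_scale: float,
) -> list[float]:
    """Blend unconditional and prompt-conditioned noise predictions.

    Classifier-free guidance extrapolates from the unconditional prediction
    toward the conditional one: uncond + scale * (cond - uncond).
    """
    if len(uncond) != len(cond):
        raise ValueError("predictions must have the same length")
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]


# Toy noise predictions (hypothetical values, not real model output).
uncond_pred = [0.0, 1.0, 2.0]
cond_pred = [1.0, 1.0, 0.0]

# A scale of 1.0 reproduces the conditional prediction; larger values push
# further in the direction the prompt suggests.
print(apply_guidance(uncond_pred, cond_pred, 1.0))
print(apply_guidance(uncond_pred, cond_pred, 7.5))
```

In a real pipeline this blend is applied to the denoiser's output at every sampling step, which is why both the step count and the scale affect the final image.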

stable-diffusion-xl-base-1.0

SDXL Base 1.0 is Stability AI's flagship text-to-image diffusion model, operating at 1024x1024 native resolution with a dual-text-encoder architecture (OpenCLIP ViT-bigG and CLIP ViT-L). It generally produces higher-quality images than SD 1.5 and 2.x, especially for complex compositions.

1,417,752 ↓ · 7,832 ♡

FLUX.1-dev

FLUX.1-dev converts natural language descriptions into images. The model samples from a learned distribution, with output quality influenced by sampling steps and guidance scale.

1,104,500 ↓ · 13,286 ♡

Z-Image-Turbo

Z-Image-Turbo converts natural language descriptions into images. The model samples from a learned distribution, with output quality influenced by sampling steps and guidance scale.

935,475 ↓ · 4,840 ♡

dreamshaper-7

dreamshaper-7 is a community fine-tune of Stable Diffusion 1.5 by Lykon, optimized for photorealistic portraits, artistic illustrations, and anime-adjacent styles. Version 7 succeeds the earlier releases, improving skin texture and lighting coherence and reducing NSFW content leakage. It is released under the CreativeML OpenRAIL-M license, which permits commercial use subject to use-based restrictions.

927,805 ↓ · 62 ♡

sd-turbo

sd-turbo generates images from text prompts using a diffusion process. Starting from random noise, it iteratively denoises conditioned on the prompt embedding.

821,422 ↓ · 454 ♡

HunyuanImage-3.0

HunyuanImage-3.0 is Tencent's third-generation text-to-image model, built on a Mixture-of-Experts transformer backbone. Its model card describes a unified autoregressive multimodal framework rather than a conventional diffusion transformer. It targets photorealistic and stylised image generation with improved prompt adherence over the previous HunyuanImage series. The MoE architecture selectively activates expert layers per token, balancing quality and compute.

732,554 ↓ · 1,093 ♡
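
The per-token expert selection described for HunyuanImage-3.0 can be sketched with generic top-k routing. This is an illustrative toy, not Tencent's router: the scalar "experts", the router scores, and the names `route_token` and `softmax` are all assumptions made for the example.

```python
import math
from typing import Callable, Sequence

Expert = Callable[[float], float]


def softmax(logits: Sequence[float]) -> list[float]:
    """Numerically stable softmax over a list of router logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(
    token: float,
    router_logits: Sequence[float],
    experts: Sequence[Expert],
    top_k: int = 2,
) -> tuple[float, list[int]]:
    """Run one token through only its top_k experts.

    Returns the gate-weighted output and the indices of the experts that ran.
    """
    # Pick the k experts with the highest router scores for this token.
    order = sorted(range(len(experts)), key=lambda i: router_logits[i], reverse=True)
    chosen = order[:top_k]
    # Renormalise the gate weights over the chosen experts only.
    gates = softmax([router_logits[i] for i in chosen])
    output = sum(g * experts[i](token) for g, i in zip(gates, chosen))
    return output, chosen


# Four toy "experts" (hypothetical scalar functions standing in for MLP blocks).
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]

# Hypothetical router scores for one token: experts 1 and 3 share the top score.
out, used = route_token(3.0, [0.0, 2.0, -1.0, 2.0], experts, top_k=2)
print(used, out)
```

Only two of the four experts are evaluated for this token, which is the source of the compute saving: total parameter count grows with the number of experts, while per-token cost grows only with `top_k`.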

sdxl-turbo

sdxl-turbo is a text-conditioned diffusion model. Prompt sensitivity is high — small wording changes can produce notably different results.

703,323 ↓ · 2,587 ♡

Qwen-Image-Lightning

Qwen-Image-Lightning is a distilled variant of the Qwen-Image text-to-image model that targets faster generation by cutting the number of sampling steps. The 'Lightning' name refers to this latency optimization, which is achieved through step distillation.

507,738 ↓ · 804 ♡

FLUX.1-schnell

FLUX.1-schnell converts natural language descriptions into images. The model samples from a learned distribution, with output quality influenced by sampling steps and guidance scale.

453,273 ↓ · 5,013 ♡

novaAnimeXL_ilV140

NovaAnimeXL ilV140 is a Stable Diffusion XL fine-tune focused on anime-style image generation. It uses the SDXL pipeline format and targets character illustration in anime aesthetics. The model card provides no training dataset disclosure or license terms beyond the diffusers compatibility tags.

435,160 ↓ · 3 ♡

stable-diffusion-v1-4

stable-diffusion-v1-4 is a text-conditioned diffusion model. Prompt sensitivity is high — small wording changes can produce notably different results.

420,423 ↓ · 7,024 ♡

Realistic_Vision_V5.1_noVAE

Realistic Vision V5.1 is a photorealism-focused Stable Diffusion 1.5 fine-tune that has accumulated substantial community use for portrait and product photography generation. The 'noVAE' variant ships without VAE weights, which reduces checkpoint file size but requires users to supply a separate VAE (typically the SD 1.5 base VAE or Stability AI's fine-tuned vae-ft-mse-840000-ema release). It is designed for integration into AUTOMATIC1111 (A1111), InvokeAI, and ComfyUI workflows.

409,913 ↓ · 251 ♡
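
Supplying a separate VAE to a 'noVAE' checkpoint such as Realistic_Vision_V5.1_noVAE might look like the following with the diffusers library. This is a hedged sketch: the two repository IDs are assumptions not stated on this page, the helper name `load_with_external_vae` is hypothetical, and the heavy imports are kept inside the function so that nothing is downloaded until it is called.

```python
# Assumed Hugging Face repository IDs (not given in the listing above).
DEFAULT_MODEL_ID = "SG161222/Realistic_Vision_V5.1_noVAE"
DEFAULT_VAE_ID = "stabilityai/sd-vae-ft-mse"


def load_with_external_vae(
    model_id: str = DEFAULT_MODEL_ID,
    vae_id: str = DEFAULT_VAE_ID,
):
    """Load an SD 1.5 style checkpoint and attach a separately hosted VAE."""
    # Imported lazily so this module can be imported without torch/diffusers.
    import torch
    from diffusers import AutoencoderKL, StableDiffusionPipeline

    # Load the standalone VAE first.
    vae = AutoencoderKL.from_pretrained(vae_id, torch_dtype=torch.float16)
    # Passing vae= makes the pipeline use this component instead of
    # loading one from the checkpoint repository.
    return StableDiffusionPipeline.from_pretrained(
        model_id, vae=vae, torch_dtype=torch.float16
    )
```

Calling `load_with_external_vae().to("cuda")` downloads several gigabytes of weights and expects a CUDA-capable GPU; the returned pipeline is then called with a prompt in the usual way.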

playground-v2.5-1024px-aesthetic

playground-v2.5-1024px-aesthetic is an open-source text-to-image model available on HuggingFace. Details are sourced from the public model registry.

357,992 ↓ · 765 ♡

sdxl-turbo

sdxl-turbo converts natural language descriptions into images. The model samples from a learned distribution, with output quality influenced by sampling steps and guidance scale.

334,892 ↓ · 3 ♡

stable-diffusion-3.5-medium

Stable Diffusion 3.5 Medium is Stability AI's mid-tier SD3-family variant using a Multimodal Diffusion Transformer (MMDiT) architecture. With fewer parameters than SD3.5-Large, it generates faster while keeping the SD3 family's improved text rendering and prompt adherence over SD2/SDXL. The Stability AI Community License requires a separate enterprise license for commercial use above a revenue threshold.

327,141 ↓ · 961 ♡

dvine82-xl

Dvine82-XL is a Stable Diffusion XL fine-tune from martineux targeting a specific aesthetic style — likely photorealistic portraits or fine-art imagery based on the naming convention. Community SDXL fine-tunes in this naming pattern typically focus on model aesthetics rather than prompt-following improvements.

325,559 ↓ · 0 ♡

stable-diffusion-v1-5

stable-diffusion-v1-5 is a text-conditioned diffusion model. Prompt sensitivity is high — small wording changes can produce notably different results.

320,862 ↓ · 1 ♡

stable-diffusion-xl-1.0-inpainting-0.1

This is an SDXL-based inpainting model published under the diffusers organization on HuggingFace, fine-tuned specifically for masked region infilling at Stable Diffusion XL's 1024px native resolution. Compared with SD 1.5 inpainting models, the SDXL base allows higher-resolution inpaints that blend more naturally with the surrounding image context. The model is loaded with the StableDiffusionXLInpaintPipeline.

307,203 ↓ · 374 ♡
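
A minimal sketch of masked infilling with the StableDiffusionXLInpaintPipeline named in the description above. The repository ID, the prompt handling, and the helper names `centered_box` and `inpaint_center` are assumptions made for illustration; the pipeline call itself needs torch, diffusers, Pillow, and a CUDA GPU, so those imports sit inside the function.

```python
def centered_box(
    width: int, height: int, fraction: float = 0.5
) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a centered box.

    The box spans `fraction` of the image's width and height.
    """
    box_w, box_h = int(width * fraction), int(height * fraction)
    left, top = (width - box_w) // 2, (height - box_h) // 2
    return left, top, left + box_w, top + box_h


def inpaint_center(image_path: str, prompt: str, out_path: str = "inpainted.png") -> None:
    """Repaint the central region of an image according to `prompt`."""
    # Heavy imports are deferred until the function is actually called.
    import torch
    from diffusers import StableDiffusionXLInpaintPipeline
    from PIL import Image, ImageDraw

    image = Image.open(image_path).convert("RGB").resize((1024, 1024))
    # White pixels mark the region to repaint; black pixels are kept.
    mask = Image.new("L", image.size, 0)
    ImageDraw.Draw(mask).rectangle(centered_box(*image.size), fill=255)

    pipe = StableDiffusionXLInpaintPipeline.from_pretrained(
        "diffusers/stable-diffusion-xl-1.0-inpainting-0.1",  # assumed repo ID
        torch_dtype=torch.float16,
        variant="fp16",
    ).to("cuda")
    result = pipe(
        prompt=prompt, image=image, mask_image=mask, num_inference_steps=20
    ).images[0]
    result.save(out_path)


print(centered_box(1024, 1024))
```

The mask convention (white to repaint, black to keep) is the one diffusers inpainting pipelines use; the rectangle here is only a stand-in for a mask drawn or segmented by the user.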

diving-illustrious-real-asian-v50-sdxl

diving-illustrious-real-asian-v50-sdxl is an open-source text-to-image model available on HuggingFace. Details are sourced from the public model registry.

291,003 ↓ · 0 ♡

one-obsession-17-red-sdxl

one-obsession-17-red-sdxl is an open-source text-to-image model available on HuggingFace. Details are sourced from the public model registry.

287,039 ↓ · 3 ♡
