GPU Cloud · 7 min read · April 1, 2026

How to Run Stable Diffusion on a Cloud GPU (Without a Powerful Local Machine)

You do not need a $3,000 workstation to generate high-quality AI images. A cloud GPU instance gives you access to the same hardware in minutes, at a fraction of the cost. Here is how to set it up.


Sarah Chen

Solutions Architect · LightYear Cloud

Stable Diffusion, Flux, and the growing ecosystem of open-source image generation models have democratised AI art — but they are GPU-hungry. Running SDXL or Flux.1 at a comfortable speed requires at least 8 GB of VRAM, and generating high-resolution images or using ControlNet pipelines pushes that requirement to 16–24 GB. For most people, that means either a high-end gaming GPU or a cloud alternative.

Cloud GPU instances offer a compelling middle ground: pay for GPU time only when you are actively generating, access hardware that would cost thousands of dollars to buy outright, and scale up to larger GPUs for more demanding workflows without any hardware upgrade cycle.

Choosing the Right GPU for Image Generation

The primary constraint for image generation is VRAM. More VRAM means you can load larger models, use higher batch sizes, and generate at higher resolutions without running out of memory. Here is a practical guide to GPU selection for common image generation tasks:

| Workload | Min VRAM | Recommended GPU |
| --- | --- | --- |
| SD 1.5 / SDXL (512–1024px) | 8 GB | A16 (16 GB) |
| Flux.1 Dev / Schnell | 12 GB | A16 (16 GB) or A40 |
| SDXL + ControlNet pipeline | 16 GB | A40 (48 GB) |
| High-res batch generation (2048px+) | 24 GB | A40 or L40S |
| Video generation (Wan2.1, CogVideoX) | 40 GB+ | A100 80 GB or L40S |
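The table reads as a simple lookup: given the VRAM a workload needs, pick the smallest tier that fits. A minimal sketch in Python (the tier names and capacities mirror the table above; the function name is illustrative, and your provider's lineup may differ):

```python
# GPU tiers and VRAM capacities as listed in the table above,
# ordered smallest to largest.
GPU_TIERS = [
    ("A16", 16),        # 16 GB
    ("A40", 48),        # 48 GB
    ("L40S", 48),       # 48 GB
    ("A100 80 GB", 80), # 80 GB
]

def pick_gpu(min_vram_gb: int) -> str:
    """Return the smallest listed GPU tier with enough VRAM."""
    for name, vram in GPU_TIERS:
        if vram >= min_vram_gb:
            return name
    raise ValueError(f"no listed tier has {min_vram_gb} GB of VRAM")
```

For example, `pick_gpu(12)` returns `"A16"` for a Flux.1 workload, while `pick_gpu(24)` steps up to `"A40"` for high-res batch generation.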

Setting Up ComfyUI on a Cloud GPU Instance

ComfyUI is the most popular node-based interface for Stable Diffusion and compatible models. It offers fine-grained control over the generation pipeline and supports a vast ecosystem of custom nodes for ControlNet, IP-Adapter, video generation, and more. Here is how to get it running on a cloud GPU instance.

Step 1: Deploy a GPU instance. Provision an instance with at least 16 GB of VRAM — an A16 or A40 works well for most ComfyUI workflows. Choose Ubuntu 22.04 and ensure CUDA drivers are installed. LightYear GPU instances come with CUDA pre-configured.
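Once the instance is up, it is worth confirming the driver is actually visible before installing anything. A quick check (nvidia-smi ships with the NVIDIA driver, so this fails on an instance without one):

```shell
# Confirm the GPU and driver are visible. Output lists the GPU model,
# total VRAM, and driver version; an error here means the CUDA driver
# is missing or the instance has no GPU attached.
nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv
```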

Step 2: Install ComfyUI. Clone the ComfyUI repository from GitHub and install its Python dependencies. The installation takes under five minutes. ComfyUI's built-in manager makes it easy to install additional custom nodes without manual dependency management.
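The step above amounts to a git clone and a pip install. A sketch, assuming a fresh Ubuntu 22.04 instance with Python 3 and git available:

```shell
# Clone ComfyUI and install its Python dependencies inside a
# virtual environment (keeps the system Python clean).
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt       # installs torch etc.; takes a few minutes

# Start the server, bound to localhost only (see Step 4 for access).
python main.py --listen 127.0.0.1 --port 8188
```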

Cloud GPU AI image generation platform showing GPU instance selection and pricing

Step 3: Download your models. Use wget or the Hugging Face CLI to download model checkpoints directly to the instance. SDXL base and refiner are around 6 GB each; Flux.1 Dev is approximately 24 GB. Downloading directly to the instance is much faster than uploading from your local machine, as cloud instances typically have 1–10 Gbps network connections.
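For example, fetching SDXL base straight into ComfyUI's checkpoint directory (the URL follows Hugging Face's current repository layout and may change; gated models such as Flux.1 Dev additionally require a Hugging Face access token):

```shell
# Download SDXL base directly to the instance — far faster than
# uploading ~6 GB from your local machine.
cd ComfyUI/models/checkpoints
wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors

# Alternatively, with the Hugging Face CLI (pip install -U huggingface_hub):
# huggingface-cli download stabilityai/stable-diffusion-xl-base-1.0 \
#     sd_xl_base_1.0.safetensors --local-dir .
```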

Step 4: Expose the ComfyUI port. ComfyUI runs on port 8188 by default. Use SSH port forwarding (ssh -L 8188:localhost:8188 user@your-instance-ip) to access the interface securely from your local browser without exposing the port publicly.
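In full, the tunnel looks like this (the username and IP are placeholders for your instance's; leave ComfyUI itself bound to localhost on the server):

```shell
# Forward local port 8188 to port 8188 on the instance over SSH.
# While this session stays open, http://localhost:8188 in your local
# browser reaches the remote ComfyUI interface — nothing is exposed
# to the public internet.
ssh -L 8188:localhost:8188 ubuntu@203.0.113.10
```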

Step 5: Generate. Load your workflow, connect your model, and start generating. On an A40, SDXL generates a 1024×1024 image in approximately 3–5 seconds with 20 steps. Flux.1 Schnell produces comparable quality in 4 steps, making batch generation extremely fast.

Cloud GPU vs. Local GPU for Image Generation

The economics of cloud vs. local GPU depend heavily on usage patterns. If you generate images for several hours every day, a local GPU with 24 GB of VRAM (such as an RTX 4090) will pay for itself within a year compared to equivalent cloud GPU time. But if your usage is intermittent — a few hours per week, or occasional large batch jobs — cloud GPU is almost always more cost-effective.

Cloud GPU also wins on flexibility: you can access 48 GB or 80 GB GPUs for demanding workflows that would be impossible on consumer hardware, then drop back to a smaller instance for everyday generation. There is no hardware to maintain, no driver conflicts, and no waiting for a new GPU generation to become available.

Saving Your Work Between Sessions

One important consideration when using cloud GPU instances for image generation is persistence. Instance storage is ephemeral on many platforms — if you destroy the instance, your models and outputs go with it. The recommended approach is to attach a block storage volume to your instance and store all model checkpoints and generated images there. Block storage persists independently of the instance, so you can detach it, destroy the GPU instance, and re-attach it to a new instance when you are ready to continue.
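On a typical Linux instance the attached volume appears as a block device. A sketch of the first-time setup (the device name /dev/vdb and mount point are examples; both vary by provider):

```shell
# One-time setup: format the new volume (DESTROYS any existing data),
# then mount it for model checkpoints and outputs.
sudo mkfs.ext4 /dev/vdb           # only on a brand-new, empty volume
sudo mkdir -p /mnt/models
sudo mount /dev/vdb /mnt/models

# On every later instance: skip mkfs, just re-attach and mount —
# your model library and outputs are back immediately.
```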

LightYear's Block Storage add-on integrates directly with GPU instances, making it straightforward to maintain a persistent model library across sessions while only paying for GPU time when you are actively generating.

GPU Cloud — NVIDIA A16, A40, A100, L40S

Deploy a GPU instance on LightYear

On-demand NVIDIA GPU servers billed by the hour. No contracts, no minimum spend. Spin up in under 60 seconds.
