
You do not need a $3,000 workstation to generate high-quality AI images. A cloud GPU instance gives you access to the same hardware in minutes, at a fraction of the cost. Here is how to set it up.
Sarah Chen
Solutions Architect · LightYear Cloud
Stable Diffusion, Flux, and the growing ecosystem of open-source image generation models have democratised AI art — but they are GPU-hungry. Running SDXL or Flux.1 at a comfortable speed requires at least 8 GB of VRAM, and generating high-resolution images or using ControlNet pipelines pushes that requirement to 16–24 GB. For most people, that means either a high-end gaming GPU or a cloud alternative.
Cloud GPU instances offer a compelling middle ground: pay for GPU time only when you are actively generating, access hardware that would cost thousands of dollars to buy outright, and scale up to larger GPUs for more demanding workflows without any hardware upgrade cycle.
The primary constraint for image generation is VRAM. More VRAM means you can load larger models, use higher batch sizes, and generate at higher resolutions without running out of memory. Here is a practical guide to GPU selection for common image generation tasks:
| Workload | Min VRAM | Recommended GPU |
|---|---|---|
| SD 1.5 / SDXL (512–1024px) | 8 GB | A16 (16 GB) |
| Flux.1 Dev / Schnell | 12 GB | A16 (16 GB) or A40 |
| SDXL + ControlNet pipeline | 16 GB | A40 (48 GB) |
| High-res batch generation (2048px+) | 24 GB | A40 or L40S |
| Video generation (Wan2.1, CogVideoX) | 40 GB+ | A100 80 GB or L40S |
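Once an instance is up, you can verify the GPU model and VRAM against the table above with `nvidia-smi`, which ships with the NVIDIA driver:

```shell
# Report GPU name, total VRAM, and currently free VRAM.
# memory.free shows your headroom before loading a checkpoint.
nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv
```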
ComfyUI is the most popular node-based interface for Stable Diffusion and compatible models. It offers fine-grained control over the generation pipeline and supports a vast ecosystem of custom nodes for ControlNet, IP-Adapter, video generation, and more. Here is how to get it running on a cloud GPU instance.
Step 1: Deploy a GPU instance. Provision an instance with at least 16 GB of VRAM — an A16 or A40 works well for most ComfyUI workflows. Choose Ubuntu 22.04 and ensure CUDA drivers are installed. LightYear GPU instances come with CUDA pre-configured.
Step 2: Install ComfyUI. Clone the ComfyUI repository from GitHub and install its Python dependencies. The installation takes under five minutes. ComfyUI's built-in manager makes it easy to install additional custom nodes without manual dependency management.
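The steps above can be sketched as follows, assuming a fresh Ubuntu 22.04 instance with git and Python 3.10+ available (the CUDA 12.1 PyTorch wheel index shown is one common choice; match it to your installed driver):

```shell
# Clone ComfyUI and install its dependencies in a virtual environment.
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
python3 -m venv venv && source venv/bin/activate

# Install PyTorch with CUDA support, then ComfyUI's own requirements.
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

# Start the server, bound to localhost only (access it via the SSH tunnel in step 4).
python main.py --listen 127.0.0.1 --port 8188
```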
Step 3: Download your models. Use wget or the Hugging Face CLI to download model checkpoints directly to the instance. SDXL base and refiner are around 6 GB each; Flux.1 Dev is approximately 24 GB. Downloading directly to the instance is much faster than uploading from your local machine, as cloud instances typically have 1–10 Gbps network connections.
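As a sketch using the Hugging Face CLI (the repository and file names shown are the public SDXL base checkpoint; gated models such as Flux.1 Dev additionally require `huggingface-cli login` and accepting the model license):

```shell
# Install the CLI, then pull the SDXL base checkpoint directly onto the
# instance, into the directory ComfyUI reads checkpoints from.
pip install -U "huggingface_hub[cli]"
huggingface-cli download stabilityai/stable-diffusion-xl-base-1.0 \
  sd_xl_base_1.0.safetensors \
  --local-dir ComfyUI/models/checkpoints
```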
Step 4: Expose the ComfyUI port. ComfyUI runs on port 8188 by default. Use SSH port forwarding (`ssh -L 8188:localhost:8188 user@your-instance-ip`) to access the interface securely from your local browser without exposing the port publicly.
Step 5: Generate. Load your workflow, connect your model, and start generating. On an A40, SDXL generates a 1024×1024 image in approximately 3–5 seconds with 20 steps. Flux.1 Schnell produces comparable quality in 4 steps, making batch generation extremely fast.
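Generation can also be driven headlessly over ComfyUI's HTTP API, which is handy for batch jobs. A minimal sketch, assuming `workflow_api.json` was exported from the UI via "Save (API Format)" and the SSH tunnel from step 4 is active:

```shell
# Queue a workflow; ComfyUI returns a prompt_id you can poll at /history.
curl -s -X POST http://localhost:8188/prompt \
  -H "Content-Type: application/json" \
  -d "{\"prompt\": $(cat workflow_api.json)}"
```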
The economics of cloud vs. local GPU depend heavily on usage patterns. If you generate images for several hours every day, a local GPU with 24 GB of VRAM (such as an RTX 4090) will pay for itself within a year compared to equivalent cloud GPU time. But if your usage is intermittent — a few hours per week, or occasional large batch jobs — cloud GPU is almost always more cost-effective.
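As a rough back-of-the-envelope check (all numbers hypothetical: an $1,800 card versus a $1.20/hr cloud GPU):

```shell
# Integer arithmetic in cents to avoid floating point in shell.
local_gpu_dollars=1800
cloud_rate_cents_per_hour=120   # $1.20/hr, hypothetical

break_even_hours=$(( local_gpu_dollars * 100 / cloud_rate_cents_per_hour ))
echo "Break-even at ${break_even_hours} GPU-hours"   # 1500 hours

# At ~4 hours of generation per day, that is roughly one year of use;
# at a few hours per week, the break-even point is many years out.
```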
Cloud GPU also wins on flexibility: you can access 48 GB or 80 GB GPUs for demanding workflows that would be impossible on consumer hardware, then drop back to a smaller instance for everyday generation. There is no hardware to maintain, no driver conflicts, and no waiting for a new GPU generation to become available.
One important consideration when using cloud GPU instances for image generation is persistence. Instance storage is ephemeral on many platforms — if you destroy the instance, your models and outputs go with it. The recommended approach is to attach a block storage volume to your instance and store all model checkpoints and generated images there. Block storage persists independently of the instance, so you can detach it, destroy the GPU instance, and re-attach it to a new instance when you are ready to continue.
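A typical session looks like the following sketch, assuming the volume appears as `/dev/vdb` (check `lsblk`; device names vary by platform) and has already been formatted once:

```shell
# Mount the persistent volume.
sudo mkdir -p /mnt/models
sudo mount /dev/vdb /mnt/models

# Replace ComfyUI's default checkpoint directory with a symlink to the
# volume so models survive instance teardown. (Move anything already in
# the default directory onto the volume first.)
mkdir -p /mnt/models/checkpoints
rmdir ComfyUI/models/checkpoints 2>/dev/null || true
ln -sfn /mnt/models/checkpoints ComfyUI/models/checkpoints
```

ComfyUI also supports an `extra_model_paths.yaml` file for pointing at external model directories, which avoids symlinks entirely.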
LightYear's Block Storage add-on integrates directly with GPU instances, making it straightforward to maintain a persistent model library across sessions while only paying for GPU time when you are actively generating.
On-demand NVIDIA GPU servers billed by the hour. No contracts, no minimum spend. Spin up in under 60 seconds.