This example demonstrates how to deploy a Gradio web application that generates images using Stable Diffusion, running on a Ray cluster managed by Ray Serve. The application allows users to input text prompts and receive generated images in response, leveraging GPU resources for efficient inference.
Prerequisites
- Kubernetes cluster with GPUs (L40 or above recommended)
- GPU Operators installed and configured (e.g., NVIDIA device plugin)
- KubeRay operator installed and configured
- `kubectl` configured to access the cluster
- Hugging Face access token with read permissions for `stabilityai/stable-diffusion-3-medium-diffusers`
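One way to check the GPU prerequisite is to confirm that nodes advertise the `nvidia.com/gpu` resource, which the NVIDIA device plugin registers on each GPU node. The following is a minimal sketch, not part of the original example; the helper name `gpu_nodes` is illustrative, and it expects the parsed output of `kubectl get nodes -o json`.

```python
def gpu_nodes(node_list: dict) -> dict:
    """Map node name -> allocatable GPU count for nodes exposing nvidia.com/gpu.

    `node_list` is the parsed JSON from `kubectl get nodes -o json`.
    """
    result = {}
    for node in node_list.get("items", []):
        allocatable = node.get("status", {}).get("allocatable", {})
        # Kubernetes reports extended resources as strings, e.g. "1".
        count = int(allocatable.get("nvidia.com/gpu", "0"))
        if count > 0:
            result[node["metadata"]["name"]] = count
    return result
```

An empty result suggests the device plugin is not running or the nodes have no GPUs; it does not say anything about the GPU model.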
Step 1: Create the Hugging Face token secret
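As a rough sketch of this step, the snippet below builds a Secret manifest as JSON, which `kubectl apply -f -` accepts just like YAML. The secret name `hf-token`, the key `HF_TOKEN`, and the `default` namespace are assumptions for illustration; use whatever names the RayService manifest actually references.

```python
import json


def hf_token_secret(token: str, name: str = "hf-token", namespace: str = "default") -> dict:
    """Build a Kubernetes Secret manifest holding the Hugging Face token.

    The name and the HF_TOKEN key are hypothetical; match them to your manifest.
    """
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        # stringData takes plain text; the API server stores it base64-encoded in .data.
        "stringData": {"HF_TOKEN": token},
    }


def to_manifest(token: str) -> str:
    """Serialize the Secret as JSON for piping into `kubectl apply -f -`."""
    return json.dumps(hf_token_secret(token), indent=2)
```

Printing `to_manifest(...)` with the token read from an environment variable and piping the result to `kubectl apply -f -` keeps the token out of any file on disk.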
Step 2: Deploy
- ConfigMap (`stable-diffusion-gradio-code`): contains the Stable Diffusion Gradio app code
- RayService (`stable-diffusion-gradio`): Ray cluster with a head node and GPU worker(s), running the app via Ray Serve
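To give a sense of what the app code in the ConfigMap might look like, here is a minimal sketch. It is not the code shipped with this example. It assumes Ray Serve's `GradioServer` integration, the `StableDiffusion3Pipeline` class from `diffusers`, one GPU per replica, and a Hugging Face token exposed to the worker as the `HF_TOKEN` environment variable (which `huggingface_hub` reads automatically). The function names and the step count are illustrative.

```python
MODEL_ID = "stabilityai/stable-diffusion-3-medium-diffusers"


def build_demo():
    """Create the Gradio UI; runs inside the Serve replica, where the GPU is available."""
    # Heavy imports are deferred so they only happen on the replica.
    import gradio as gr
    import torch
    from diffusers import StableDiffusion3Pipeline

    # Gated model: the download relies on HF_TOKEN being set in the environment.
    pipe = StableDiffusion3Pipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")

    def generate(prompt: str):
        # Returns a PIL image, which gr.Image can display directly.
        return pipe(prompt, num_inference_steps=28).images[0]

    return gr.Interface(
        fn=generate,
        inputs=gr.Textbox(label="Prompt"),
        outputs=gr.Image(label="Generated image"),
    )


def build_app(args: dict):
    """Ray Serve application builder; can be referenced as an import path in the Serve config."""
    from ray.serve.gradio_integrations import GradioServer

    # Reserve one GPU for the replica that hosts the pipeline (assumed sizing).
    return GradioServer.options(ray_actor_options={"num_gpus": 1}).bind(build_demo)
```

Passing `build_demo` as a builder, rather than a ready-made interface, means the pipeline is loaded on the GPU worker instead of on the node that submits the application.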
