This example demonstrates how to deploy a Gradio web application that generates images with Stable Diffusion, running on a Ray cluster managed by KubeRay and served through Ray Serve. Users enter text prompts and receive generated images in response, with inference running on GPU resources.

Prerequisites

  • Kubernetes cluster with GPUs (L40 or above recommended)
  • GPU support installed and configured (e.g., the NVIDIA device plugin or GPU Operator)
  • KubeRay operator installed and configured
  • kubectl configured to access the cluster
  • Hugging Face access token with read permissions for stabilityai/stable-diffusion-3-medium-diffusers
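Before deploying, you can sanity-check the prerequisites. This is a hedged sketch: the `nvidia.com/gpu` resource name is the NVIDIA device plugin default, and the KubeRay operator label and namespace may differ in your installation.

```shell
# Check that nodes advertise allocatable GPUs (assumes the NVIDIA device plugin)
kubectl get nodes -o custom-columns='NAME:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu'

# Check that the KubeRay operator is running (label/namespace may vary by install method)
kubectl get pods -A -l app.kubernetes.io/name=kuberay-operator
```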

Step 1: Create the Hugging Face token secret

kubectl create secret generic hf-secret --from-literal=token=<YOUR_HF_TOKEN>
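To confirm the secret was created without printing the token, a quick check along these lines works (the `token` key name matches the command above):

```shell
kubectl get secret hf-secret -o jsonpath='{.data.token}' >/dev/null \
  && echo "hf-secret present with key 'token'"
```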

Step 2: Deploy

kubectl apply -f https://gist.github.com/karthik-aion/5260d2409b6fad31af60f5f2c8ec1235/raw
This creates:
  • ConfigMap (stable-diffusion-gradio-code): contains the Stable Diffusion Gradio app code
  • RayService (stable-diffusion-gradio): Ray cluster with a head node and GPU worker(s), running the app via Ray Serve

Step 3: Check status

Wait for the RayService to become ready:
kubectl get rayservice stable-diffusion-gradio
Check pod status:
kubectl get pods -l ray.io/cluster=$(kubectl get rayservice stable-diffusion-gradio -o jsonpath='{.status.activeServiceStatus.rayClusterName}')
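Readiness can also be polled in a loop. This is a sketch assuming the RayService reports `Running` in `.status.serviceStatus`, which follows the KubeRay CRD; the exact status field may differ across KubeRay versions, so adjust the jsonpath if needed.

```shell
# Poll until the RayService reports a Running service status (field path assumed)
while true; do
  STATUS=$(kubectl get rayservice stable-diffusion-gradio \
    -o jsonpath='{.status.serviceStatus}' 2>/dev/null)
  echo "serviceStatus: ${STATUS:-<none>}"
  [ "$STATUS" = "Running" ] && break
  sleep 10
done
```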

Step 4: Access the Gradio UI

The service is exposed via NodePort on port 31770:
http://<cluster-id>.groundcontrol-aion.xyz:31770
To find the node port, if needed:
kubectl get svc stable-diffusion-gradio-serve-svc -o jsonpath='{.spec.ports[0].nodePort}'
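Putting the pieces together, the access URL is the cluster's hostname plus the node port. A minimal sketch, where `example-cluster` is a hypothetical placeholder for your actual cluster ID:

```shell
NODE_PORT=31770                # or read it via the jsonpath command above
CLUSTER_ID="example-cluster"   # hypothetical placeholder; substitute your cluster ID
URL="http://${CLUSTER_ID}.groundcontrol-aion.xyz:${NODE_PORT}"
echo "$URL"
```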
To find the cluster ID: “Screenshot Placeholder”
Alternatively, port-forward to access locally:
kubectl port-forward svc/stable-diffusion-gradio-serve-svc 8000:8000
Then open http://localhost:8000.
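To verify the app responds before opening a browser, you can run the port-forward in the background and probe it with curl. A sketch, assuming the service serves HTTP on port 8000 as configured above:

```shell
# Start the port-forward in the background, probe, then clean up
kubectl port-forward svc/stable-diffusion-gradio-serve-svc 8000:8000 &
PF_PID=$!
sleep 3
curl -s -o /dev/null -w 'HTTP %{http_code}\n' http://localhost:8000/
kill "$PF_PID"
```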

Cleanup

kubectl delete -f https://gist.github.com/karthik-aion/5260d2409b6fad31af60f5f2c8ec1235/raw
kubectl delete secret hf-secret