Relay is the routing layer that sits between your application and the OpenGPU network. Instead of managing clusters, scheduling, and GPU provisioning, you call a single HTTPS endpoint. Relay takes care of provider selection, routing, retries, and failover in the background. You keep your existing stack. Relay slots in as a simple API that your backends, agents, and internal tools can call whenever they need GPU power.
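For concreteness, here is a minimal sketch of what that call can look like from a Python backend. The endpoint URL, auth scheme, and payload fields are illustrative placeholders, not Relay's documented API.

```python
# Minimal sketch of calling a Relay-style HTTPS endpoint from an
# existing backend. The URL, auth scheme, and payload fields are
# illustrative placeholders, not Relay's documented API.
import os

import requests

RELAY_URL = "https://relay.example.com/v1/jobs"   # hypothetical endpoint
API_KEY = os.environ["RELAY_API_KEY"]             # hypothetical auth

payload = {
    "task": "inference",                          # hypothetical fields
    "model": "llama-3-8b-instruct",
    "input": "Summarize the quarterly report in three bullets.",
}

resp = requests.post(
    RELAY_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```

Provider selection, retries, and failover all happen behind that one call; nothing else in the stack changes.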

Relay includes a full fiat billing system, so teams can use decentralized GPUs the same way they use AWS, Google Cloud, or Azure. No tokens, no crypto wallets, no blockchain steps required. Billing runs on monthly invoices and standard payment methods, with clear cost visibility and predictable spend. There is no idle GPU waste: you pay only when jobs run. Because the network carries no centralized overhead, workloads come in 60–80 percent cheaper than the major clouds. This makes Relay drop-in compatible with procurement, finance teams, and enterprise workflows.

Relay focuses on three things that matter in production environments: a simple interface, smart routing, and reliability by default.

Send AI workloads through a single API that works with any backend or framework.
Relay selects the best providers based on performance, memory, and live health (sketched below).
Relay retries or re-routes automatically if a provider slows or fails.
Relay moves real AI workloads across the OpenGPU network, not benchmarks or demos.
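
Relay's actual selection logic is internal, but a toy sketch can illustrate routing on the three signals named above: performance, memory, and live health. Every field name and value here is invented for illustration.

```python
# Toy illustration of ranking providers on performance, memory, and
# live health. Relay's real routing is internal; these fields and the
# ranking rule are invented for illustration only.
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    tokens_per_sec: float   # recent measured throughput
    free_vram_gb: float     # available GPU memory
    healthy: bool           # result of the latest health probe

def pick_provider(providers, required_vram_gb):
    # Filter to healthy providers with enough memory, then take the
    # fastest; a real router would also retry and re-route on failure.
    candidates = [
        p for p in providers
        if p.healthy and p.free_vram_gb >= required_vram_gb
    ]
    return max(candidates, key=lambda p: p.tokens_per_sec, default=None)

providers = [
    Provider("node-a", tokens_per_sec=95.0, free_vram_gb=40.0, healthy=True),
    Provider("node-b", tokens_per_sec=120.0, free_vram_gb=20.0, healthy=True),
    Provider("node-c", tokens_per_sec=80.0, free_vram_gb=44.0, healthy=False),
]
print(pick_provider(providers, required_vram_gb=24))  # -> node-a
```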


Low-latency inference for agents, chatbots, tools, and real products.
Diffusion and rendering workloads managed automatically across providers.
Embeddings, indexing, and long-running tasks routed efficiently (see the submit-and-poll sketch after this list).
Relay keeps agent systems running even when providers churn.
Relay continuously evaluates and routes jobs behind the scenes.
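
For long-running work such as batch embeddings or rendering, a submit-and-poll pattern is a natural client shape, since Relay handles provider churn and re-routing server-side. The /v1/jobs routes, field names, and job states below are hypothetical.

```python
# Sketch of a submit-and-poll flow for a long-running job. The routes,
# field names, and job states are hypothetical; they show the shape of
# the interaction, not Relay's actual API.
import time

import requests

RELAY_URL = "https://relay.example.com/v1/jobs"   # hypothetical endpoint
HEADERS = {"Authorization": "Bearer YOUR_RELAY_API_KEY"}

# Submit the job; Relay picks and manages the providers from here on.
job = requests.post(
    RELAY_URL,
    json={"task": "embeddings", "input": ["doc one", "doc two"]},
    headers=HEADERS,
    timeout=30,
).json()

# Poll until the job settles. Provider churn, retries, and re-routing
# stay on Relay's side of the API.
while True:
    status = requests.get(
        f"{RELAY_URL}/{job['id']}", headers=HEADERS, timeout=30
    ).json()
    if status["state"] in ("succeeded", "failed"):
        break
    time.sleep(2)

print(status)
```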

Relay exposes the network through a simple HTTPS endpoint with fiat billing. On average, pricing already comes in more than 50 percent below centralized services, and many workloads land in the 60–80 percent savings band compared to the major clouds.
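
As a quick worked example of that band, assume an illustrative centralized rate of $1.00 per GPU-hour:

```python
# Worked example of the stated savings band. The $1.00/GPU-hour
# centralized rate is illustrative, not a quoted price.
centralized_rate = 1.00  # USD per GPU-hour
for savings in (0.60, 0.80):
    relay_rate = centralized_rate * (1 - savings)
    print(f"{savings:.0%} savings -> ${relay_rate:.2f}/GPU-hour")
```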

Connect your stack through a single HTTPS endpoint and scale at your own pace.
