OpenGPU ❤️ RELAY

One HTTPS endpoint for decentralized GPU power

Relay gives you a simple interface to run AI workloads on the OpenGPU network. You send a request over HTTPS. Relay finds the right GPUs, runs the job, and returns the result.

What Relay does

Relay is the routing layer that sits between your application and the OpenGPU network. Instead of managing clusters, scheduling, and GPU provisioning, you call a single HTTPS endpoint. Relay takes care of provider selection, routing, retries, and failover in the background. You keep your existing stack. Relay slots in as a simple API that your backends, agents, and internal tools can call whenever they need GPU power.
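As a rough sketch, a call from a Python backend might look like the snippet below. The URL, header, and payload field names are placeholders rather than the documented Relay schema; check the Docs for the real API.

import requests

# Placeholder endpoint and payload shape -- see the Relay docs for the real schema.
RELAY_URL = "https://relay.example.com/v1/jobs"

response = requests.post(
    RELAY_URL,
    headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder credential
    json={
        "model": "llama-3-8b-instruct",  # which model to run (assumed field name)
        "input": "Summarize this support ticket for the on-call engineer.",
        "params": {"max_tokens": 256},   # job parameters (assumed field name)
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["output"])  # assumed response field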


Enterprise billing that works like any cloud

Relay includes a full fiat billing system, so teams can use decentralized GPUs the same way they use AWS, Google Cloud, or Azure. No tokens, no crypto wallets, no blockchain steps required. Monthly invoices and standard payment methods. Cost visibility and predictable billing. No idle GPU waste: you only pay when jobs run. 60%–80% cheaper than centralized clouds due to zero overhead. This makes Relay drop-in compatible with procurement, finance teams, and enterprise workflows.


How Relay is designed

Relay focuses on three things that matter in production environments: a simple interface, smart routing, and reliability by default.

One HTTPS endpoint (INTERFACE)

Send AI workloads through a single API. Works with any backend or framework.

Global GPU mesh (ROUTING)

Relay selects the best providers based on performance, memory, and live health.

Automatic failover (RELIABILITY)

Relay retries or re-routes automatically if a provider slows or fails.
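To make the routing and failover idea concrete, here is a purely illustrative Python sketch of the pattern: pick a healthy provider with enough memory, run the job, and re-route on failure. It is not Relay's implementation; every provider, field, and threshold in it is invented.

import random

# Illustrative only: "route, monitor, re-route on failure" in principle.
providers = [
    {"name": "gpu-a", "healthy": True,  "free_vram_gb": 48},
    {"name": "gpu-b", "healthy": True,  "free_vram_gb": 24},
    {"name": "gpu-c", "healthy": False, "free_vram_gb": 80},
]

def pick_provider(min_vram_gb):
    # Keep only live providers with enough memory, then choose among them.
    live = [p for p in providers if p["healthy"] and p["free_vram_gb"] >= min_vram_gb]
    return random.choice(live) if live else None

def execute_on(provider, job):
    # Stand-in for real execution so the sketch runs end to end;
    # fail randomly to exercise the re-route path.
    if random.random() < 0.3:
        raise TimeoutError(provider["name"])
    return {"output": "...", "ran_on": provider["name"]}

def run_with_failover(job, attempts=3):
    for _ in range(attempts):
        provider = pick_provider(job["min_vram_gb"])
        if provider is None:
            break
        try:
            return execute_on(provider, job)
        except TimeoutError:
            provider["healthy"] = False  # mark it out and re-route to the next candidate
    raise RuntimeError("no provider could complete the job")

print(run_with_failover({"min_vram_gb": 24}))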

What you can run through Relay

Relay moves real AI workloads across the OpenGPU network, not benchmarks or demos.


LLM inference

Low-latency inference for agents, chatbots, tools, and real products.

Image and video generation

Diffusion and rendering workloads managed automatically across providers.

Batch and offline jobs

Embeddings, indexing, and long-running tasks routed efficiently.

Agents and automation

Relay keeps agent systems running even when providers churn.
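As an illustration of how these workload types can share the same call pattern, the hypothetical payloads below would be posted to the same endpoint as the earlier sketch; only the model, input, and params change. All field names and model identifiers are placeholders.

# Hypothetical payloads -- one endpoint, different workload types.
embedding_job = {
    "model": "bge-large",   # placeholder embedding model
    "input": ["doc one", "doc two", "doc three"],
    "params": {"batch": True},
}

image_job = {
    "model": "sdxl",        # placeholder diffusion model
    "input": "isometric render of a data center at dusk",
    "params": {"steps": 30, "width": 1024, "height": 1024},
}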

How Relay works in practice

Relay continuously evaluates and routes jobs behind the scenes; the sketch after this list shows the same flow from the caller's side.

  • You send a request. Relay receives model + inputs + params.
  • Relay evaluates the job. It checks memory, type, duration.
  • Matched to providers. Relay selects one or more GPU providers.
  • Execution + monitoring. Relay tracks progress and timeouts.
  • Results returned. You get output + run metadata.
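From the caller's side, that lifecycle might look roughly like the sketch below. The submit and status URLs, the polling pattern, and the response fields are assumptions for illustration; the real API may be synchronous or shaped differently.

import time
import requests

BASE = "https://relay.example.com/v1"               # placeholder base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}  # placeholder credential

# 1. Send a request: model + inputs + params.
job = requests.post(f"{BASE}/jobs", headers=HEADERS, timeout=30, json={
    "model": "llama-3-8b-instruct",
    "input": "Classify these support tickets by severity.",
    "params": {"max_tokens": 64},
}).json()

# 2-4. Relay evaluates the job, matches providers, and monitors execution;
#      the caller only polls until the job settles (assumed status endpoint and fields).
while True:
    status = requests.get(f"{BASE}/jobs/{job['id']}", headers=HEADERS, timeout=30).json()
    if status["state"] in ("completed", "failed"):
        break
    time.sleep(2)

# 5. Results returned: output plus run metadata (illustrative fields).
print(status["output"])
print(status["metadata"])  # e.g. provider used, runtime, retry count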

OpenGPU Relay Pricing

Relay exposes the network through a simple HTTPS endpoint with fiat billing. Pricing already averages more than 50 percent below centralized services, and many workloads land in the 60–80 percent savings band compared to major clouds.


Ready to try Relay?

Connect your stack through a single HTTPS endpoint and scale at your own pace.
