Contribute Compute

Got a GPU that’s idle more than it’s busy? Token Router lets it earn its keep. You run a model on your own hardware, expose it as an OpenAI-compatible endpoint, and register it with us. When the gateway routes traffic your way, you get paid.

The big picture

Run a model locally → expose it to the web → register the endpoint on Token Router → earn 70% of every request it serves.

What you need

A machine that can serve an LLM at 5 tokens/second or faster (our minimum quality bar). A consumer GPU or a recent Apple Silicon Mac is plenty.
An inference engine that exposes an OpenAI-compatible API — we have full guides for the three popular ones below.
A way to put that endpoint on the public internet. We recommend a free Cloudflare Tunnel (the OS guides walk you through it).
A Token Router account (sign in with GitHub).

The flow

Stand up an inference server. Pick your platform and follow the in-depth guide. Each one ends with a live, authenticated, publicly reachable endpoint.
Register the instance. In the dashboard, open Instances → Add instance. Give us the model name, your public endpoint URL, and the upstream API key it expects. We encrypt that key at rest — it’s never stored in the clear.
Describe your hardware (optional but smart). Each instance has a hardware and software inventory. Logging your GPU and engine version helps the network understand your capacity.
Go live. Once active, your instance enters rotation. The gateway sends it traffic whenever it’s the least-loaded healthy node for a model someone’s calling.

Pick your platform

macOS — oMLX Apple Silicon, unified memory, MLX-native speed.

Linux — vLLM NVIDIA GPUs, high throughput, the server workhorse.

Windows — llama.cpp NVIDIA GPUs on Windows 10/11, GGUF models.

What happens when things go wrong

We’ve designed routing to be forgiving, because home setups aren’t data centers:

Bad user input isn’t your fault. If a consumer sends a malformed request and your upstream returns a 4xx, that’s a neutral outcome — it doesn’t dent your reputation. Only timeouts, rate-limits, and 5xx errors count as failures.
Hiccups don’t park you for hours. Three failures in a minute takes you offline, but after just 15 seconds you get a probe request. Pass it and you’re fully back in rotation. A brief blip stays brief.

Next steps

See exactly how the money works in Earn Credits and The Fair Ecosystem.
Learn how the network builds trust in your node over time: Peer Validation.