The big picture
Run a model locally → expose it to the web → register the endpoint on Token Router → earn 70% of every request it serves.
Got a GPU that’s idle more than it’s busy? Token Router lets it earn its keep. You run a model on your own hardware, expose it as an OpenAI-compatible endpoint, and register it with us. When the gateway routes traffic your way, you get paid.
The big picture
Run a model locally → expose it to the web → register the endpoint on Token Router → earn 70% of every request it serves.
Stand up an inference server. Pick your platform and follow the in-depth guide. Each one ends with a live, authenticated, publicly reachable endpoint.
Register the instance. In the dashboard, open Instances → Add instance. Give us the model name, your public endpoint URL, and the upstream API key it expects. We encrypt that key at rest — it’s never stored in the clear.
Describe your hardware (optional but smart). Each instance has a hardware and software inventory. Logging your GPU and engine version helps the network understand your capacity.
Go live. Once active, your instance enters rotation. The gateway sends it traffic whenever it’s the least-loaded healthy node for a model someone’s calling.
We’ve designed routing to be forgiving, because home setups aren’t data centers: