Developer API

Token Router exposes two API surfaces from one base host (https://beta.token-router.org):

  • /v1/* — the OpenAI-compatible inference API. Use it with any OpenAI SDK.
  • /api/* — the management API for your account, keys, and provider instances.

Inference and utility endpoints authenticate with a bearer token:

Authorization: Bearer vk_live_<prefix>_<secret>

Mint keys in the dashboard under Keys. The full secret is shown once at creation — store it somewhere safe. Dashboard/management routes that act on your account are protected by your signed-in session instead.
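As a minimal sketch, the header can be built in Python as shown here. The helper name `auth_headers` is hypothetical, and the character rules in the pattern are an assumption: the page only documents the overall `vk_live_<prefix>_<secret>` shape.

```python
import re

# Key shape taken from the header example: vk_live_<prefix>_<secret>.
# ASSUMPTION: <prefix> has no underscores or whitespace and <secret> has no
# whitespace. The page does not specify the allowed characters.
KEY_PATTERN = re.compile(r"^vk_live_[^_\s]+_\S+$")


def auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header for a bearer key (hypothetical helper)."""
    key = api_key.strip()
    if not KEY_PATTERN.match(key):
        raise ValueError("expected a key shaped like vk_live_<prefix>_<secret>")
    return {"Authorization": f"Bearer {key}"}
```

Checking the shape locally only catches copy-paste mistakes early; whether a key is actually valid is decided by the server.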


GET /v1/models lists the models the network is actively serving, in OpenAI’s model-list format.

curl https://beta.token-router.org/v1/models \
  -H "Authorization: Bearer vk_live_your_key_here"
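The same call can be sketched in Python with only the standard library. The function names are hypothetical, and the parsing assumes the response follows OpenAI's model-list shape (`{"object": "list", "data": [{"id": ...}, ...]}`), as the description says it does.

```python
import json
import urllib.request

BASE_URL = "https://beta.token-router.org/v1"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET /v1/models request."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style model list payload."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(api_key: str) -> list[str]:
    """Send the request and return the ids of the models currently served."""
    request = build_models_request(api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return model_ids(json.load(response))
```

Calling `list_models("vk_live_your_key_here")` performs the network request; the two smaller helpers can be exercised offline.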

POST /v1/chat/completions serves OpenAI-compatible chat completions. Pass a model id from /v1/models and the usual messages array; standard parameters such as temperature, max_tokens, and stream are supported.

curl https://beta.token-router.org/v1/chat/completions \
  -H "Authorization: Bearer vk_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.6-27b",
    "messages": [
      {"role": "system", "content": "You are concise."},
      {"role": "user", "content": "What is Token Router?"}
    ],
    "temperature": 0.7
  }'
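A rough Python equivalent of the curl call, using only the standard library. `build_chat_request` and `first_reply` are hypothetical helper names, and `first_reply` assumes a non-streaming response in OpenAI's chat-completion shape (`choices[0].message.content`).

```python
import json
import urllib.request

CHAT_URL = "https://beta.token-router.org/v1/chat/completions"


def build_chat_request(api_key: str, model: str, messages: list[dict],
                       **params) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/chat/completions request.

    Extra keyword arguments such as temperature or max_tokens are copied
    into the JSON body unchanged.
    """
    body = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def first_reply(payload: dict) -> str:
    """Return the assistant text from an OpenAI-style, non-streaming response."""
    return payload["choices"][0]["message"]["content"]
```

To send it, pass the request to `urllib.request.urlopen`, parse the body with `json.load`, and hand the result to `first_reply`. An OpenAI SDK pointed at the `/v1` base URL would replace all of this.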

The /api/* management routes power the dashboard. Account and resource routes use your signed-in browser session; two utility routes accept your bearer token.

Utility routes (bearer token):

| Method & path | What it does |
| --- | --- |
| GET /api/whoami | Confirms your bearer key resolves at the edge. |
| GET /api/jobs/:id | Polls an async chat completion queued from /v1/chat/completions. |

Account and keys (signed-in session):

| Method & path | What it does |
| --- | --- |
| GET /api/me | Your account and balance. |
| GET /api/keys | List your API keys. |
| POST /api/keys | Mint a new vk_live_… key. |
| DELETE /api/keys/:id | Revoke a key. |

Provider instances (signed-in session):

| Method & path | What it does |
| --- | --- |
| GET /api/instances | List your registered instances. |
| POST /api/instances | Register a new model-serving instance. |
| GET /api/instances/:id | Fetch one instance. |
| PATCH /api/instances/:id | Update an instance. |
| DELETE /api/instances/:id | Remove an instance. |
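Polling GET /api/jobs/:id could look roughly like the loop below. This is a sketch under loud assumptions: the page does not document the job payload, so the `status` field and its terminal values are invented for illustration, and `fetch` is a hypothetical callable that performs the authenticated GET and returns the decoded JSON.

```python
import time
from typing import Callable
from urllib.parse import quote

# ASSUMPTION: the job payload has a "status" field and these values mean the
# job is finished. The real response shape is not documented on this page.
TERMINAL_STATUSES = {"completed", "failed"}


def job_path(job_id: str) -> str:
    """Path of the polling route for one job."""
    return f"/api/jobs/{quote(job_id, safe='')}"


def poll_job(fetch: Callable[[str], dict], job_id: str, interval: float = 1.0,
             max_attempts: int = 30,
             sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch(path) until the job reports an assumed terminal status."""
    for attempt in range(max_attempts):
        job = fetch(job_path(job_id))
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} polls")
```

Passing `fetch` and `sleep` in as arguments keeps the loop independent of any HTTP client and easy to test without a network.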

Each instance also has a hardware and software inventory:

| Method & path | What it does |
| --- | --- |
| GET /api/instances/:id/processing-units | List hardware (GPUs, etc.). |
| POST /api/instances/:id/processing-units | Add a processing unit. |
| PATCH /api/instances/:id/processing-units/:puId | Update one. |
| DELETE /api/instances/:id/processing-units/:puId | Remove one. |
| GET /api/instances/:id/software | List software/engine versions. |
| POST /api/instances/:id/software | Add a software entry. |
| PATCH /api/instances/:id/software/:swId | Update one. |
| DELETE /api/instances/:id/software/:swId | Remove one. |
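The two inventories follow the same route pattern, which a small helper can capture. The sketch below only computes the method and path from the table; the names `inventory_path` and `inventory_route` and the action labels are hypothetical, and because these routes use your signed-in session, it makes no attempt to authenticate or send anything.

```python
from typing import Optional
from urllib.parse import quote

INVENTORY_KINDS = ("processing-units", "software")

# action -> (HTTP method, whether the route targets a single item)
ACTIONS = {
    "list": ("GET", False),
    "add": ("POST", False),
    "update": ("PATCH", True),
    "remove": ("DELETE", True),
}


def inventory_path(instance_id: str, kind: str,
                   item_id: Optional[str] = None) -> str:
    """Path of an inventory collection, or of one item when item_id is given."""
    if kind not in INVENTORY_KINDS:
        raise ValueError(f"kind must be one of {INVENTORY_KINDS}")
    path = f"/api/instances/{quote(instance_id, safe='')}/{kind}"
    if item_id is not None:
        path += f"/{quote(item_id, safe='')}"
    return path


def inventory_route(action: str, instance_id: str, kind: str,
                    item_id: Optional[str] = None) -> tuple[str, str]:
    """Return (method, path) for one of the inventory routes in the table."""
    method, needs_item = ACTIONS[action]
    if needs_item != (item_id is not None):
        raise ValueError(f"{action!r} {'requires' if needs_item else 'does not take'} an item id")
    return method, inventory_path(instance_id, kind, item_id)
```

For example, `inventory_route("update", "i1", "processing-units", "pu7")` yields the PATCH route for one processing unit.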

Health check and sign-in routes:

| Method & path | What it does |
| --- | --- |
| GET / | Health check. |
| GET /auth/github/login | Start GitHub sign-in. |
| GET /auth/github/callback | OAuth callback. |
| POST /auth/logout | End your session. |

  • Base URL: always include /v1 for inference (https://beta.token-router.org/v1).
  • Model availability is live. The catalog reflects what providers are serving right now; check /v1/models rather than hard-coding a model that may go offline.
  • Errors follow OpenAI conventions. A 4xx usually means a problem with your request (bad model id, malformed body, insufficient credits); 5xx errors and timeouts are retried automatically across providers before they reach you.
  • Want to serve models instead of just calling them? See Contribute Compute.
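Because availability is live, a client can pick its model from the current catalog instead of hard-coding one. A minimal sketch, assuming you have already fetched the ids from /v1/models; `pick_model` is a hypothetical helper, and the preference order is yours to define.

```python
def pick_model(preferred: list[str], available: list[str]) -> str:
    """Return the first preferred model id that is currently being served."""
    live = set(available)
    for model_id in preferred:
        if model_id in live:
            return model_id
    raise LookupError(f"none of {preferred} is currently served")
```

Running this check just before each batch of requests keeps a script working when one provider's model goes offline and a fallback is still being served.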