Developer API

Token Router exposes two API surfaces from one base host (https://beta.token-router.org):

/v1/* — the OpenAI-compatible inference API. Use it with any OpenAI SDK.
/api/* — the management API for your account, keys, and provider instances.

Authentication

Inference and utility endpoints authenticate with a bearer token:

Authorization: Bearer vk_live_<prefix>_<secret>

Mint keys in the dashboard under Keys. The full secret is shown only once, at creation, so store it somewhere safe. Management routes that act on your account are protected by your signed-in session instead of a bearer token.
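
As a minimal sketch of the bearer scheme above, the helper below builds the `Authorization` header and rejects strings that do not start with the documented `vk_live_` prefix. The helper name `auth_headers` and the environment variable `TOKEN_ROUTER_API_KEY` are assumptions made for illustration; neither is part of the Token Router API.

```python
import os


def auth_headers(api_key: str) -> dict[str, str]:
    """Return the bearer header for a Token Router key.

    Only the documented `vk_live_` prefix is checked locally; the key is
    not validated against the service.
    """
    if not api_key.startswith("vk_live_"):
        raise ValueError("expected a key of the form vk_live_<prefix>_<secret>")
    return {"Authorization": f"Bearer {api_key}"}


if __name__ == "__main__":
    # TOKEN_ROUTER_API_KEY is a hypothetical variable name. Reading the key
    # from the environment keeps the secret out of source control.
    key = os.environ.get("TOKEN_ROUTER_API_KEY", "")
    if key:
        print(sorted(auth_headers(key)))  # header names only, never the secret
    else:
        print("TOKEN_ROUTER_API_KEY is not set")
```

Since the secret is shown only once, loading it from the environment or a secret manager at runtime is generally safer than pasting it into code.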

Inference API (`/v1`)

`GET /v1/models`

Lists the models the network is actively serving, in OpenAI’s model-list format.

curl https://beta.token-router.org/v1/models \
  -H "Authorization: Bearer vk_live_your_key_here"
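
The same call can be sketched in Python using only the standard library. The parsing step assumes the response follows OpenAI's model-list shape (`{"object": "list", "data": [{"id": ...}]}`), as the description above indicates; the function names are illustrative.

```python
import json
import urllib.request

BASE_URL = "https://beta.token-router.org/v1"


def models_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET /v1/models request."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style model list."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(api_key: str) -> list[str]:
    """Send the request and return the ids currently being served."""
    with urllib.request.urlopen(models_request(api_key), timeout=30) as resp:
        return model_ids(json.load(resp))
```

Calling `list_models("vk_live_your_key_here")` with a real key would return the ids that can be passed as `model` in a chat completion request.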

`POST /v1/chat/completions`

OpenAI-compatible chat completions. Pass a model id from `/v1/models` and the usual `messages` array; standard parameters such as `temperature`, `max_tokens`, and `stream` are supported.

curl https://beta.token-router.org/v1/chat/completions \
  -H "Authorization: Bearer vk_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.6-27b",
    "messages": [
      {"role": "system", "content": "You are concise."},
      {"role": "user", "content": "What is Token Router?"}
    ],
    "temperature": 0.7
  }'

The same request with the OpenAI Python SDK:

from openai import OpenAI

client = OpenAI(
    base_url="https://beta.token-router.org/v1",
    api_key="vk_live_your_key_here",
)

resp = client.chat.completions.create(
    model="qwen3.6-27b",
    messages=[
        {"role": "system", "content": "You are concise."},
        {"role": "user", "content": "What is Token Router?"},
    ],
)
print(resp.choices[0].message.content)

The same request with the OpenAI Node.js SDK:

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://beta.token-router.org/v1",
  apiKey: "vk_live_your_key_here",
});

const resp = await client.chat.completions.create({
  model: "qwen3.6-27b",
  messages: [
    { role: "system", content: "You are concise." },
    { role: "user", content: "What is Token Router?" },
  ],
});
console.log(resp.choices[0].message.content);
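
When `stream` is set to true, OpenAI-compatible servers normally reply with server-sent events: one `data: {...}` line per chunk, ending with `data: [DONE]`. The sketch below parses lines in that shape. It assumes Token Router follows the OpenAI streaming format, which this page implies but does not spell out, and the function name is illustrative.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Sample lines in the shape an OpenAI-compatible server emits while streaming.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints "Hello, world"
```

With the OpenAI SDKs shown above, passing `stream=True` (Python) or `stream: true` (Node.js) handles this parsing for you; the manual version is mainly useful with a plain HTTP client.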

Management API (`/api`)

These endpoints power the dashboard. Account and resource routes use your signed-in browser session; two utility routes accept your bearer token.

Utility (bearer-protected)

| Method & path | What it does |
| --- | --- |
| `GET /api/whoami` | Confirms your bearer key resolves at the edge. |
| `GET /api/jobs/:id` | Polls an async chat completion queued from `/v1/chat/completions`. |
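
Polling `GET /api/jobs/:id` comes down to a bounded retry loop. The sketch below keeps the HTTP call and the completion check as caller-supplied functions, because the job payload's fields are not documented here. The `status` field in the simulated responses is purely hypothetical.

```python
import time
from typing import Callable, Optional


def poll_until(
    fetch: Callable[[], dict],
    is_done: Callable[[dict], bool],
    interval: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call fetch() until is_done(result) is true or attempts run out."""
    for attempt in range(max_attempts):
        result = fetch()
        if is_done(result):
            return result
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next poll
    return None  # gave up; the caller decides how to report the timeout


# Simulated job payloads; the "status" field is a hypothetical example.
responses = iter([{"status": "queued"}, {"status": "running"}, {"status": "done"}])
final = poll_until(
    fetch=lambda: next(responses),
    is_done=lambda job: job["status"] == "done",
    sleep=lambda _: None,  # skip real waiting in this demonstration
)
print(final)  # prints {'status': 'done'}
```

In real use, `fetch` would issue the bearer-authenticated request for the job id, and `is_done` would inspect whichever field the actual response uses to signal completion.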

Account & keys (session-protected)

| Method & path | What it does |
| --- | --- |
| `GET /api/me` | Your account and balance. |
| `GET /api/keys` | List your API keys. |
| `POST /api/keys` | Mint a new `vk_live_…` key. |
| `DELETE /api/keys/:id` | Revoke a key. |

Provider instances (session-protected)

| Method & path | What it does |
| --- | --- |
| `GET /api/instances` | List your registered instances. |
| `POST /api/instances` | Register a new model-serving instance. |
| `GET /api/instances/:id` | Fetch one instance. |
| `PATCH /api/instances/:id` | Update an instance. |
| `DELETE /api/instances/:id` | Remove an instance. |

Each instance also has a hardware and software inventory:

| Method & path | What it does |
| --- | --- |
| `GET /api/instances/:id/processing-units` | List hardware (GPUs, etc.). |
| `POST /api/instances/:id/processing-units` | Add a processing unit. |
| `PATCH /api/instances/:id/processing-units/:puId` | Update a processing unit. |
| `DELETE /api/instances/:id/processing-units/:puId` | Remove a processing unit. |
| `GET /api/instances/:id/software` | List software/engine versions. |
| `POST /api/instances/:id/software` | Add a software entry. |
| `PATCH /api/instances/:id/software/:swId` | Update a software entry. |
| `DELETE /api/instances/:id/software/:swId` | Remove a software entry. |

Auth & OAuth endpoints

| Method & path | What it does |
| --- | --- |
| `GET /` | Health check. |
| `GET /auth/github/login` | Start GitHub sign-in. |
| `GET /auth/github/callback` | OAuth callback. |
| `POST /auth/logout` | End your session. |

Notes & gotchas

Base URL: always include /v1 for inference (https://beta.token-router.org/v1).
Model availability is live. The catalog reflects what providers are serving right now; check /v1/models rather than hard-coding a model that may go offline.
Errors follow OpenAI conventions. A 4xx usually points to a problem with your request (bad model id, malformed body, insufficient credits); 5xx errors and timeouts are retried automatically across providers before they reach you.
Want to serve models instead of just calling them? See Contribute Compute.
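
The advice on live model availability and 4xx errors can be sketched as two small helpers. `pick_model` chooses the first preferred id that is currently served, and `is_request_error` flags statuses that point at the request itself. Both function names and the `example-model-*` ids are hypothetical.

```python
from typing import Iterable, Optional, Sequence


def pick_model(preferred: Sequence[str], available: Iterable[str]) -> Optional[str]:
    """Return the first preferred model id that is currently being served."""
    live = set(available)
    for model_id in preferred:
        if model_id in live:
            return model_id
    return None  # none of the preferred models are online right now


def is_request_error(status: int) -> bool:
    """True for 4xx statuses, which usually indicate a problem with the request."""
    return 400 <= status < 500


# Stand-in for the ids returned by /v1/models at call time.
available = ["qwen3.6-27b", "example-model-b"]
print(pick_model(["example-model-a", "qwen3.6-27b"], available))  # prints qwen3.6-27b
```

Refreshing the list from `/v1/models` shortly before each selection keeps the choice in step with what providers are serving.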