Generative LLM
API Reference
Generative LLM
OpenAI-compatible chat completions, /responses, and /models
POST
Generative LLM
Classer exposes an OpenAI-compatible generation surface so existing tools and SDKs work unchanged. Point any client atDocumentation Index
Fetch the complete documentation index at: https://docs.classer.ai/llms.txt
Use this file to discover all available pages before exploring further.
https://api.classer.ai/v1 with your
Classer API key and call chat/completions, responses, or models exactly
as you would against OpenAI.
Drop-in replacement for OpenAI clients. Set
base_url to
https://api.classer.ai/v1 and api_key to your Classer key — no other code
changes needed.Get an API key
Create one at classer.ai/api-keys. You can configure per-key logging (off by default for inputs and outputs) and an optional retention window. The key value is shown once at creation — copy it immediately.Available models
| Model | Context | Input ($/1M) | Cache read ($/1M) | Output ($/1M) | Notes |
|---|---|---|---|---|---|
deepseek-ai/DeepSeek-V4-Flash | 1M | $0.10 | – | $0.28 | Open-weights, cost-sensitive workloads |
Qwen/Qwen3.6-35B-A3B | 256K | $0.15 | $0.05 | $1.00 | Vision, tool use, structured outputs |
moonshotai/Kimi-K2.6 | 256K | $0.70 | – | $4.00 | Reasoning, vision, tool use |
/v1/chat/completions and /v1/responses. Use
GET /v1/models for the live machine-readable catalog
including per-model supported_features, supported_sampling_parameters,
quantization, and modalities.
Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/chat/completions | OpenAI Chat Completions API |
| POST | /v1/responses | OpenAI Responses API |
| GET | /v1/models | List available generative models |
Chat completions
POST /v1/chat/completions accepts the OpenAI Chat Completions request shape verbatim and returns the upstream response unchanged. Both streaming (stream: true) and non-streaming requests are supported. Usage tokens are populated in the response for both modes — the proxy automatically negotiates stream_options.include_usage where needed so billing is exact.
Responses
POST /v1/responses is the OpenAI Responses API. Same models, same pricing as chat completions; the request and response shapes match OpenAI’s spec.
List models
GET /v1/models returns the live deployment list with full OpenRouter-style
metadata: pricing, context length, max output length, quantization,
input/output modalities, supported sampling parameters, and supported
features per model.
Errors
| Status | Code | Cause |
|---|---|---|
| 401 | INVALID_API_KEY | Missing or invalid Authorization: Bearer … header |
| 400 | invalid_request_error | model not in the allow-list (response includes the list of valid models) |
| 400 | invalid_request_error | model field missing entirely |
| 402 | INSUFFICIENT_BALANCE | Account balance is below the grace buffer; top up to resume |
| 503 | – | Generate gateway not configured / unreachable |