Five endpoints, Bearer auth, structured JSON, served from the Cloudflare edge. Read the spec and try every endpoint live below; your key never leaves the browser.
◆ 1 / Authentication
Bearer keys
Every request must carry an API key in the Authorization header. Generate one at /account.
Authorization: Bearer alm_live_a1b2c3…
Keys are hashed at rest; if you lose one, generate a new one and revoke the old. Never commit a key to source control.
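As a minimal sketch of the header format above, the following Python builds an authenticated request without sending it. The host is taken from the curl example elsewhere on this page; the environment variable name ALMANEX_API_KEY and the helper build_request are hypothetical names chosen for illustration.

```python
import os
import urllib.request

BASE_URL = "https://almanex.ai"  # host as it appears in the curl example on this page


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Return a GET request that carries the key as a Bearer token."""
    if not api_key:
        raise ValueError("API key is empty; generate one at /account")
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


# ALMANEX_API_KEY is a hypothetical variable name; reading the key from the
# environment keeps it out of source control.
api_key = os.environ.get("ALMANEX_API_KEY", "alm_live_placeholder")
request = build_request("/api/v1/reports", api_key)
```

Passing the request to urllib.request.urlopen would perform the call; that step is left out so the sketch runs without a network connection or a real key.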
◆ 2 / Rate limits
Monthly request caps
Quotas are enforced per calendar month (UTC). Every response carries X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers so you can self-throttle.
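One way to self-throttle on these headers is sketched below. It assumes the Limit and Remaining values are plain integers, and it leaves X-RateLimit-Reset alone because its format is not stated here. The function names and the 10% floor are illustrative choices, not part of the API.

```python
from typing import Mapping


def remaining_fraction(headers: Mapping[str, str]) -> float:
    """Fraction of the monthly quota still available, read from response headers."""
    # Assumption: both header values are plain integers such as "1000".
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    return remaining / limit if limit > 0 else 0.0


def should_throttle(headers: Mapping[str, str], floor: float = 0.1) -> bool:
    """True once less than `floor` of the quota is left (the floor is arbitrary)."""
    return remaining_fraction(headers) < floor
```

The lookup is case-sensitive on a plain dict; most HTTP client libraries expose a case-insensitive header mapping, which works here unchanged.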
Plan        Monthly cap        Note
developer   1,000 / month      $0. Generous free tier; every signup gets one.
builder     10,000 / month     $29/mo. The default for shipping products.
scale       100,000 / month    $199/mo. Production agent traffic at volume.
◆ 3 / Endpoints
Reference
Each endpoint includes a live Try it widget. Click it, paste your key (kept only in sessionStorage), and you'll see the real response.
GET /api/v1/reports
List reports with filters and pagination. Report body content is not included in the response.
Almanex is the corpus, not the model. Plug in any inference provider: Anthropic, Gemini, OpenAI, or any OpenAI-compatible endpoint (Together, Groq, OpenRouter, vLLM, Ollama).
Set one BYOK (bring-your-own-key) header to override the managed provider for a single request, or set several to enable cascading fallback (anthropic → gemini → openai → openai-compat).
Keys are read from request headers and forwarded verbatim to the upstream provider. They are never persisted, never logged in full, and never stored anywhere in our infra.
BYOK shifts the inference bill off Almanex: _meta.inference_cost_cents comes back as null on those calls.
Header             Shape                                         Purpose
X-Anthropic-Key    sk-ant-…                                      Anthropic Messages API key.
X-Gemini-Key       AIza<35 chars>                                Google Generative Language (v1beta) API key.
X-OpenAI-Key       sk-… / sk-proj-…                              OpenAI Chat Completions API key.
X-Model-Provider   anthropic | gemini | openai | openai-compat   Force a single provider for this request; disables cascading.
X-Model-Endpoint   https://…                                     Required for openai-compat. Base URL of the chat-completions service.
X-Model-Key        <any>                                         Required for openai-compat. Auth token for the custom endpoint.
X-Model-Name       <any>                                         Override the default model id (e.g. claude-haiku-4-5-20251001, gemini-2.5-flash, qwen2.5-72b).
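The header table can be turned into a small client-side helper. The sketch below is one possible shape, not an official client: the function byok_headers and its parameter names are hypothetical, and the validation only mirrors what the table states (the allowed X-Model-Provider values, and that X-Model-Endpoint and X-Model-Key are both required for openai-compat).

```python
from typing import Dict, Optional

PROVIDERS = {"anthropic", "gemini", "openai", "openai-compat"}


def byok_headers(
    api_key: str,
    *,
    anthropic_key: Optional[str] = None,
    gemini_key: Optional[str] = None,
    openai_key: Optional[str] = None,
    compat_endpoint: Optional[str] = None,
    compat_key: Optional[str] = None,
    model_name: Optional[str] = None,
    provider: Optional[str] = None,
) -> Dict[str, str]:
    """Build request headers, including only the BYOK headers that were supplied."""
    if provider is not None and provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider}")
    # The table marks both headers as required for openai-compat.
    if (compat_endpoint is None) != (compat_key is None):
        raise ValueError("X-Model-Endpoint and X-Model-Key must be set together")
    if provider == "openai-compat" and compat_endpoint is None:
        raise ValueError("openai-compat needs X-Model-Endpoint and X-Model-Key")

    optional = {
        "X-Anthropic-Key": anthropic_key,
        "X-Gemini-Key": gemini_key,
        "X-OpenAI-Key": openai_key,
        "X-Model-Provider": provider,
        "X-Model-Endpoint": compat_endpoint,
        "X-Model-Key": compat_key,
        "X-Model-Name": model_name,
    }
    headers = {"Authorization": f"Bearer {api_key}"}
    headers.update({name: value for name, value in optional.items() if value is not None})
    return headers
```

Supplying two provider keys and no provider argument yields the multi-key header set that enables cascading; supplying provider pins the request to one provider.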
◆ openai-compat protocol
Anything that speaks the OpenAI Chat Completions wire format works. The agent posts to {X-Model-Endpoint}/chat/completions (or to the endpoint directly if it already includes the path) with Authorization: Bearer {X-Model-Key}. Tool calls use the standard tool_calls[] shape.
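The URL rule above can be sketched as follows. How the agent actually detects that an endpoint "already includes the path" is not specified here; the suffix check below is an assumption, and chat_completions_url is a hypothetical helper name.

```python
CHAT_PATH = "/chat/completions"


def chat_completions_url(endpoint: str) -> str:
    """Resolve the URL to post to for a given X-Model-Endpoint value."""
    base = endpoint.rstrip("/")
    # Assumption: "already includes the path" means the URL ends with the path.
    if base.endswith(CHAT_PATH):
        return base
    return base + CHAT_PATH
```

With the Together base URL from the curl example on this page, this resolves to https://api.together.xyz/v1/chat/completions; a value that already ends in /chat/completions is returned as given, apart from any trailing slash.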
# Run the comparison agent against a Together-hosted Llama
curl https://almanex.ai/api/agent/compare \
  -H "Authorization: Bearer alm_live_…" \
  -H "X-Model-Endpoint: https://api.together.xyz/v1" \
  -H "X-Model-Key: $TOGETHER_API_KEY" \
  -H "X-Model-Name: meta-llama/Llama-3.3-70B-Instruct-Turbo" \
  -H "Content-Type: application/json" \
  -d '{"tickers":["NVDA","AMD"],"dimension":"growth"}'
Cost reporting falls back to $0.00 for unknown models; you own the bill on your provider's side.
◆ Cascading fallback
When more than one BYOK key is set (and X-Model-Provider is not), the agent tries each provider in order (anthropic → gemini → openai → openai-compat) and falls forward on 5xx, 429, or network error. It does not fall through on a 4xx auth failure or a content-policy refusal, because those mean the same prompt would fail elsewhere. Every attempt (success or failure) shows up in _meta.providers_tried.
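The fallback rule can be illustrated with a small simulation. This is a sketch of the behaviour described above, not Almanex's implementation: ProviderError, run_cascade, and the shape of the attempt log are hypothetical, and the real layout of _meta.providers_tried is not documented in this section.

```python
from typing import Callable, Dict, List, Optional, Tuple

ORDER = ["anthropic", "gemini", "openai", "openai-compat"]


class ProviderError(Exception):
    """Hypothetical error: status=None models a network error."""

    def __init__(self, status: Optional[int] = None, refusal: bool = False):
        super().__init__(f"status={status} refusal={refusal}")
        self.status = status
        self.refusal = refusal


def falls_forward(err: ProviderError) -> bool:
    """True for 5xx, 429, or a network error; False for refusals and other 4xx."""
    if err.refusal:
        return False
    if err.status is None:
        return True
    return err.status == 429 or 500 <= err.status <= 599


def run_cascade(callers: Dict[str, Callable[[], str]]) -> Tuple[str, List[dict]]:
    """Try configured providers in order; return the result and the attempt log."""
    tried: List[dict] = []
    for name in ORDER:
        call = callers.get(name)
        if call is None:
            continue  # no key was supplied for this provider
        try:
            result = call()
        except ProviderError as err:
            tried.append({"provider": name, "ok": False})
            if not falls_forward(err):
                raise  # auth failure or refusal: stop, do not try the next one
            continue
        tried.append({"provider": name, "ok": True})
        return result, tried
    raise RuntimeError("no configured provider succeeded")
```

In this sketch a 503 from the first provider moves on to the next configured one, while a 401 is re-raised immediately and later providers are never called.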
◆ 5 / Errors
Structured error responses
All errors are returned as JSON with a stable error field your code can switch on.
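A client might branch on that field roughly as sketched below. The error codes "rate_limited" and "unauthorized" are hypothetical placeholders, since this section does not list the actual values, and the returned action names are purely illustrative.

```python
import json


def next_action(body: str) -> str:
    """Map a JSON error body to a client-side action by switching on `error`."""
    payload = json.loads(body)
    code = payload.get("error")
    # Both codes below are hypothetical; substitute the documented values.
    if code == "rate_limited":
        return "wait_for_reset"
    if code == "unauthorized":
        return "check_api_key"
    return "report_failure"
```

Keeping a default branch means an error code the client does not recognise still gets handled rather than silently ignored.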