SyzeAI API Documentation

OpenAI-compatible API gateway routing to 300+ models. Drop-in replacement for OpenAI's API: point any OpenAI SDK at a new base_url, swap in your SyzeAI API key, and you're done.

Introduction

SyzeAI is an OpenAI-compatible HTTP API. Any client library that targets OpenAI (Python, Node, Go, etc.) works by changing two things:

base_url → https://llm.g4rrzx.my.id/v1
api_key → your sk-syze-... key

The router handles upstream routing, load-balancing, retries, and failover transparently.

Authentication

All requests require a Bearer token in the Authorization header. Get your API key from the dashboard.

Authorization: Bearer sk-syze-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

🔒 Keep your API key secret. Never commit it to git or expose it in client-side code.
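One way to keep the key out of source code is to read it from an environment variable at startup. The sketch below assumes the variable is named SYZE_KEY, matching the cURL examples in this document; the helper name auth_headers is illustrative and not part of any SDK.

```python
import os


def auth_headers(env=None):
    """Build request headers from an API key stored in the environment.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    SYZE_KEY is an assumed variable name, not one mandated by the service.
    """
    env = os.environ if env is None else env
    key = env.get("SYZE_KEY", "").strip()
    if not key:
        raise RuntimeError("SYZE_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Failing fast on a missing key tends to give a clearer error than a 401 from the server later on.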

Quickstart

Run your first request in 30 seconds:

cURL

curl https://llm.g4rrzx.my.id/v1/chat/completions \
  -H "Authorization: Bearer $SYZE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-haiku-4-5-20251001",
    "messages": [{"role": "user", "content": "Say hi in one word"}]
  }'

Python

from openai import OpenAI

client = OpenAI(
    api_key="sk-syze-...",
    base_url="https://llm.g4rrzx.my.id/v1",
)

response = client.chat.completions.create(
    model="anthropic/claude-haiku-4-5-20251001",
    messages=[{"role": "user", "content": "Say hi"}],
)
print(response.choices[0].message.content)

Node.js

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "sk-syze-...",
  baseURL: "https://llm.g4rrzx.my.id/v1",
});

const response = await client.chat.completions.create({
  model: "anthropic/claude-haiku-4-5-20251001",
  messages: [{ role: "user", content: "Say hi" }],
});

console.log(response.choices[0].message.content);

Chat Completions

POST /v1/chat/completions

OpenAI-compatible chat completion endpoint. Supports streaming, tool calling, vision, and all standard parameters.

Request Body

Field	Type	Required	Description
`model`	string	yes	Model ID (e.g. `anthropic/claude-opus-4-7`)
`messages`	array	yes	Conversation messages
`stream`	boolean	no	Server-sent events streaming. Default `false`.
`temperature`	number	no	0-2. Default 1.
`max_tokens`	integer	no	Max output tokens
`top_p`	number	no	Nucleus sampling, 0-1
`tools`	array	no	Tool/function definitions
`tool_choice`	string | object	no	`auto`, `none`, or specific tool
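The field table above can be turned into a small client-side check before a request is sent. This is only a sketch: build_chat_payload is a hypothetical helper, and the range checks mirror the table (temperature 0-2, top_p 0-1) rather than any validation the server is known to perform.

```python
import json


def build_chat_payload(model, messages, *, stream=False, temperature=None,
                       max_tokens=None, top_p=None):
    """Assemble a chat-completions body, omitting optional fields left unset."""
    if not model or not messages:
        raise ValueError("model and messages are required")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")

    payload = {"model": model, "messages": messages}
    if stream:
        payload["stream"] = True
    optional = (("temperature", temperature),
                ("max_tokens", max_tokens),
                ("top_p", top_p))
    for name, value in optional:
        if value is not None:
            payload[name] = value
    return payload


# Serialize the payload as the JSON request body.
body = json.dumps(build_chat_payload(
    "anthropic/claude-haiku-4-5-20251001",
    [{"role": "user", "content": "Say hi"}],
    temperature=0.2,
))
```

Leaving unset fields out of the body lets the server apply its own defaults.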

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1717603200,
  "model": "anthropic/claude-opus-4-7",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello!"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 8,
    "total_tokens": 20
  }
}
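As an illustration of reading this response without an SDK, the sketch below pulls out the reply text, the finish reason, and the token total from the raw JSON. The function name summarize_completion is hypothetical; the field names come from the sample response above.

```python
import json


def summarize_completion(raw):
    """Extract the fields most callers need from a chat completion body."""
    data = json.loads(raw)
    choice = data["choices"][0]          # first (and usually only) choice
    usage = data.get("usage", {})        # treated as optional here
    return {
        "text": choice["message"].get("content"),
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }


SAMPLE = """{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "choices": [{"index": 0,
               "message": {"role": "assistant", "content": "Hello!"},
               "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 12, "completion_tokens": 8, "total_tokens": 20}
}"""

print(summarize_completion(SAMPLE))
```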

Legacy Completions

POST /v1/completions

Legacy text completion endpoint. Use /v1/chat/completions for modern models.

Embeddings

POST /v1/embeddings

Generate vector embeddings for text. OpenAI-compatible.

{
  "model": "text-embedding-3-small",
  "input": "Hello world"
}
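A typical use of embeddings is comparing two texts by cosine similarity. The sketch below assumes the response follows OpenAI's embeddings shape (a "data" list whose items carry "index" and "embedding"); this document does not show the response, so treat that shape as an assumption. Both function names are illustrative.

```python
import math


def extract_vectors(response):
    """Return embedding vectors ordered by each item's 'index' field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Sorting by index guards against items arriving in a different order than the inputs were sent.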

List Models

GET /v1/models

Returns models available to your API key (auth required).
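A minimal standard-library sketch of calling this endpoint follows. It assumes the response uses OpenAI's list shape ({"data": [{"id": ...}]}), which this document does not show; models_request and model_ids are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "https://llm.g4rrzx.my.id/v1"


def models_request(api_key):
    """Build (but do not send) the authenticated GET /v1/models request."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(raw_body):
    """Extract sorted model IDs, assuming an OpenAI-style {"data": [...]} body."""
    return sorted(item["id"] for item in json.loads(raw_body).get("data", []))
```

To send it, pass the request to urllib.request.urlopen and feed the response bytes to model_ids.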

Streaming

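Set "stream": true to receive Server-Sent Events (SSE). Each event is a data: line carrying a JSON chunk whose choices[].delta holds the next fragment of the reply. In OpenAI's streaming format, which this endpoint is described as compatible with, the stream ends with a data: [DONE] line.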

# Stream with curl --no-buffer
curl -N https://llm.g4rrzx.my.id/v1/chat/completions \
  -H "Authorization: Bearer $SYZE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-haiku-4-5-20251001",
    "messages": [{"role": "user", "content": "Count to 5"}],
    "stream": true
  }'
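To show how the streamed events can be reassembled into text, here is a sketch of a parser over SSE lines. It assumes OpenAI-style chunks (text under choices[].delta.content and a final data: [DONE] sentinel); the sample lines are made up for illustration and are not captured output.

```python
import json


def iter_deltas(lines):
    """Yield text fragments from an iterable of SSE lines (already decoded to str)."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel, as in OpenAI's format
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Illustrative input only; real chunks carry more fields (id, model, ...).
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"}}]}',
    '',
    'data: {"choices":[{"index":0,"delta":{"content":"1, "}}]}',
    '',
    'data: {"choices":[{"index":0,"delta":{"content":"2"}}]}',
    '',
    'data: [DONE]',
]
print("".join(iter_deltas(sample)))  # prints: 1, 2
```

The same generator can be fed lines decoded from a streaming HTTP response, so text can be shown as it arrives.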

Tool Calling

Pass a tools array that follows OpenAI's function schema. The model may respond with one or more tool calls; you execute them yourself and send the results back in a follow-up request.

{
  "model": "anthropic/claude-sonnet-4-5-20250929",
  "messages": [{"role": "user", "content": "What's the weather in Jakarta?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}}
      }
    }
  }]
}
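The "execute and return the result" step can be sketched as follows. This assumes OpenAI's tool-call message shape (tool_calls items with id, function.name, and JSON-encoded function.arguments, answered by role "tool" messages). The get_weather body and the TOOLS registry are hypothetical stand-ins.

```python
import json


def get_weather(city):
    # Hypothetical local implementation; a real one would query a weather service.
    return {"city": city, "forecast": "unknown"}


# Maps tool names declared in the request to local callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(assistant_message):
    """Execute each requested tool and return the 'tool' role messages to send back."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"] or "{}")
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],   # ties the result to the request
            "content": json.dumps(fn(**args)),
        })
    return results
```

The follow-up request would then contain the original messages, the assistant message with its tool calls, and the messages returned by run_tool_calls.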

Vision (Multimodal)

Multimodal models accept image inputs via image_url content parts.

{
  "model": "anthropic/claude-opus-4-7",
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "Describe this image"},
      {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
    ]
  }]
}
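The base64 data URL in the example above can be built from raw image bytes with the standard library. The helper names image_part and vision_message are illustrative; the message layout is the one shown in the request above.

```python
import base64


def image_part(image_bytes, mime_type="image/png"):
    """Wrap raw image bytes as an image_url content part using a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def vision_message(prompt, image_bytes, mime_type="image/png"):
    """Build a user message pairing a text prompt with one image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            image_part(image_bytes, mime_type),
        ],
    }
```

In practice image_bytes would come from reading a file in binary mode, with mime_type set to match the file.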

Error Codes

Code	Meaning	Action
`400`	Invalid request body	Check parameters
`401`	Invalid / missing API key	Verify Authorization header
`402`	Insufficient credit	Top up in dashboard
`403`	Model not allowed for your tier	Upgrade tier
`404`	Model ID not found	Check /v1/models
`429`	Rate limit (TPM exceeded)	Back off or upgrade tier
`500`	Internal error	Retry with exponential backoff
`503`	All upstream keys exhausted	Retry in ~15s
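The Action column can be translated into a retry policy. The sketch below is one possible reading of the table, not the router's own logic: 429 and 500 get capped exponential backoff, 503 waits roughly 15 seconds, and every other code is treated as non-retryable. The base and cap values are arbitrary assumptions.

```python
RETRYABLE = {429, 500, 503}


def retry_delay(status, attempt, base=1.0, cap=30.0):
    """Return seconds to wait before retry number `attempt` (0-based), or None.

    None means the error needs a fix on the caller's side (bad body, bad key,
    no credit, wrong tier, unknown model) and retrying will not help.
    """
    if status not in RETRYABLE:
        return None
    if status == 503:
        return 15.0  # the table suggests retrying in ~15s
    return min(cap, base * 2 ** attempt)  # 1s, 2s, 4s, ... capped at `cap`
```

Production clients often add random jitter to the delay so that many callers do not retry in lockstep.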

Rate Limits

Rate limits are enforced as TPM (tokens-per-minute) per API key:

Free tier: 10K TPM
Pay-as-you-go: 200K TPM
Custom / BYOK: unlimited

✅ When you hit a limit, the API returns 429 with a Retry-After header. The router has built-in retry, so most apps never see a 429.
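For clients that do see a 429, the Retry-After value can drive the wait. This document does not say which form the header takes, so the sketch below assumes standard HTTP semantics and accepts either a number of seconds or an HTTP date. The function name and the 1-second default are assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None, default=1.0):
    """Convert a Retry-After header value into a non-negative wait in seconds."""
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)              # delta-seconds form, e.g. "12"
    try:
        when = parsedate_to_datetime(value)   # HTTP-date form
    except (TypeError, ValueError):
        return default                   # unparseable header: fall back
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A caller would sleep for the returned number of seconds before sending the request again.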

SDKs & Libraries

Any OpenAI SDK works. Popular options:

Python: openai, litellm, instructor
Node.js: openai, ai (Vercel), langchain
Go: github.com/sashabaranov/go-openai
Rust: async-openai
CLI: llm by Simon Willison (with llm-openai-plugin)

Need help? Found a bug?

Chat on Telegram