API Documentation

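TokenGateway is a unified proxy service for AI models that is compatible with the standard OpenAI API format. The platform automatically selects the best available model, supports both streaming and non-streaming responses, and provides structured error codes and rate limiting.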

100% OpenAI compatible
Smart routing: automatically selects the best model
High availability: automatic failover across multiple upstreams

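Quick start

Getting started takes three steps:

1. Register an account and obtain an API key
2. Choose an endpoint and request parameters
3. Send the request and read the response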

POST https://www.xlei.site/xlei/v1/chat/completions

curl -X POST https://www.xlei.site/xlei/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-API-Key: sk-your-api-key" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'

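Endpoints

The system exposes two main endpoints, and the backend determines the request mode automatically:

POST /xlei/v1/chat/completions

Chat mode. Takes the messages parameter; recommended for interactive conversations.

POST /xlei/v1/completions

Completion mode. Takes the prompt parameter; suited to text continuation.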

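Automatic mode detection

The system infers the mode from the request parameters:

Chat mode: when the messages parameter is provided
Completion mode: when only the prompt parameter is provided
Precedence: messages wins if both are present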
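The precedence rule above can be sketched as a small client-side check. The function name detect_mode is hypothetical, and treating an empty messages list as "not provided" is an assumption; the backend's actual logic is not documented here.

```python
from typing import Any, Dict


def detect_mode(payload: Dict[str, Any]) -> str:
    """Return "chat" or "completion" using the documented precedence."""
    # messages takes precedence whenever it is present, even alongside prompt.
    if payload.get("messages"):
        return "chat"
    if payload.get("prompt"):
        return "completion"
    # Neither field is set; the error table lists code 1001 for this case.
    raise ValueError("missing prompt or messages (error code 1001)")
```

Running a check like this before sending a request surfaces a missing field locally instead of costing a round trip.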

Authentication

Two authentication methods are supported; X-API-Key is preferred:

# Option 1: X-API-Key (recommended)
curl -X POST https://www.xlei.site/xlei/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-API-Key: sk-your-api-key" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'

# Option 2: Authorization Bearer
curl -X POST https://www.xlei.site/xlei/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-api-key" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'
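For clients that need to switch between the two header styles, a minimal sketch might look like this. The helper name build_headers and its use_bearer flag are hypothetical and not part of the API.

```python
from typing import Dict


def build_headers(api_key: str, use_bearer: bool = False) -> Dict[str, str]:
    """Build request headers for either documented authentication method."""
    headers = {"Content-Type": "application/json"}
    if use_bearer:
        # Option 2: standard Authorization header, as sent by OpenAI SDKs.
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        # Option 1: the preferred X-API-Key header.
        headers["X-API-Key"] = api_key
    return headers
```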

Getting an API key

After logging in to the user portal, create and manage your API keys on the "API Keys" page.

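Rate limiting

The platform applies two rate limits to keep the service stable:

IP limit: 500 requests/minute per IP address
API key limit: 100 requests/minute per API key

Requests that exceed either limit receive a 429 Too Many Requests error.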
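One common way to cope with a 429 is to retry with exponential backoff. The sketch below is an assumption about reasonable client behavior, not something the platform prescribes; call_with_backoff, its parameters, and the delay schedule are all hypothetical. The send argument is any zero-argument callable returning an object with a status_code attribute, such as a requests call wrapped in a lambda.

```python
import time
from typing import Any, Callable


def call_with_backoff(
    send: Callable[[], Any],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send() and retry with exponential backoff while it returns HTTP 429."""
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    for attempt in range(max_retries + 1):
        response = send()
        # Return on any non-429 status, or once the retry budget is used up.
        if response.status_code != 429 or attempt == max_retries:
            return response
        # Wait base_delay, then 2x, 4x, ... before the next attempt.
        sleep(base_delay * (2 ** attempt))
```

The sleep parameter is injectable so the retry schedule can be exercised without real waiting.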

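Request parameters

Provide either messages or prompt; the backend fills in every other parameter you omit:

Parameter	Type	Required	Default	Description
messages	array	one of two	-	Array of chat messages; provide either this or prompt (recommended)
prompt	string	one of two	-	Text to complete; provide either this or messages
model	string	no	auto	Model name; auto lets the platform choose
max_tokens	int	no	1000	Maximum number of tokens in the response
temperature	float	no	0.7	Sampling temperature (0-2); higher values are more random
stream	bool	no	true	Whether to stream the response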
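The table can be mirrored in a small request builder that applies the documented defaults and rejects obviously invalid input before anything is sent. The function build_payload is hypothetical, and the client-side range checks are assumptions about what the backend would also reject.

```python
from typing import Any, Dict, List, Optional


def build_payload(
    messages: Optional[List[Dict[str, str]]] = None,
    prompt: Optional[str] = None,
    model: str = "auto",
    max_tokens: int = 1000,
    temperature: float = 0.7,
    stream: bool = True,
) -> Dict[str, Any]:
    """Assemble a request body using the defaults from the parameter table."""
    if not messages and not prompt:
        raise ValueError("provide either messages or prompt")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    payload: Dict[str, Any] = {
        "model": model,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": stream,
    }
    # messages takes precedence, so prompt is dropped when both are given.
    if messages:
        payload["messages"] = messages
    else:
        payload["prompt"] = prompt
    return payload
```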

Format of the messages array

"messages": [
  {"role": "system", "content": "You are a helpful assistant"},
  {"role": "user", "content": "Hello!"},
  {"role": "assistant", "content": "Hello! How can I help you?"},
  {"role": "user", "content": "Tell me more"}
]

Allowed role values: system (system prompt that defines the assistant's behavior), user (user input), assistant (assistant reply, used for multi-turn conversations)
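Because every request carries the full history, a client usually keeps the messages array around between turns. The Conversation class below is a hypothetical helper that sketches this; it only enforces the three roles listed above.

```python
from typing import Dict, List, Optional

ALLOWED_ROLES = {"system", "user", "assistant"}


class Conversation:
    """Hypothetical helper that accumulates a messages array across turns."""

    def __init__(self, system_prompt: Optional[str] = None) -> None:
        self.messages: List[Dict[str, str]] = []
        if system_prompt:
            self.add("system", system_prompt)

    def add(self, role: str, content: str) -> None:
        # Reject roles the API does not document.
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {role!r}")
        self.messages.append({"role": role, "content": content})


chat = Conversation("You are a helpful assistant")
chat.add("user", "Hello!")
chat.add("assistant", "Hello! How can I help you?")
chat.add("user", "Tell me more")
```

After each reply, append the assistant's text with add("assistant", ...) so the next request includes it.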

Request examples

Chat mode (recommended)

{
  "model": "auto",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ],
  "stream": true
}

Multi-turn conversation

{
  "messages": [
    {"role": "user", "content": "What is AI?"},
    {"role": "assistant", "content": "AI stands for Artificial Intelligence..."},
    {"role": "user", "content": "Tell me more about it."}
  ]
}

Completion mode

{
  "prompt": "Once upon a time in a faraway land,",
  "max_tokens": 200,
  "temperature": 0.9
}

Response format

Streaming response (default)

Returns SSE (Server-Sent Events), suited to displaying output as it arrives:

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1714166400,"choices":[{"delta":{"content":"Hello"}}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1714166400,"choices":[{"delta":{"content":"!"}}]}
data: [DONE]
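A minimal sketch of decoding this stream, assuming each event arrives as a single "data:" line shaped like the samples above. The function name iter_content is hypothetical; it works on any iterable of already-decoded text lines.

```python
import json
from typing import Iterable, Iterator


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by a sequence of SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        content = choices[0].get("delta", {}).get("content")
        if content:
            yield content
```

Joining the yielded fragments reconstructs the full reply.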

Non-streaming response

With stream: false, the complete JSON is returned in one piece:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1714166400,
  "model": "deepseek-v4-flash",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 20,
    "total_tokens": 35
  }
}
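Reading the fields a client typically needs from such a response can be sketched as follows. The helper summarize is hypothetical, and it assumes the first entry in choices is the one of interest.

```python
from typing import Any, Dict


def summarize(response: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the reply text, finish reason, and token total from a response."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }
```

In the sample above, total_tokens (35) is the sum of prompt_tokens (15) and completion_tokens (20).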

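Reasoning trace (optional)

Some models can also return their reasoning in a reasoning field:

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "model": "deepseek-v4-flash",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The answer is 42",
        "reasoning": "First, analyze the problem...then calculate..."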
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 50,
    "completion_tokens": 30,
    "total_tokens": 80
  }
}
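Since only some models return reasoning, client code should treat the field as optional. A minimal sketch, with the hypothetical helper split_reasoning:

```python
from typing import Any, Dict, Optional, Tuple


def split_reasoning(response: Dict[str, Any]) -> Tuple[str, Optional[str]]:
    """Return (content, reasoning); reasoning is None when the model omits it."""
    message = response["choices"][0]["message"]
    # .get() avoids a KeyError for models that do not return the field.
    return message["content"], message.get("reasoning")
```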

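Error codes

Code	HTTP status	Description
0	200	Success
1001	400	Missing prompt or messages parameter
1002	400	Malformed request
1003	400	Model type mismatch
2001	401	API key invalid or disabled
2002	402	Insufficient balance
2003	402	Daily subscription quota reached
2004	429	Too many requests (rate limited)
4001	503	No upstream model available
5000	500	Internal server error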

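Error response format

The message value in this example means "Insufficient balance, please top up and try again".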

{
  "code": 2002,
  "message": "余额不足，请充值后重试",
  "data": null
}
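One way to surface these codes in client code is to turn an error body into an exception. This is a sketch: GatewayError and raise_for_error are hypothetical names, the choice of which codes count as retryable is an assumption rather than documented behavior, and a body without a code field is treated as success because successful responses use the OpenAI format.

```python
from typing import Any, Dict

# Codes this sketch treats as transient. The grouping is an assumption,
# not something the error table states.
RETRYABLE_CODES = {2004, 4001, 5000}


class GatewayError(Exception):
    """Raised when a response body carries a non-zero error code."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(f"[{code}] {message}")
        self.code = code
        self.message = message
        self.retryable = code in RETRYABLE_CODES


def raise_for_error(body: Dict[str, Any]) -> None:
    """Raise GatewayError if the parsed JSON body is an error response."""
    code = body.get("code", 0)
    if code != 0:
        raise GatewayError(code, body.get("message", ""))
```

Callers can branch on the retryable attribute to decide between backing off and reporting the failure to the user.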

Code examples

cURL

curl -X POST https://www.xlei.site/xlei/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-API-Key: sk-your-api-key" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello"}]
  }' \
  --no-buffer

Python

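import json
import os

import requests

base_url = "https://www.xlei.site"
api_key = os.getenv("XLEI_API_KEY")

headers = {
    "Content-Type": "application/json",
    "X-API-Key": api_key,
}

data = {
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": True,
}

response = requests.post(
    f"{base_url}/xlei/v1/chat/completions",
    json=data,
    headers=headers,
    stream=True,
    timeout=60,
)
response.raise_for_status()  # fail early on 4xx/5xx instead of parsing an error body

# Each SSE line looks like "data: {...}"; the stream ends with "data: [DONE]".
for line in response.iter_lines():
    if not line:
        continue
    decoded_line = line.decode("utf-8")
    if not decoded_line.startswith("data: "):
        continue
    payload = decoded_line[6:]
    if payload == "[DONE]":
        break
    chunk = json.loads(payload)
    content = chunk["choices"][0].get("delta", {}).get("content")
    if content:
        print(content, end="", flush=True)
print()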

JavaScript

const baseUrl = 'https://www.xlei.site';
const apiKey = process.env.XLEI_API_KEY;

async function chat(message) {
    const response = await fetch(`${baseUrl}/xlei/v1/chat/completions`, {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json',
            'X-API-Key': apiKey
        },
        body: JSON.stringify({
            model: 'auto',
            messages: [{ role: 'user', content: message }],
            stream: true
        })
    });
    if (!response.ok) {
        throw new Error(`Request failed with HTTP ${response.status}`);
    }

    const reader = response.body.getReader();
    const decoder = new TextDecoder('utf-8');
    let buffer = '';

    while (true) {
        const { value, done } = await reader.read();
        if (done) break;

        // A network chunk may end mid-line, so keep the unfinished tail in the buffer.
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split('\n');
        buffer = lines.pop();

        for (const line of lines) {
            if (!line.startsWith('data: ')) continue;
            const payload = line.slice(6).trim();
            if (payload === '[DONE]') return; // end-of-stream sentinel, not JSON
            const data = JSON.parse(payload);
            const content = data.choices[0]?.delta?.content;
            if (content) process.stdout.write(content);
        }
    }
}

chat('Hello!').catch(console.error);

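Third-party tool integration

TokenGateway works with third-party tools that support OpenAI-compatible APIs:

Configuration example

In any tool that supports a custom API, set the following parameters: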

API Base URL: https://www.xlei.site/xlei/v1
API Key: sk-your-api-key
Model: auto
Stream: true

Supported tools

LangChain / LlamaIndex: AI application development frameworks
Chatbox / LobeChat: desktop and mobile AI clients
Cursor / CodeLlama: code editor plugins
OpenAI-compatible SDKs: OpenAI SDKs for various languages

LangChain example

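# Requires the langchain-openai package (pip install langchain-openai).
# The older langchain.chat_models import and llm.predict() are deprecated.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="auto",
    base_url="https://www.xlei.site/xlei/v1",
    api_key="sk-your-api-key",
    streaming=True,
    temperature=0.7,
)

response = llm.invoke("Hello!")
print(response.content)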