Chat Completions

Cube AI supports OpenAI-compatible Chat Completions for conversational LLM usage.

All requests are authenticated using Personal Access Tokens (PATs) and executed inside Trusted Execution Environments (TEEs).

Endpoint

POST /proxy/{domain_id}/v1/chat/completions

Example Request

curl -k https://localhost/proxy/<domain_id>/v1/chat/completions \
  -H "Authorization: Bearer <pat>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tinyllama:1.1b",
    "messages": [
      { "role": "user", "content": "Hello from Cube AI" }
    ]
  }'
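
The same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: `build_chat_request` and `send` are hypothetical helper names, and the base URL, domain ID, and PAT are placeholders the caller must supply. The `verify_tls=False` option mirrors the `-k` flag in the curl example and is only appropriate for local testing with self-signed certificates.

```python
import json
import ssl
import urllib.request


def build_chat_request(base_url, domain_id, pat, model, messages):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    url = f"{base_url.rstrip('/')}/proxy/{domain_id}/v1/chat/completions"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {pat}",  # PAT goes in the Bearer header
            "Content-Type": "application/json",
        },
    )


def send(req, verify_tls=True):
    """Send the request and return the decoded JSON response body."""
    ctx = ssl.create_default_context()
    if not verify_tls:
        # Equivalent in spirit to `curl -k`: skip certificate checks (local testing only).
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(req, context=ctx) as resp:
        return json.load(resp)
```

A call such as `send(build_chat_request("https://localhost", "<domain_id>", "<pat>", "tinyllama:1.1b", [{"role": "user", "content": "Hello from Cube AI"}]), verify_tls=False)` would correspond to the curl command above, with the placeholders replaced by real values.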

Example Response

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1766137794,
  "model": "tinyllama:1.1b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello from Cube AI!"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 46,
    "total_tokens": 88
  }
}
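
To illustrate how the fields in this response are typically consumed, the following sketch pulls out the assistant text, the finish reason, and the total token count. The helper name `extract_reply` is hypothetical, and the code assumes the response has already been decoded from JSON into a Python dict with the shape shown above.

```python
def extract_reply(response):
    """Return (assistant text, finish_reason, total_tokens) from a chat completion dict."""
    choice = response["choices"][0]    # first (and here only) completion choice
    usage = response.get("usage", {})  # treated as optional in this sketch
    return (
        choice["message"]["content"],
        choice.get("finish_reason"),
        usage.get("total_tokens"),
    )
```

Reading `usage` and `finish_reason` defensively with `.get` is a design choice for this sketch, since feature support may vary by backend and model.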

Notes

Requests are authenticated using Personal Access Tokens (PATs)
All requests are executed inside a Trusted Execution Environment (TEE)
Requests are fully domain-isolated
The API follows the OpenAI Chat Completions schema
Feature support depends on the selected backend and model
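
Because the API follows the OpenAI Chat Completions schema, a conversation is normally continued by resending the earlier messages with the assistant's reply appended, followed by the next user message. The sketch below shows one way to build that message list; `continue_conversation` is a hypothetical helper, and it assumes the response dict has the shape shown in the example response.

```python
def continue_conversation(messages, response, next_user_text):
    """Return a new message list: prior messages + assistant reply + next user turn."""
    assistant_message = response["choices"][0]["message"]
    return messages + [
        {"role": assistant_message["role"], "content": assistant_message["content"]},
        {"role": "user", "content": next_user_text},
    ]
```

The returned list can be sent as the `messages` field of the next request. The original list is left unmodified, which keeps earlier turns safe to reuse.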
