
OpenAI-Compatible API

Kaman exposes an OpenAI-compatible REST API so you can interact with your agents using the standard OpenAI SDK, curl, or any tool that speaks the OpenAI chat-completions protocol.

Base URL

/api/v1

For self-hosted installations: http://kaman.ai/api/v1

Authentication

All requests require a Bearer token in the Authorization header:

Authorization: Bearer <your-kaman-token>

Endpoints

Method  Path                      Description
POST    /api/v1/chat/completions  Chat completions (streaming & non-streaming)
GET     /api/v1/models            List available agents/models

Models & Routing

The model field in your request controls which Kaman expert is used and how it runs.

Model Format  Mode   Behavior
42_0          LLM    Routes through the expert's underlying LLM with its system prompt
expert:42_0   LLM    Same as above (explicit)
agent:42_0    Agent  Full Kaman agent pipeline — LangGraph, tools, RAG, memory

LLM mode injects the expert's system prompt and forwards the request to the Model Proxy. It is fast, but performs no tool calling on the Kaman side.

Agent mode triggers the full Kaman agent pipeline: thinking, tool fetching, tool execution, RAG, memory, and suggestions. Supports multi-turn conversations, human-in-the-loop interrupts, and artifact generation.


List Models

Returns all experts available to the authenticated user, each in two variants (LLM mode and Agent mode).

Request

bash
curl http://kaman.ai/api/v1/models \
  -H "Authorization: Bearer $KAMAN_TOKEN"

Response

json
{
  "object": "list",
  "data": [
    {
      "id": "42_0",
      "object": "model",
      "created": 1708881234,
      "owned_by": "kaman",
      "name": "Sales Assistant",
      "underlying_model": "gpt-4",
      "description": "Handles sales inquiries and CRM operations"
    },
    {
      "id": "agent:42_0",
      "object": "model",
      "created": 1708881234,
      "owned_by": "kaman",
      "name": "Sales Assistant (Agent)",
      "underlying_model": "gpt-4",
      "description": "Full agent mode with tools and memory"
    }
  ]
}

Chat Completions

Request

POST /api/v1/chat/completions
json
{
  "model": "agent:42_0",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What were last quarter's sales?"}
  ],
  "stream": false,
  "temperature": 0.7,
  "max_tokens": 1024,
  "tools": [],
  "tool_choice": "auto"
}

Request Fields

Field              Type           Default   Description
model              string         required  Expert ID with optional prefix (see routing table)
messages           array          required  Chat history in OpenAI message format
stream             boolean        false     Enable Server-Sent Events streaming
temperature        number         0.7       Sampling temperature (0–2)
max_tokens         number         —         Maximum tokens in the response
top_p              number         —         Nucleus sampling
frequency_penalty  number         —         Frequency penalty (−2 to 2)
presence_penalty   number         —         Presence penalty (−2 to 2)
stop               string[]       —         Stop sequences
tools              array          —         OpenAI function definitions (LLM mode only)
tool_choice        string/object  "auto"    Tool choice strategy

Non-Streaming Response

json
{
  "id": "chatcmpl-abc123def456",
  "object": "chat.completion",
  "created": 1708881234,
  "model": "agent:42_0",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Last quarter's total sales were $2.4M, up 12% from Q2."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}

Streaming Response

When stream: true, the response is a stream of Server-Sent Events:

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1708881234,"model":"agent:42_0","choices":[{"index":0,"delta":{"content":"Last "},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1708881234,"model":"agent:42_0","choices":[{"index":0,"delta":{"content":"quarter's "},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1708881234,"model":"agent:42_0","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
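The SSE framing above can be consumed without an SDK. A minimal sketch that extracts content deltas from `data:` lines and stops at the `[DONE]` sentinel (the line source could be, e.g., `response.iter_lines()` from an HTTP client):

```python
import json

def iter_sse_content(lines):
    """Yield content deltas from an OpenAI-style SSE stream.

    `lines` is an iterable of decoded text lines. Lines that are not
    data events (keep-alives, blanks) are skipped; iteration ends at
    the [DONE] sentinel that terminates the stream.
    """
    for line in lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            return
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]
```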

Tool Calling

In Agent mode, the Kaman agent handles tool calling internally — it discovers, selects, and executes tools autonomously. Tool call events are streamed back in OpenAI format so you can observe them.

In LLM mode, you can pass OpenAI-format tools and tool_choice to have the underlying LLM generate tool calls, just like the standard OpenAI API.
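In LLM mode the request shape matches the standard OpenAI protocol. A sketch of passing a function schema; the getQuarterlySales tool here is illustrative (matching the response example that follows), not a built-in Kaman tool:

```python
# Hypothetical function schema -- name and parameters are examples only.
tools = [
    {
        "type": "function",
        "function": {
            "name": "getQuarterlySales",
            "description": "Fetch total sales for a given quarter.",
            "parameters": {
                "type": "object",
                "properties": {
                    "quarter": {"type": "string", "enum": ["Q1", "Q2", "Q3", "Q4"]},
                    "year": {"type": "integer"},
                },
                "required": ["quarter", "year"],
            },
        },
    }
]

# LLM mode: use the bare expert ID (no "agent:" prefix) so the underlying
# LLM emits tool calls for your client to execute.
# response = client.chat.completions.create(
#     model="42_0",
#     messages=[{"role": "user", "content": "What were Q3 2024 sales?"}],
#     tools=tools,
#     tool_choice="auto",
# )
```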

Tool Call in Response

json
{
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "getQuarterlySales",
              "arguments": "{\"quarter\": \"Q3\", \"year\": 2024}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
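After your client executes the call in LLM mode, results go back using the standard OpenAI role "tool" message shape. A sketch of building the follow-up messages from the response above:

```python
def tool_result_messages(assistant_message, results):
    """Build follow-up messages after executing the model's tool calls.

    assistant_message: the assistant message dict containing "tool_calls";
    results: dict mapping tool_call id to a stringified result.
    Returns the messages to append before the next
    chat.completions.create call.
    """
    follow_up = [assistant_message]
    for call in assistant_message["tool_calls"]:
        follow_up.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],
                "content": results[call["id"]],
            }
        )
    return follow_up
```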

Kaman Extensions

Responses may include additional fields that standard OpenAI SDKs safely ignore. Kaman-aware clients can use these for richer UX.

Field              Type    Description
kaman_artifacts    array   Files or content artifacts generated by the agent
kaman_interrupt    array   Human-in-the-loop interrupts requiring user input
kaman_suggestions  array   Suggested follow-up prompts
kaman_thought      string  Agent's internal reasoning (extended thinking)
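Typed SDK response objects may not surface these nonstandard keys, so Kaman-aware clients will usually want to read the raw response JSON. A sketch of pulling the extension fields out of a parsed response body:

```python
KAMAN_EXTENSION_KEYS = (
    "kaman_artifacts",
    "kaman_interrupt",
    "kaman_suggestions",
    "kaman_thought",
)

def extract_kaman_extensions(response_json):
    """Return any Kaman extension fields present in a raw
    chat-completion response body (a parsed JSON dict)."""
    return {
        key: response_json[key]
        for key in KAMAN_EXTENSION_KEYS
        if key in response_json
    }
```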

Artifact Example

json
{
  "kaman_artifacts": [
    {
      "id": "artifact_1a2b3c",
      "name": "quarterly_report.xlsx",
      "type": "file",
      "url": "/api/artifacts/artifact_1a2b3c"
    }
  ]
}

Interrupt Example (Human-in-the-Loop)

When the agent needs user confirmation or input:

json
{
  "kaman_interrupt": [
    {
      "type": "confirmation",
      "message": "Send the report to finance@company.com?",
      "options": [
        {"value": "yes", "label": "Yes, send it"},
        {"value": "no", "label": "Cancel"}
      ],
      "toolName": "sendEmail",
      "toolCallId": "call_xyz789",
      "resumable": true
    }
  ]
}
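This page does not specify the wire format for resuming an interrupted run; one plausible client pattern is to present the options to the user and send the selected value back as the next user turn. A sketch under that assumption (verify the resume mechanism against your Kaman deployment):

```python
def answer_interrupt(interrupt, choice_value):
    """Build the next user message answering a confirmation interrupt.

    NOTE: replying with the chosen option's value as a user turn is an
    assumed pattern, not documented here -- confirm before relying on it.
    """
    valid = {option["value"] for option in interrupt.get("options", [])}
    if choice_value not in valid:
        raise ValueError(f"choice must be one of {sorted(valid)}")
    return {"role": "user", "content": choice_value}
```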

Error Handling

Errors follow the OpenAI error format:

json
{
  "error": {
    "message": "Invalid authentication token",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}

HTTP Status  Error Type             Description
401          authentication_error   Invalid or missing token
400          invalid_request_error  Malformed request body
404          not_found_error        Expert/model not found
500          api_error              Internal server error
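Official OpenAI SDKs map these statuses to typed exceptions (e.g. openai.AuthenticationError for 401). Raw-HTTP clients can check for the error envelope before reading choices; a minimal sketch:

```python
class KamanAPIError(Exception):
    """Raised when the API returns an OpenAI-format error envelope."""

    def __init__(self, status_code, error):
        self.status_code = status_code
        self.type = error.get("type")
        self.code = error.get("code")
        super().__init__(f"{status_code} {self.type}: {error.get('message')}")

def check_response(status_code, body):
    """Raise KamanAPIError if `body` (parsed JSON) is an error envelope;
    otherwise return the body unchanged."""
    if "error" in body:
        raise KamanAPIError(status_code, body["error"])
    return body
```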

Code Examples

Python (OpenAI SDK)

python
from openai import OpenAI

client = OpenAI(
    base_url="http://kaman.ai/api/v1",
    api_key="your-kaman-token",
)

# Non-streaming
response = client.chat.completions.create(
    model="agent:42_0",
    messages=[
        {"role": "user", "content": "Summarize last month's revenue"}
    ],
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="agent:42_0",
    messages=[
        {"role": "user", "content": "Summarize last month's revenue"}
    ],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)

TypeScript (OpenAI SDK)

typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://kaman.ai/api/v1",
  apiKey: "your-kaman-token",
});

// Non-streaming
const response = await client.chat.completions.create({
  model: "agent:42_0",
  messages: [{ role: "user", content: "What are today's open tickets?" }],
});
console.log(response.choices[0].message.content);

// Streaming
const stream = await client.chat.completions.create({
  model: "agent:42_0",
  messages: [{ role: "user", content: "What are today's open tickets?" }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}

cURL

bash
# Non-streaming
curl -X POST http://kaman.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $KAMAN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "agent:42_0",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# Streaming
curl -N -X POST http://kaman.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $KAMAN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "agent:42_0",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'

LLM Mode (Direct Expert)

python
# Use the expert's LLM directly (no agent pipeline)
response = client.chat.completions.create(
    model="42_0",  # or "expert:42_0"
    messages=[
        {"role": "user", "content": "Translate this to French: Hello world"}
    ],
    temperature=0.3,
)

Multi-Turn Conversations

The API maintains session state in Agent mode. Pass the full conversation history:

python
messages = [
    {"role": "user", "content": "Find all overdue invoices"},
    {"role": "assistant", "content": "I found 12 overdue invoices totaling $45,000."},
    {"role": "user", "content": "Send reminders to the top 5 by amount"},
]

response = client.chat.completions.create(
    model="agent:42_0",
    messages=messages,
)

Timeouts & Limits

Setting                       Value
Request timeout               5 minutes
Max duration (edge function)  300 seconds
CORS                          Open (*)

Agent mode requests may take longer due to tool execution. Use streaming for real-time progress.
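For long non-streaming agent runs, it can help to set an explicit client-side timeout matching the server's 5-minute limit. A configuration sketch with the OpenAI Python SDK:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://kaman.ai/api/v1",
    api_key="your-kaman-token",
    timeout=300.0,  # seconds; align with the server's request timeout
)
```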

Next Steps

  • A2A Protocol — Agent-to-Agent communication protocol
  • Authentication — API authentication guide
  • Tools API — Search and execute individual tools
