Agent-ready reference

Call Prometheus like OpenAI — stable aliases

Copy the base URL, send a prom_sk_ bearer key, choose a model alias, and use the standard OpenAI request shapes. This page is built for both engineers and agents.

minimum viable request

curl -sS https://api.getprometheus.org/v1/chat/completions \
  -H "Authorization: Bearer $PROMETHEUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "prometheus-core",
    "messages": [
      { "role": "user", "content": "Give me a deployment checklist." }
    ],
    "temperature": 0.2,
    "max_tokens": 700
  }'

Base URL

https://api.getprometheus.org/v1

Auth

Bearer prom_sk_...

Default chat model

prometheus-core

Compatibility

OpenAI SDK + custom baseURL

Paste this into an agent

Agent spec

The compact contract an autonomous agent needs before it calls the API: routing rules, auth, endpoint names, and model selection.

agent-spec.yaml

# Prometheus Agent Spec
baseURL: https://api.getprometheus.org/v1
auth: "Authorization: Bearer prom_sk_..."

endpoints:
  - GET /v1/models
  - POST /v1/chat/completions
  - POST /v1/embeddings
  - POST /v1/audio/transcriptions

models:
  prometheus-spark: chat; fast replies, routing, classification, short tasks; $0.15/1M in, $0.60/1M out
  prometheus-lite: chat; balanced cost, quality, and throughput; $0.10/1M in, $0.40/1M out
  prometheus-core: chat; complex reasoning, planning, code, multi-step agents; $1.10/1M in, $4.40/1M out
  prometheus-vision: chat; image understanding, screenshots, diagrams, homework photos; $0.40/1M in, $1.60/1M out
  prometheus-atlas: embeddings; RAG, memory, search, semantic matching; $0.02/1M in
  prometheus-echo: audio; speech-to-text transcription; $0.006/min

pricing:
  - USD, billed per actual usage; no minimum.
  - Chat and vision charge per input + output token.
  - Vision images count as input tokens; no separate per-image fee.
  - Embeddings charge per input token; audio charges per minute of source.

rules:
  - Use Prometheus aliases only; never send provider or upstream model ids.
  - OpenAI SDK clients must set baseURL to the /v1 URL above.
  - Send Authorization as a Bearer token with a prom_sk_ key.
  - "Chat supports stream: true and returns OpenAI-style SSE chunks. Prefer stream: true for the lowest time-to-first-token."
  - Vision uses standard chat messages with image_url content parts.
  - Extra chat fields such as tools, response_format, top_p, and stream_options are forwarded.
  - "Gateway errors use { error: { message, type, param, code } }."
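The base URL and auth rules in the spec can be sketched as one small helper. This is a minimal sketch, not part of any Prometheus SDK: `buildRequest` and `PreparedRequest` are hypothetical names, and the prefix check only mirrors the prom_sk_ convention stated above.

```typescript
// Hypothetical helper; names are illustrative, not a Prometheus API.
const BASE_URL = "https://api.getprometheus.org/v1"

type PreparedRequest = {
  url: string
  headers: Record<string, string>
}

// Builds the URL and headers for a JSON endpoint under the /v1 base URL.
function buildRequest(path: string, apiKey: string): PreparedRequest {
  if (!/^prom_sk_/.test(apiKey)) {
    throw new Error("Expected a key that starts with prom_sk_")
  }
  // Accept "models", "/models", or "/v1/models"; the base URL already ends in /v1.
  const suffix = path.replace(/^\/?(v1\/)?/, "")
  return {
    url: `${BASE_URL}/${suffix}`,
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
  }
}
```

Stripping a leading /v1 keeps the endpoint names from the spec usable as-is without doubling the version segment.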

Decision table

Model matrix

Prometheus exposes public aliases only. Clients should never send provider names or upstream model ids.

Alias	Endpoint	Capability	Use when
`prometheus-spark`	`/v1/chat/completions`	Fast chat	Routing, classification, short summaries, simple actions.
`prometheus-lite`	`/v1/chat/completions`	Balanced chat	High-volume assistants, production workflows, everyday agents.
`prometheus-core`	`/v1/chat/completions`	Deep reasoning	Planning, code, analysis, hard decisions, multi-step agents.
`prometheus-vision`	`/v1/chat/completions`	Vision chat	Homework photos, screenshots, diagrams, forms, visual QA.
`prometheus-atlas`	`/v1/embeddings`	Embeddings	RAG, memory, search, recommendations, semantic matching.
`prometheus-echo`	`/v1/audio/transcriptions`	Audio transcription	Meetings, calls, voice notes, subtitles, speech-to-text.
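The matrix can be encoded as a lookup so an agent picks an alias and endpoint from a task label. A minimal sketch: the task labels and the `routeFor` name are hypothetical, while the aliases and endpoints are copied from the table above.

```typescript
// Task labels are hypothetical; aliases and endpoints come from the model matrix.
type Task = "routing" | "high-volume" | "reasoning" | "image" | "embedding" | "transcription"

type Route = { model: string; endpoint: string }

const ROUTES: Record<Task, Route> = {
  routing: { model: "prometheus-spark", endpoint: "/v1/chat/completions" },
  "high-volume": { model: "prometheus-lite", endpoint: "/v1/chat/completions" },
  reasoning: { model: "prometheus-core", endpoint: "/v1/chat/completions" },
  image: { model: "prometheus-vision", endpoint: "/v1/chat/completions" },
  embedding: { model: "prometheus-atlas", endpoint: "/v1/embeddings" },
  transcription: { model: "prometheus-echo", endpoint: "/v1/audio/transcriptions" },
}

// Returns the alias and endpoint to call for a given task label.
function routeFor(task: Task): Route {
  return ROUTES[task]
}
```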

Usage-based pricing

Pricing

Billed per actual usage. Chat and vision charge per input and output token, embeddings per input token, and audio per minute. Vision images count as input tokens, so there is no separate per-image fee.

Alias	Capability	Input (USD per 1M tokens; audio per minute)	Output (USD per 1M tokens)
`prometheus-spark`	Fast chat	$0.15 / 1M	$0.60 / 1M
`prometheus-lite`	Balanced chat	$0.10 / 1M	$0.40 / 1M
`prometheus-core`	Deep reasoning	$1.10 / 1M	$4.40 / 1M
`prometheus-vision`	Vision chat	$0.40 / 1M	$1.60 / 1M
`prometheus-atlas`	Embeddings	$0.02 / 1M	—
`prometheus-echo`	Audio transcription	$0.006 / minute	—
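To budget a chat call, the token prices above can be applied directly. A minimal sketch: `estimateChatCostUsd` is a hypothetical helper, the prices are copied from the table, and the token counts are whatever the caller measured or expects.

```typescript
type ChatAlias = "prometheus-spark" | "prometheus-lite" | "prometheus-core" | "prometheus-vision"

// USD per 1M tokens, copied from the pricing table above.
const USD_PER_MILLION_TOKENS: Record<ChatAlias, { input: number; output: number }> = {
  "prometheus-spark": { input: 0.15, output: 0.6 },
  "prometheus-lite": { input: 0.1, output: 0.4 },
  "prometheus-core": { input: 1.1, output: 4.4 },
  "prometheus-vision": { input: 0.4, output: 1.6 },
}

// Estimates the cost of one chat call from its input and output token counts.
function estimateChatCostUsd(model: ChatAlias, inputTokens: number, outputTokens: number): number {
  const price = USD_PER_MILLION_TOKENS[model]
  return (inputTokens * price.input + outputTokens * price.output) / 1000000
}
```

For example, 2,000 input tokens and 700 output tokens on prometheus-core come to about $0.0053: 2,000 × $1.10 / 1M plus 700 × $4.40 / 1M.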

OpenAI-compatible client

OpenAI SDK

Set baseURL to the Prometheus /v1 endpoint and pass a Prometheus API key as the SDK's apiKey.

Client setup

import OpenAI from "openai"

export const prometheus = new OpenAI({
  apiKey: process.env.PROMETHEUS_API_KEY,
  baseURL: "https://api.getprometheus.org/v1",
})

Chat completion

const completion = await prometheus.chat.completions.create({
  model: "prometheus-core",
  messages: [
    { role: "system", content: "You are a precise product engineering agent." },
    { role: "user", content: "Summarize this incident and list next actions." },
  ],
  temperature: 0.2,
  max_tokens: 800,
})

console.log(completion.choices[0]?.message?.content)

Streaming chat

const stream = await prometheus.chat.completions.create({
  model: "prometheus-lite",
  messages: [{ role: "user", content: "Draft a release note." }],
  stream: true,
})

for await (const event of stream) {
  process.stdout.write(event.choices[0]?.delta?.content ?? "")
}

Vision chat

const completion = await prometheus.chat.completions.create({
  model: "prometheus-vision",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Read this exercise and explain the next step." },
        {
          type: "image_url",
          image_url: { url: "data:image/png;base64,..." },
        },
      ],
    },
  ],
})

console.log(completion.choices[0]?.message?.content)
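The data URL in the vision example is elided. One way to assemble the message from already-encoded image bytes is sketched below. `imageMessage` and its types are hypothetical, and the image/png default only mirrors the example; which image types the gateway accepts is not stated here.

```typescript
type TextPart = { type: "text"; text: string }
type ImagePart = { type: "image_url"; image_url: { url: string } }
type UserMessage = { role: "user"; content: Array<TextPart | ImagePart> }

// Wraps a prompt and a base64-encoded image into one user message with an image_url part.
function imageMessage(prompt: string, base64: string, mimeType = "image/png"): UserMessage {
  return {
    role: "user",
    content: [
      { type: "text", text: prompt },
      { type: "image_url", image_url: { url: `data:${mimeType};base64,${base64}` } },
    ],
  }
}
```

The returned object can be passed as an entry of `messages` in a prometheus-vision request.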

Embeddings

const result = await prometheus.embeddings.create({
  model: "prometheus-atlas",
  input: [
    "Prometheus is an OpenAI-compatible model gateway.",
    "Agents should use stable prometheus-* aliases.",
  ],
})

console.log(result.data[0]?.embedding.length)
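For search or semantic matching, the returned vectors are usually compared with a similarity score. The sketch below uses cosine similarity, which is a common choice and an assumption here; this page does not prescribe a metric. `cosineSimilarity` is a hypothetical helper.

```typescript
// Cosine similarity between two embedding vectors of equal length.
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("Vectors must be non-empty and of equal length")
  }
  let dot = 0
  let normA = 0
  let normB = 0
  a.forEach((x, i) => {
    const y = b[i] ?? 0
    dot += x * y
    normA += x * x
    normB += y * y
  })
  if (normA === 0 || normB === 0) return 0
  return dot / Math.sqrt(normA * normB)
}
```

With the example above, `cosineSimilarity(result.data[0].embedding, result.data[1].embedding)` would score how close the two input sentences are.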

Audio transcription

import fs from "node:fs"

const transcript = await prometheus.audio.transcriptions.create({
  model: "prometheus-echo",
  file: fs.createReadStream("meeting.mp3"),
  response_format: "json",
})

console.log(transcript.text)

HTTP contract

Endpoints

Every /v1/* endpoint requires the bearer key. Usage is logged against the owning key for dashboard analytics.

POST /v1/chat/completions

Supports prometheus-spark, prometheus-lite, prometheus-core, and prometheus-vision. Required fields are model and at least one messages item. Optional fields include stream, temperature, max_tokens, top_p, tools, response_format, image content parts, and compatible extra OpenAI fields.
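An agent can check the two required fields before sending a request. This is a minimal client-side sketch; `chatRequestProblems` is a hypothetical name, and the gateway's own validation may be stricter or worded differently.

```typescript
const CHAT_ALIASES = ["prometheus-spark", "prometheus-lite", "prometheus-core", "prometheus-vision"]

type ChatBody = {
  model: string
  messages: Array<{ role: string; content: unknown }>
  [extra: string]: unknown // optional fields such as stream, temperature, tools
}

// Returns a list of problems; an empty list means the required fields look valid.
function chatRequestProblems(body: ChatBody): string[] {
  const problems: string[] = []
  if (CHAT_ALIASES.indexOf(body.model) === -1) {
    problems.push(`model must be one of: ${CHAT_ALIASES.join(", ")}`)
  }
  if (!Array.isArray(body.messages) || body.messages.length === 0) {
    problems.push("messages must contain at least one item")
  }
  return problems
}
```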

POST /v1/embeddings

Uses prometheus-atlas. Send input as a string, string array, token array, or token-array batch. Optional fields include encoding_format and dimensions.

POST /v1/audio/transcriptions

Uses prometheus-echo with multipart form data. Send file plus model. Optional fields are language, temperature, and response_format, which accepts json, text, verbose_json, srt, or vtt. Audio files are limited to 25 MB.
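The size limit and the response_format values can be checked before uploading. A minimal sketch: `transcriptionProblems` is a hypothetical helper, and it assumes 25 MB means 25 × 1024 × 1024 bytes; lower the constant if the gateway counts decimal megabytes.

```typescript
// Assumption: "25 MB" is read as binary megabytes.
const MAX_AUDIO_BYTES = 25 * 1024 * 1024
const RESPONSE_FORMATS = ["json", "text", "verbose_json", "srt", "vtt"]

// Returns a list of problems; an empty list means the upload looks acceptable.
function transcriptionProblems(fileSizeBytes: number, responseFormat = "json"): string[] {
  const problems: string[] = []
  if (fileSizeBytes > MAX_AUDIO_BYTES) {
    problems.push("audio file exceeds the 25 MB limit")
  }
  if (RESPONSE_FORMATS.indexOf(responseFormat) === -1) {
    problems.push(`response_format must be one of: ${RESPONSE_FORMATS.join(", ")}`)
  }
  return problems
}
```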

GET /v1/models

Returns the model aliases visible to the key in the standard OpenAI list shape: { object: "list", data: [...] }. Each model id is a Prometheus alias.
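Because the list is scoped to the key, an agent can confirm an alias is available before using it. A minimal sketch over an already-parsed response body: `visibleAliases` is a hypothetical helper, and only the object, data, and id fields described above are assumed.

```typescript
type ModelList = { object: string; data: Array<{ id: string }> }

// Extracts the alias ids from a parsed /v1/models response body.
function visibleAliases(body: ModelList): string[] {
  if (body.object !== "list" || !Array.isArray(body.data)) {
    throw new Error("Unexpected /v1/models response shape")
  }
  return body.data.map((model) => model.id)
}

// True when the key can see the given alias.
function canUse(body: ModelList, alias: string): boolean {
  return visibleAliases(body).indexOf(alias) !== -1
}
```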

No SDK required

Raw HTTP examples

Use these when an agent cannot load the OpenAI SDK or needs a deterministic shell command.

List models

curl -sS https://api.getprometheus.org/v1/models \
  -H "Authorization: Bearer $PROMETHEUS_API_KEY"

Create chat completion

curl -sS https://api.getprometheus.org/v1/chat/completions \
  -H "Authorization: Bearer $PROMETHEUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "prometheus-core",
    "messages": [
      { "role": "user", "content": "Give me a deployment checklist." }
    ],
    "temperature": 0.2,
    "max_tokens": 700
  }'
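Adding "stream": true to the request body returns OpenAI-style SSE chunks instead of one JSON object. An agent without the SDK can recover the text from a captured stream roughly as follows. This is a sketch under assumptions: `collectSseText` is a hypothetical helper, it expects the complete stream as one string rather than partial network reads, and it assumes the gateway mirrors OpenAI's framing with one `data:` line per chunk and a final `data: [DONE]` sentinel.

```typescript
// Concatenates delta.content from every chunk in a complete SSE body.
function collectSseText(raw: string): string {
  let text = ""
  for (const line of raw.split(/\r?\n/)) {
    const match = /^data:\s?(.*)$/.exec(line)
    if (!match) continue // blank separators, comments, other SSE fields
    const payload = (match[1] ?? "").trim()
    if (payload === "") continue
    if (payload === "[DONE]") break // assumed end-of-stream sentinel
    const chunk = JSON.parse(payload)
    text += chunk.choices?.[0]?.delta?.content ?? ""
  }
  return text
}
```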

Analyze image

curl -sS https://api.getprometheus.org/v1/chat/completions \
  -H "Authorization: Bearer $PROMETHEUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "prometheus-vision",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "What is shown in this image?" },
          {
            "type": "image_url",
            "image_url": { "url": "data:image/png;base64,..." }
          }
        ]
      }
    ]
  }'

Create embeddings

curl -sS https://api.getprometheus.org/v1/embeddings \
  -H "Authorization: Bearer $PROMETHEUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "prometheus-atlas",
    "input": ["billing webhook retry policy", "invoice payment failed"]
  }'

Transcribe audio

curl -sS https://api.getprometheus.org/v1/audio/transcriptions \
  -H "Authorization: Bearer $PROMETHEUS_API_KEY" \
  -F model=prometheus-echo \
  -F response_format=json \
  -F file=@meeting.mp3

Failure shape

Errors & streaming

Validation and auth failures use an OpenAI-compatible envelope. Streaming chat emits standard server-sent events, and the model field in every response is rewritten back to the Prometheus alias.

error envelope

{
  "error": {
    "message": "Missing API key. Provide it as Authorization: Bearer <key>.",
    "type": "invalid_request_error",
    "param": null,
    "code": "invalid_api_key"
  }
}
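An agent can detect this envelope before treating a response body as a result. A minimal sketch: `isGatewayError` and `describeError` are hypothetical helpers, and typing code as nullable is an assumption, since the example only shows param as null.

```typescript
type GatewayError = {
  error: { message: string; type: string; param: string | null; code: string | null }
}

// Type guard for the { error: { message, type, param, code } } envelope.
function isGatewayError(body: unknown): body is GatewayError {
  if (typeof body !== "object" || body === null) return false
  const err = (body as { error?: unknown }).error
  if (typeof err !== "object" || err === null) return false
  const fields = err as Record<string, unknown>
  return typeof fields.message === "string" && typeof fields.type === "string"
}

// Formats an error envelope as a single log line.
function describeError(body: unknown): string {
  if (!isGatewayError(body)) return "Unknown error shape"
  const { type, code, message } = body.error
  return `${type}${code ? ` (${code})` : ""}: ${message}`
}
```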

Agent rules

Use /v1 in the SDK base URL.
Use prom_sk_ keys only in the bearer header.
Prefer prometheus-core for planning and tool use.
Prefer prometheus-vision when messages include images.
Prefer prometheus-lite for repeatable high-volume work.
Prefer prometheus-spark for fast routing and short tasks.

Ready to test

Create a key, then verify in the playground.
