Prometheus family

AI models lit for real products.

A model line for assistants, agents, image understanding, transcription, semantic search, and reasoning workflows with stable names, predictable latency, and a consistent developer experience.

6 models across text, vision, voice, and embeddings

Model line

Choose the right flame for every task.

Prometheus separates speed, depth, vision, audio, and semantic representation, so every product can use the right model without exposing provider names or internal implementation details.

Fast response

Prometheus Spark

A compact model for chat, simple actions, classification, and flows where latency matters.

Embedded assistants
Routing and short summaries
Lightweight automation

Efficiency

Prometheus Lite

A balance of cost and quality for high-volume product responses and agents that run all day.

Operational support
Content generation
Repeatable workflows

Reasoning

Prometheus Core

The main model for complex tasks, deep analysis, planning, and multi-step agents.

Extended reasoning
Code and analysis
Context-aware decisions

Vision

Prometheus Vision

Image-aware chat for screenshots, forms, diagrams, and student homework photos.

Visual tutoring
Screenshot analysis
Document inspection

Audio

Prometheus Echo

Speech transcription and understanding for turning meetings, messages, and calls into actionable data.

Speech to text
Notes and minutes
Call analysis

Semantics

Prometheus Atlas

Vector representations for search, recommendations, context retrieval, and agent memory.

RAG and memory
Semantic matching
Intelligent deduplication

Designed for production

A single interface for routing intelligence.

Prometheus lets teams think in capabilities: speed, depth, vision, voice, or embeddings. Behind the interface, the platform can optimize the engine for each task without changing the public product contract.

Routing by task and budget

Stable names for internal teams

Text, vision, audio, and vector layers

assistant.reply	Prometheus Lite
agent.plan	Prometheus Core
image.inspect	Prometheus Vision
audio.transcribe	Prometheus Echo
context.embed	Prometheus Atlas
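
As a rough illustration, the task-to-model pairs listed above could be held in a simple lookup. The task names and model names come from this page; the `ROUTES` dictionary and the `resolve_model` function are hypothetical names used only for this sketch and do not describe a published Prometheus API.

```python
# Hypothetical routing table. Task and model names are taken from the page;
# the lookup itself is an illustrative assumption, not a Prometheus API.
ROUTES = {
    "assistant.reply": "Prometheus Lite",
    "agent.plan": "Prometheus Core",
    "image.inspect": "Prometheus Vision",
    "audio.transcribe": "Prometheus Echo",
    "context.embed": "Prometheus Atlas",
}


def resolve_model(task: str) -> str:
    """Return the public model name configured for a task.

    Raises ValueError when the task has no route, so that callers
    never fall back to a model silently.
    """
    try:
        return ROUTES[task]
    except KeyError:
        raise ValueError(f"no route configured for task {task!r}") from None


if __name__ == "__main__":
    # Print each configured route as "task -> model".
    for task in ROUTES:
        print(f"{task} -> {resolve_model(task)}")
```

Because product code would depend only on the task name, the model behind a task could be swapped by editing one entry in the table, which is the kind of stable contract the section describes.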

Listen

Turn voice into useful text for support, sales, and operations.

Retrieve

Find knowledge with embeddings and semantic context.

Reason

Plan, compare, and execute tasks with deeper models.

Respond

Deliver fast interactions with models optimized for product use.

Pricing

Pay for tokens, not for complexity.

Transparent usage-based pricing per model. Text and vision models bill by input and output tokens, embeddings by input tokens, and audio by the minute.

Model	Tier	Capability	Input price	Output price
Prometheus Spark	Fast	Chat, routing, classification	$0.15 / 1M tokens	$0.60 / 1M tokens
Prometheus Lite	Efficient	High-volume product responses	$0.10 / 1M tokens	$0.40 / 1M tokens
Prometheus Core	Reasoning	Planning, code, deep analysis	$1.10 / 1M tokens	$4.40 / 1M tokens
Prometheus Vision	Vision	Image-aware chat and visual QA	$0.40 / 1M tokens	$1.60 / 1M tokens
Prometheus Atlas	Embeddings	RAG, memory, semantic search	$0.02 / 1M tokens	—
Prometheus Echo	Audio	Speech-to-text transcription	$0.006 / minute	—

Prices in USD. Billed per actual usage with no minimums or seat fees.
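
To make the billing model concrete, here is a minimal cost-estimation sketch built from the rates in the table above. The prices are copied from the table; the function names `token_cost` and `audio_cost` are hypothetical, and the sketch assumes costs scale linearly with usage and applies no rounding, since the page does not state a rounding policy.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the pricing table.
TOKEN_PRICES = {
    "Prometheus Spark": (Decimal("0.15"), Decimal("0.60")),
    "Prometheus Lite": (Decimal("0.10"), Decimal("0.40")),
    "Prometheus Core": (Decimal("1.10"), Decimal("4.40")),
    "Prometheus Vision": (Decimal("0.40"), Decimal("1.60")),
    # Embeddings bill on input tokens only, so the output rate is zero.
    "Prometheus Atlas": (Decimal("0.02"), Decimal("0")),
}

# Prometheus Echo bills per minute of audio instead of per token.
ECHO_PRICE_PER_MINUTE = Decimal("0.006")
ONE_MILLION = Decimal(1_000_000)


def token_cost(model: str, input_tokens: int, output_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of a token-billed model (assumes linear billing)."""
    input_price, output_price = TOKEN_PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / ONE_MILLION


def audio_cost(minutes: float) -> Decimal:
    """Estimate the USD cost of transcribing the given minutes of audio."""
    return ECHO_PRICE_PER_MINUTE * Decimal(str(minutes))


if __name__ == "__main__":
    print(f"Core:  ${token_cost('Prometheus Core', 2_000_000, 500_000):.2f}")
    print(f"Atlas: ${token_cost('Prometheus Atlas', 10_000_000):.2f}")
    print(f"Echo:  ${audio_cost(90):.2f}")
```

For example, 2M input tokens and 500K output tokens on Prometheus Core work out to $2.20 + $2.20 = $4.40, and 90 minutes of audio on Prometheus Echo to $0.54.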

Private access

Bring your product to the fire of Prometheus.

Get started