Prometheus Spark
A compact model for chat, simple actions, classification, and flows where latency matters.
- Embedded assistants
- Routing and short summaries
- Lightweight automation
Prometheus family
A model line for assistants, agents, image understanding, transcription, semantic search, and reasoning workflows. It offers stable names, predictable latency, and a consistent developer experience.
Model line
Prometheus separates speed, depth, vision, audio, and semantic representation, so every product can use the right model without exposing provider names or internal implementation details.
- Prometheus Spark: A compact model for chat, simple actions, classification, and flows where latency matters.
- Prometheus Lite: A balance of cost and quality for high-volume product responses and agents that run all day.
- Prometheus Core: The main model for complex tasks, deep analysis, planning, and multi-step agents.
- Prometheus Vision: Image-aware chat for screenshots, forms, diagrams, and student homework photos.
- Prometheus Echo: Speech transcription and understanding for turning meetings, messages, and calls into actionable data.
- Prometheus Atlas: Vector representations for search, recommendations, context retrieval, and agent memory.
Designed for production
Prometheus lets teams think in capabilities: speed, depth, voice, or embeddings. Behind the interface, the platform can optimize the engine for each task without changing the public product contract.
Turn voice into useful text for support, sales, and operations.
Find knowledge with embeddings and semantic context.
Plan, compare, and execute tasks with deeper models.
Deliver fast interactions with models optimized for product use.
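As a rough illustration of thinking in capabilities, the sketch below maps the capability words used on this page to model names from the pricing table, and keeps the serving engine behind a separate lookup. The function names `resolve` and `swap_engine`, the dictionary layout, and every engine identifier are hypothetical and do not describe the platform's actual interface.

```python
# Capability keywords come from this page; the model names follow the pricing table.
CAPABILITY_TO_MODEL = {
    "speed": "Prometheus Spark",
    "depth": "Prometheus Core",
    "voice": "Prometheus Echo",
    "embeddings": "Prometheus Atlas",
}

# Hypothetical internal registry: public model name -> engine currently serving it.
# These engine ids are invented placeholders for illustration only.
_ENGINES = {
    "Prometheus Spark": "engine-spark-v1",
    "Prometheus Core": "engine-core-v1",
    "Prometheus Echo": "engine-echo-v1",
    "Prometheus Atlas": "engine-atlas-v1",
}


def resolve(capability: str) -> tuple[str, str]:
    """Return (public model name, internal engine id) for a capability."""
    model = CAPABILITY_TO_MODEL.get(capability)
    if model is None:
        raise ValueError(f"unknown capability: {capability!r}")
    return model, _ENGINES[model]


def swap_engine(model: str, engine_id: str) -> None:
    """Change the engine behind a public name; callers keep using the same name."""
    if model not in _ENGINES:
        raise KeyError(model)
    _ENGINES[model] = engine_id
```

In this sketch, calling `swap_engine("Prometheus Echo", "engine-echo-v2")` changes what `resolve("voice")` returns as the engine id, while the public name `Prometheus Echo` stays the same.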
Pricing
Transparent usage-based pricing per model. Text models bill by input and output tokens, embeddings by input tokens, and audio by the minute.
| Model | Capability | Input | Output |
|---|---|---|---|
| Prometheus Spark (Fast) | Chat, routing, classification | $0.15 / 1M tokens | $0.60 / 1M tokens |
| Prometheus Lite (Efficient) | High-volume product responses | $0.10 / 1M tokens | $0.40 / 1M tokens |
| Prometheus Core (Reasoning) | Planning, code, deep analysis | $1.10 / 1M tokens | $4.40 / 1M tokens |
| Prometheus Vision (Vision) | Image-aware chat and visual QA | $0.40 / 1M tokens | $1.60 / 1M tokens |
| Prometheus Atlas (Embeddings) | RAG, memory, semantic search | $0.02 / 1M tokens | n/a |
| Prometheus Echo (Audio) | Speech-to-text transcription | $0.006 / minute | n/a |
Prices in USD. Billed per actual usage with no minimums or seat fees.
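To make the billing rules concrete, here is a minimal cost-estimation sketch using the prices listed in the table. The helper names `token_cost` and `audio_cost` are hypothetical. The sketch also assumes that fractional audio minutes are prorated, which this page does not specify.

```python
from decimal import Decimal

# USD prices copied from the pricing table: (input, output) per 1M tokens.
# Embeddings bill input tokens only, so the output price is zero.
TOKEN_PRICES = {
    "Prometheus Spark": (Decimal("0.15"), Decimal("0.60")),
    "Prometheus Lite": (Decimal("0.10"), Decimal("0.40")),
    "Prometheus Core": (Decimal("1.10"), Decimal("4.40")),
    "Prometheus Vision": (Decimal("0.40"), Decimal("1.60")),
    "Prometheus Atlas": (Decimal("0.02"), Decimal("0")),
}

# Prometheus Echo is billed by audio duration rather than tokens.
AUDIO_PRICE_PER_MINUTE = Decimal("0.006")

ONE_MILLION = Decimal(1_000_000)


def token_cost(model: str, input_tokens: int, output_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of a token-billed model."""
    input_price, output_price = TOKEN_PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / ONE_MILLION


def audio_cost(minutes: float) -> Decimal:
    """Estimate the USD cost of transcription (assumes prorated minutes)."""
    return AUDIO_PRICE_PER_MINUTE * Decimal(str(minutes))
```

For example, 2M input tokens and 500K output tokens on Prometheus Core come to 2 × $1.10 + 0.5 × $4.40 = $4.40, and 90 minutes of audio come to 90 × $0.006 = $0.54.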
Private access