Top-tier reasoning and analysis. Ideal for complex tasks, research, and advanced code generation.
Input: $1.14/M tokens (was $5.00)
Output: $5.71/M tokens (was $25.00)
Context: 1M
Vision: Yes
Claude Opus 4.7
77% Off
Anthropic
Model ID: claude-opus-4-7
Top-tier reasoning and analysis. Previous generation.
Input: $1.14/M tokens (was $5.00)
Output: $5.71/M tokens (was $25.00)
Context: 1M
Vision: Yes
Claude Opus 4.6
77% Off
Anthropic
Model ID: claude-opus-4-6
Top-tier reasoning and analysis. Earlier generation.
Input: $1.14/M tokens (was $5.00)
Output: $5.71/M tokens (was $25.00)
Context: 1M
Vision: Yes
GPT-5.5
91% Off
OpenAI
Model ID: gpt-5.5
Flagship model with improved reasoning and multi-modal capabilities.
Input: $0.46/M tokens (was $5.00)
Output: $2.29/M tokens (was $30.00)
Context: 256K
Vision: Yes
Claude Sonnet 4.6
92% Off
Anthropic
Model ID: claude-sonnet-4-6
Best balance of performance and cost. Great for coding, writing, and general production tasks.
Input: $0.23/M tokens (was $3.00)
Output: $1.14/M tokens (was $15.00)
Context: 1M
Vision: Yes
GPT-5.4
91% Off
OpenAI
Model ID: gpt-5.4
Best value flagship model. Excellent for production workloads with large context needs.
Input: $0.23/M tokens (was $2.50)
Output: $1.14/M tokens (was $15.00)
Context: 1M
Vision: Yes
Budget-Friendly Models
GPT-5.4 Mini
88% Off
OpenAI
Model ID: gpt-5.4-mini
Smaller context, great price. Good for chat and quick completions.
Input: $0.09/M tokens (was $0.75)
Output: $0.52/M tokens (was $4.50)
Context: 128K
Vision: Yes
GPT-5.4 Nano
86% Off
OpenAI
Model ID: gpt-5.4-nano
Cheapest OpenAI option. Perfect for high-volume, simple tasks.
Input: $0.03/M tokens (was $0.20)
Output: $0.17/M tokens (was $1.25)
Context: 64K
Vision: No
Claude Haiku 4.5
89% Off
Anthropic
Model ID: claude-haiku-4-5
Fast and affordable. Perfect for high-volume tasks, chat, and simple completions.
Input: $0.11/M tokens (was $1.00)
Output: $0.57/M tokens (was $5.00)
Context: 1M
Vision: Yes
DeepSeek V3.2
60% Off
DeepSeek
Model ID: deepseek-v3.2
Best value Chinese model. Great performance at low cost.
Input: $0.06/M tokens (was $0.14)
Output: $0.11/M tokens (was $0.28)
Context: 128K
Vision: No
DeepSeek R1
64% Off
DeepSeek
Model ID: deepseek-r1
Reasoning model. Excellent for math, logic, and complex problem-solving.
Input: $0.20/M tokens (was $0.55)
Output: $0.80/M tokens (was $2.19)
Context: 64K
Vision: No
Kimi K2
88% Off
Moonshot
Model ID: kimi-k2
Long-context Chinese model. Great for document analysis and multilingual tasks.
Input: $0.11/M tokens (was $0.93)
Output: $0.57/M tokens (was $3.86)
Context: 256K
Vision: Yes
MiniMax-M2
90% Off
MiniMax
Model ID: MiniMax-M2
Best value 1M context model. Perfect for long documents and large codebases.
Input: $0.06/M tokens (was $0.60)
Output: $0.29/M tokens (was $2.40)
Context: 1M
Vision: Yes
Xiaomi MiMo V2
80% Off
Xiaomi
Model ID: mimo-v2
Cheapest option available. Great for bulk processing and high-volume automation.
Input: $0.03/M tokens (was $0.14)
Output: $0.14/M tokens (was $0.29)
Context: 1M
Vision: Yes
Chinese & Multilingual Models
Qwen 3 Max
63% Off
Alibaba
Model ID: qwen3-max
Alibaba's flagship. Excellent for Chinese language tasks and large contexts.
Input: $0.15/M tokens (was $0.40)
Output: $0.45/M tokens (was $1.20)
Context: 1M
Vision: Yes
Qwen 3 Plus
50% Off
Alibaba
Model ID: qwen3-plus
Balanced Chinese model. Good price-to-performance ratio.
Input: $0.08/M tokens (was $0.16)
Output: $0.40/M tokens (was $0.64)
Context: 1M
Vision: Yes
Zhipu GLM-4 Plus
20% Off
Zhipu
Model ID: glm-4-plus
Chinese AI model. Great for coding and multimodal tasks.
Input: $0.15/M tokens (was $0.50)
Output: $0.40/M tokens (was $0.50)
Context: 128K
Vision: Yes
Doubao Pro
83% Off
ByteDance
Model ID: doubao-pro
ByteDance flagship. Strong in Chinese language and multimodal tasks.
Input: $0.11/M tokens (was $0.69)
Output: $0.57/M tokens (was $3.46)
Context: 128K
Vision: Yes
Stepfun Step-1V
70% Off
Stepfun
Model ID: step-1v
Multimodal Chinese model. Good for vision-language tasks.
Input: $0.06/M tokens (was $0.19)
Output: $0.29/M tokens (was $1.16)
Context: 256K
Vision: Yes
Google & Specialized Models
Gemini 3.1 Pro
73% Off
Google
Model ID: gemini-3.1-pro
Google's flagship model with native multi-modal support and extended context.
Input: $0.57/M tokens (was $3.50)
Output: $2.86/M tokens (was $10.50)
Context: 1M
Vision: Yes
Gemini 3.1 Flash
32% Off
Google
Model ID: gemini-3.1-flash
Fast and efficient. Great for high-volume tasks.
Input: $0.17/M tokens (was $0.30)
Output: $0.86/M tokens (was $1.20)
Context: 1M
Vision: Yes
Gemini 3.5 Flash
43% Off
Google
Model ID: gemini-3.5-flash
Balanced performance. Good for general purpose use.
Input: $0.34/M tokens (was $0.75)
Output: $1.71/M tokens (was $3.00)
Context: 1M
Vision: Yes
Code & Image Generation
GPT-5.3 Codex
93% Off
OpenAI
Model ID: gpt-5.3-codex
Specialized for code generation. Fastest output, strong coding capabilities.
Input: $0.11/M tokens (was $1.75)
Output: $0.69/M tokens (was $14.00)
Context: 128K
Vision: No
Use any model with one API key
Python
# Use selected models from the catalog with your API key
from openai import OpenAI

# Point the OpenAI SDK at this service's endpoint instead of the default one
client = OpenAI(
    base_url="https://api.venice.ai/api/v1",
    api_key="your_apitokendeal_key",
)

# Add any Model ID from the catalog to this list
models = ["claude-opus-4-8"]

# Send the same prompt to each model and print its reply
for model in models:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(model, response.choices[0].message.content)
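
Since every price in the catalog is quoted per million tokens, a rough per-request cost can be worked out from token counts. The sketch below is only an illustration: the `PRICES_PER_MILLION` table and the `estimate_cost` helper are hypothetical names, not part of any SDK, and the rates are the discounted prices copied from three catalog entries above. Actual billing may differ.

```python
# Discounted prices in USD per million tokens, copied from the catalog above.
# The table layout and function name are illustrative assumptions, not an API.
PRICES_PER_MILLION = {
    "claude-sonnet-4-6": {"input": 0.23, "output": 1.14},
    "gpt-5.4-nano": {"input": 0.03, "output": 0.17},
    "deepseek-r1": {"input": 0.20, "output": 0.80},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed rates."""
    price = PRICES_PER_MILLION[model]
    return (
        input_tokens / 1_000_000 * price["input"]
        + output_tokens / 1_000_000 * price["output"]
    )


if __name__ == "__main__":
    # Example request: 12,000 prompt tokens and 1,500 completion tokens.
    for model in PRICES_PER_MILLION:
        cost = estimate_cost(model, 12_000, 1_500)
        print(f"{model}: ${cost:.6f}")
```

If the endpoint returns the usual `usage` field on chat completions, `response.usage.prompt_tokens` and `response.usage.completion_tokens` from the snippet above could be passed in as the two token counts; whether this service populates that field is an assumption.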