LLM Pricing Calculator | Compare GPT-4, Claude, and Gemini Model Costs

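Calculator inputs: input tokens, output tokens, and number of API calls. The totals in the table below are computed for 84,000 input tokens and 420,000 output tokens.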

Provider	Model	Type	Description	Context (tokens)	Input price (per 1M tokens)	Output price (per 1M tokens)	Input cost (84,000 tokens)	Output cost (420,000 tokens)	Total cost
OpenAI	gpt-4.1-nano	General	GPT-4.1 nano is the fastest, most cost-effective GPT-4.1 model.	1,047,576	$0.05	$0.20	$0.00	$0.08	$0.09
Google	gemini-2.0-flash-lite	General	The smallest and most cost-effective model, built for at-scale usage.	1,048,576	$0.08	$0.30	$0.01	$0.13	$0.13
OpenAI	gpt-5-nano	General	GPT-5 nano is the fastest and most cost-effective GPT-5 model, designed for high-volume applications where speed and efficiency are paramount.	128,000	$0.05	$0.40	$0.00	$0.17	$0.17
Google	gemini-2.0-flash	General	The most balanced multimodal model with great performance across all tasks, with a 1 million token context window, and built for the era of Agents.	1,000,000	$0.10	$0.40	$0.01	$0.17	$0.18
Google	gemini-2.5-flash-preview	General	The first hybrid reasoning model, which supports a 1M token context window and has thinking budgets. Preview model.	1,000,000	$0.15	$0.60	$0.01	$0.25	$0.26
Cloudflare	mistral-small-3.1-24b-instruct	General	Building upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance. With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks.	128,000	$0.35	$0.56	$0.03	$0.24	$0.26
Cloudflare	gemma-3-12b-it	General	Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Gemma 3 models are multimodal, handling text and image input and generating text output, with a large 128K context window and multilingual support in over 140 languages, and are available in more sizes than previous versions.	80,000	$0.35	$0.56	$0.03	$0.24	$0.26
Llama (AWS)	Llama 3.1 8B	General	Meta's compact and efficient Llama 3.1 model with 8B parameters. Offers great performance for lightweight applications. Available through AWS Bedrock.	128,000	$0.45	$0.70	$0.04	$0.29	$0.33
Cloudflare	llama-4-scout-17b-16e-instruct	General	Meta's Llama 4 Scout is a 17 billion parameter model with 16 experts that is natively multimodal. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding.	131,000	$0.27	$0.85	$0.02	$0.36	$0.38
Cloudflare	qwq-32b	Reasoning	QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, can achieve significantly enhanced performance in downstream tasks, especially hard problems. QwQ-32B is the medium-sized reasoning model, which is capable of achieving competitive performance against state-of-the-art reasoning models, e.g., DeepSeek-R1, o1-mini.	24,000	$0.66	$1.00	$0.06	$0.42	$0.48
OpenAI	gpt-4.1-mini	General	GPT-4.1 mini provides a balance between intelligence, speed, and cost that makes it an attractive model for many use cases.	1,047,576	$0.40	$1.60	$0.03	$0.67	$0.71
OpenAI	gpt-5-mini	General	GPT-5 mini is a faster, more cost-effective version of GPT-5, optimized for speed while maintaining high quality output for most use cases.	128,000	$0.25	$2.00	$0.02	$0.84	$0.86
Anthropic	claude-3.5-haiku	General	Anthropic's fastest model with great performance for diverse tasks.	200,000	$0.80	$4.00	$0.07	$1.68	$1.75
OpenAI	o4-mini	Reasoning	o4-mini is the latest small o-series model. It's optimized for fast, effective reasoning with exceptionally efficient performance in coding and visual tasks.	200,000	$1.10	$4.40	$0.09	$1.85	$1.94
OpenAI	o3-mini	Reasoning	Small reasoning model with high intelligence. Supports Structured Outputs, function calling, and Batch API.	200,000	$1.10	$4.40	$0.09	$1.85	$1.94
DeepSeek (AWS)	DeepSeek-R1	Reasoning	DeepSeek's fully managed reasoning-focused LLM with strong capabilities in problem solving, coding, and natural language understanding. Available through AWS Bedrock.	32,768	$1.35	$5.40	$0.11	$2.27	$2.38
OpenAI	gpt-4o-audio-preview	General	OpenAI's multimodal model with advanced audio processing capabilities for both input and output.	128,000	$3.00	$6.00	$0.25	$2.52	$2.77
OpenAI	gpt-4.1	General	GPT-4.1 is the flagship model for complex tasks. It is well suited for problem solving across domains.	1,047,576	$2.00	$8.00	$0.17	$3.36	$3.53
OpenAI	gpt-5	General	GPT-5 is OpenAI's most advanced model, designed for complex reasoning and creative tasks. It represents a significant leap forward in AI capabilities.	128,000	$1.25	$10.00	$0.11	$4.20	$4.31
Google	gemini-2.5-pro-preview	Reasoning	The state-of-the-art multipurpose model, which excels at coding and complex reasoning tasks. Preview model.	1,048,576	$1.25	$10.00	$0.11	$4.20	$4.31
Llama (AWS)	Llama 3.1 70B	General	Meta's Llama 3.1 70B parameter model offering strong reasoning capabilities and efficiency. Available through AWS Bedrock with latency-optimized inference.	128,000	$3.50	$10.50	$0.29	$4.41	$4.70
OpenAI	gpt-4o-mini-tts	Audio	Cost-effective model for text-to-speech. Processes text and generates audio responses. Best for shorter audio outputs (under 2 minutes).	128,000	$0.60	$12.00	$0.05	$5.04	$5.09
Anthropic	claude-3.5-sonnet	General	Anthropic's powerful, cost-effective model for complex tasks.	200,000	$3.00	$15.00	$0.25	$6.30	$6.55
Anthropic	claude-3.7-sonnet	Reasoning	Anthropic's most intelligent model and the first hybrid reasoning model on the market.	200,000	$3.00	$15.00	$0.25	$6.30	$6.55
Anthropic	claude-sonnet-4-20250514	General	A high-performance model with exceptional reasoning and efficiency.	200,000	$3.00	$15.00	$0.25	$6.30	$6.55
Llama (AWS)	Llama 3.1 405B	Reasoning	Meta's largest Llama 3.1 model with 405B parameters. High-performance for complex reasoning and generation tasks. Available through AWS Bedrock with latency-optimized inference.	128,000	$8.00	$24.00	$0.67	$10.08	$10.75
OpenAI	o3	Reasoning	Powerful model for math, science, coding and visual tasks. Excels at technical writing and multi-step problems involving text, code, and images.	200,000	$10.00	$40.00	$0.84	$16.80	$17.64
Anthropic	claude-3-opus	Reasoning	Anthropic's most powerful model, designed for the most complex tasks.	200,000	$15.00	$75.00	$1.26	$31.50	$32.76
Anthropic	claude-opus-4-20250514	Reasoning	The most capable and intelligent model yet. Claude Opus 4 sets new standards in complex reasoning and advanced coding.	200,000	$15.00	$75.00	$1.26	$31.50	$32.76
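
The per-row figures in the table above are consistent with a simple rule: cost = tokens / 1,000,000 × price per 1M tokens, summed over input and output and rounded to cents for display. The following Python sketch reproduces that arithmetic. The names `estimate_cost` and `round_cents` are hypothetical, and both the `calls` multiplier (token counts treated as per-call) and the half-up rounding are assumptions, not details confirmed by the page.

```python
from decimal import Decimal, ROUND_HALF_UP

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def estimate_cost(input_tokens, output_tokens, input_price, output_price, calls=1):
    """Return (input_cost, output_cost, total_cost) in dollars, unrounded.

    Prices are passed as strings (e.g. "0.80") so Decimal keeps them exact.
    `calls` is an assumed multiplier: token counts are treated as per-call.
    """
    input_cost = Decimal(input_tokens) * calls / TOKENS_PER_PRICE_UNIT * Decimal(input_price)
    output_cost = Decimal(output_tokens) * calls / TOKENS_PER_PRICE_UNIT * Decimal(output_price)
    return input_cost, output_cost, input_cost + output_cost


def round_cents(amount):
    """Round a dollar amount to two decimal places (assumed half-up)."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# claude-3.5-haiku row: $0.80 input, $4.00 output per 1M tokens
inp, out, total = estimate_cost(84_000, 420_000, "0.80", "4.00")
print(round_cents(inp), round_cents(out), round_cents(total))  # 0.07 1.68 1.75
```

Because the total is rounded from the unrounded sum, it can differ by a cent from the sum of the two rounded parts, as in the gpt-4.1-nano row ($0.00 + $0.08 shown, $0.09 total).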
