LLM Pricing Map for Major Models (as of March 2026)

Harness/research 2026. 3. 18. 16:31

Date: 2026-03-18
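Grounding: Taken directly from official pricing pages where a section is marked official (Zhipu, DeepSeek); the other providers come from third-party aggregators. All figures were cross-checked against secondary sources.
Purpose: Reference material for the REODE Multi LLM Provider
Unit: USD per 1M tokens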
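
Because every table quotes prices per 1M tokens, turning a price row into the cost of a single request is one multiplication and one division. The sketch below is a minimal illustration: the `Price` class and `request_cost` function are hypothetical names, and the example figures are the deepseek-chat row from section 6.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Per-1M-token prices in USD, as listed in the tables."""
    input_per_m: float
    output_per_m: float


def request_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request, ignoring any cache discount."""
    total = input_tokens * price.input_per_m + output_tokens * price.output_per_m
    return total / 1_000_000


# deepseek-chat row from section 6: $0.28 input, $0.42 output
deepseek_chat = Price(input_per_m=0.28, output_per_m=0.42)
cost = request_cost(deepseek_chat, input_tokens=12_000, output_tokens=1_500)
print(f"${cost:.6f}")  # prints $0.003990
```

Cached input is left out here on purpose; it is billed at a separate rate that differs per provider.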

1. OpenAI

Source: https://www.tldl.io/resources/llm-api-pricing-2026 (updated in March)

Model	Input	Output	Cache Input	Context	Status
GPT-5.2 Pro	$21.00	$168.00	$2.10	200K	Current
GPT-5.2	$1.75	$14.00	$0.175	200K	Current
GPT-5	$1.25	$10.00	$0.125	128K	Current
GPT-5 Mini	$0.25	$2.00	$0.025	200K	Current
GPT-5 Nano	$0.05	$0.40	$0.005	128K	Current
o4-mini	$1.10	$4.40	$0.275	200K	Current
o3	$2.00	$8.00	$1.00	200K	Current
o3-pro	$20.00	$80.00	—	200K	Current
GPT-4.1	$2.00	$8.00	$0.20	1M	Current
GPT-4.1 Mini	$0.40	$1.60	$0.04	1M	Current
GPT-4.1 Nano	$0.10	$0.40	$0.01	1M	Current
o1	$15.00	$60.00	$7.50	200K	Maintained
~~GPT-4o~~	—	—	—	—	Deprecated → GPT-4.1
~~GPT-4o mini~~	—	—	—	—	Deprecated → GPT-4.1 Nano

2. Anthropic

Source: https://www.tldl.io/resources/llm-api-pricing-2026

Model	Input	Output	Cache Input	Context	Status
Claude Opus 4.6	$5.00	$25.00	$0.50	200K	Current
Claude Sonnet 4.6	$3.00	$15.00	$0.30	200K	Current
Claude Haiku 4.5	$1.00	$5.00	$0.10	200K	Current
~~Claude 3.5 Sonnet~~	—	—	—	—	Deprecated
~~Claude 3 Haiku~~	—	—	—	—	Deprecated

3. Google Gemini

Source: https://www.tldl.io/resources/llm-api-pricing-2026

Model	Input	Output	Cache Input	Context	Status
Gemini 3.1 Pro	$2-4	$12-18	—	200K+	Current
Gemini 3 Flash	$0.50	$3.00	—	—	Current
Gemini 2.5 Pro (≤200K)	$1.25	$10.00	$0.125	2M	Current
Gemini 2.5 Pro (>200K)	$2.50	$15.00	$0.25	2M	Current
Gemini 2.5 Flash	$0.30	$2.50	$0.03	1M	Current
Gemini 2.5 Flash-Lite	$0.10	$0.40	—	1M	Current
Gemini 2.0 Flash	$0.10	$0.40	$0.025	1M	Maintained
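
Gemini 2.5 Pro is the only entry here with two price tiers, so a flat per-token formula does not cover it. The sketch below hard-codes the two rows from the table. It assumes the tier is chosen by the prompt (input) size and then applies to both input and output of that request; the table itself does not state this, so check the billing rule before relying on it. The function name is hypothetical.

```python
def gemini_25_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one Gemini 2.5 Pro request under tiered pricing.

    Assumption: a prompt of up to 200K tokens uses the lower tier, a longer
    prompt uses the higher tier, and the tier covers input and output alike.
    """
    if input_tokens <= 200_000:
        input_per_m, output_per_m = 1.25, 10.00   # "≤200K" row
    else:
        input_per_m, output_per_m = 2.50, 15.00   # ">200K" row
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


at_boundary = gemini_25_pro_cost(200_000, 1_000)
just_over = gemini_25_pro_cost(200_001, 1_000)
print(f"{at_boundary:.4f} vs {just_over:.4f}")
```

Under that assumption, a single extra prompt token at the boundary roughly doubles the bill, which matters when sizing context for a long-document agent.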

4. Zhipu (GLM)

Source: https://docs.z.ai/guides/overview/pricing (official)

Model	Input	Output	Context	Status
GLM-5	$1.00	$3.20	200K	Current (2026-02)
GLM-5-Turbo	$1.20	$4.00	200K	Current (2026-03-16)
GLM-5-Code	$1.20	$5.00	200K	Current
GLM-4.7	$0.60	$2.20	128K	Maintained
GLM-4.7-FlashX	$0.07	$0.40	128K	Current
GLM-4.7-Flash	Free	Free	128K	Current
GLM-4.6	$0.60	$2.20	128K	Maintained
GLM-4.5	$0.60	$2.20	128K	Maintained
GLM-4.5-X	$2.20	$8.90	128K	Maintained
GLM-4.5-Air	$0.20	$1.10	128K	Maintained
GLM-4.5-AirX	$1.10	$4.50	128K	Maintained
GLM-4.5-Flash	Free	Free	128K	Current
GLM-4-32B-0414-128K	$0.10	$0.10	128K	Maintained
~~GLM-4~~	—	—	—	Legacy
~~GLM-3~~	—	—	—	Deprecated

Cache: Cached input is discounted 80% on all models. Cache storage is free for a limited time.
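
The 80% discount means cached input is billed at 20% of the listed input price. A minimal sketch of the input side of a bill, using the GLM-5 row ($1.00 input) as the example; the function names are hypothetical, and cache storage is treated as free, as the note above says it currently is.

```python
def cached_input_price(input_per_m: float, discount: float = 0.80) -> float:
    """Per-1M price of cached input after the stated discount."""
    return input_per_m * (1.0 - discount)


def input_cost(input_per_m: float, total_input_tokens: int,
               cached_tokens: int, discount: float = 0.80) -> float:
    """USD cost of the input side of a request with a partial cache hit."""
    if not 0 <= cached_tokens <= total_input_tokens:
        raise ValueError("cached_tokens must be between 0 and total_input_tokens")
    uncached_tokens = total_input_tokens - cached_tokens
    cached_rate = cached_input_price(input_per_m, discount)
    return (uncached_tokens * input_per_m + cached_tokens * cached_rate) / 1_000_000


# GLM-5: 100K-token prompt, 80K of which hit the cache
glm5_input = input_cost(1.00, total_input_tokens=100_000, cached_tokens=80_000)
print(f"${glm5_input:.3f}")  # prints $0.036
```

Without the cache the same prompt would cost $0.10 on the input side, so a long, stable system prompt pays for itself quickly.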

5. MiniMax

Source: https://pricepertoken.com/pricing-page/provider/minimax (cross-checked)

Model	Input	Output	Context	Status	Released
MiniMax-M2.5	$0.25	$0.95	197K	Current	2026-02
MiniMax-M2 Her	$0.30	$1.20	66K	Current	2026-01
MiniMax-M2.1	$0.27	$0.95	197K	Maintained	2025-12
MiniMax-M2	$0.255	$1.00	197K	Maintained	2025-10
MiniMax-M1	$0.40	$1.76	1M	Legacy	2025-06
MiniMax-01	$0.20	$1.10	1M	Maintained	—
~~abab6.5~~	—	—	—	Deprecated

6. DeepSeek

Source: https://api-docs.deepseek.com/quick_start/pricing (official)

Model	Input	Output	Cache Input	Context	Status
deepseek-chat (V3.2 non-thinking)	$0.28	$0.42	$0.028	128K	Current
deepseek-reasoner (V3.2 thinking)	$0.28	$0.42	$0.028	128K	Current
DeepSeek V3.2 Speciale	$0.40	$1.20	—	128K	Current
~~DeepSeek V3.1~~	—	—	—	—	→ merged into V3.2
~~DeepSeek V3~~	—	—	—	—	→ merged into V3.2
~~DeepSeek R1~~	—	—	—	—	→ merged into V3.2 reasoning

7. Qwen (Alibaba)

Source: https://pricepertoken.com/pricing-page/provider/qwen (cross-checked)

Model	Input	Output	Context	Status
Qwen3.5-397B-A17B	$0.39	$0.90	262K	Current (2026-02)
Qwen3.5-122B-A10B	$0.26	$2.08	262K	Current
Qwen3.5 Plus	$0.26	$1.56	1M	Current
Qwen3.5-Flash	$0.07	$0.26	1M	Current
Qwen3.5-35B-A3B	$0.163	$1.00	262K	Current
Qwen3.5-27B	$0.195	$1.56	262K	Current
Qwen3.5-9B	$0.05	$0.15	262K	Current
Qwen3 Max	$0.78	$3.90	262K	Current
Qwen3-235B-A22B	$0.455	$1.82	—	Current
Qwen3-30B-A3B	$0.08	$0.28	41K	Current
Qwen3 32B	$0.08	$0.24	41K	Current
Qwen3 14B	$0.06	$0.20	41K	Current
Qwen3 8B	$0.05	$0.20	41K	Current
Qwen-Plus	$0.26	$0.78	1M	Maintained
~~Qwen-Max~~ (legacy)	$1.04	$4.16	33K	Legacy

Qwen3 Coder family (per pricepertoken):

Model	Input	Output	Context	Status
Qwen3 Coder (480B-A35B)	$0.22	$0.90	262K	Current
Qwen3 Coder-Next	$0.50	$1.20	262K	Current

8. Kimi (Moonshot)

Source: https://costgoat.com/pricing/kimi-api (cross-checked)

Model	Input	Output	Context	Status
Kimi K2.5	$0.60	$3.00	262K	Current (2026-01)
Kimi K2 Thinking	$0.60	$2.50	262K	Current
Kimi K2 0905	$0.60	$2.50	262K	Maintained
Kimi K2 0711	$0.60	$2.50	131K	Maintained
Kimi K2 Turbo	$1.15	$8.00	262K	Current
Kimi K2 Thinking Turbo	$1.15	$8.00	262K	Current
Moonshot V1 8K	$0.20	$2.00	8K	Legacy
Moonshot V1 32K	$1.00	$3.00	32K	Legacy
Moonshot V1 128K	$2.00	$5.00	131K	Legacy

Cache: Cached input costs $0.15/M (a 75% saving against the $0.60 input price). Web search: $0.005 per call.
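
Kimi adds a per-call charge for web search on top of token pricing, so an agent session has three cost components. The sketch below uses the Kimi K2.5 row ($0.60 input, $3.00 output) together with the cache and search figures above. It assumes the $0.15/M cache rate applies to K2.5, which is consistent with the stated 75% saving against $0.60 but is not spelled out per model. All function and parameter names are hypothetical.

```python
def savings_ratio(list_price: float, cached_price: float) -> float:
    """Fraction saved when a token is served from cache instead of billed in full."""
    return 1.0 - cached_price / list_price


def kimi_session_cost(uncached_input: int, cached_input: int, output: int,
                      web_searches: int,
                      input_per_m: float = 0.60, cached_per_m: float = 0.15,
                      output_per_m: float = 3.00,
                      search_per_call: float = 0.005) -> float:
    """USD cost of one session: tokens at per-1M rates plus flat web search fees."""
    token_cost = (uncached_input * input_per_m
                  + cached_input * cached_per_m
                  + output * output_per_m) / 1_000_000
    return token_cost + web_searches * search_per_call


ratio = savings_ratio(0.60, 0.15)
session = kimi_session_cost(uncached_input=50_000, cached_input=150_000,
                            output=8_000, web_searches=4)
print(f"saving {ratio:.0%}, session ${session:.4f}")
```

In this example the four searches cost $0.02 of the $0.0965 total, so search fees are not negligible for search-heavy agents.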

9. Mistral (including coding models)

Source: https://pricepertoken.com/pricing-page/provider/mistral-ai (updated 2026-03-17)

Model	Input	Output	Context	Status
Devstral Small 2505	Free	Free	128K	Current
Mistral Nemo	$0.02	$0.04	131K	Current
Ministral 3B	$0.04	$0.04	128K	Current
Mistral Small 3.2 24B	$0.06	$0.18	131K	Current
Devstral Small 1.1	$0.07	$0.28	131K	Current
Mistral Small 3.1 24B	$0.10	$0.30	128K	Current
Ministral 8B	$0.10	$0.10	128K	Current
Codestral 2508	$0.30	$0.90	256K	Current
Devstral 2 2512	$0.40	$0.90	262K	Current
Devstral Medium	$0.40	$2.00	131K	Current
Mistral Medium 3.1	$0.40	$2.00	131K	Current
Mistral Large 3 2512	$0.50	$1.50	262K	Current
Magistral Medium	$2.00	$5.00	40K	Current

10. xAI (Grok)

Source: https://www.tldl.io/resources/llm-api-pricing-2026

Model	Input	Output	Context	Status
Grok 4	$3.00	$15.00	2M	Current
Grok 4.1 Fast	$0.20	$0.50	2M	Current

11. Cost-Efficiency Ranking (Output ≤$2/M, Tool Calling Supported)

Sorted by output price, ascending:

#	Model	Provider	Output $/M	Input $/M	Context	Notes
1	GLM-4-32B-0414	Zhipu	$0.10	$0.10	128K	Cheapest practical-grade model
2	Qwen3.5-9B	Alibaba	$0.15	$0.05	262K	Ultra-lightweight, 262K context
3	Mistral Small 3.2	Mistral	$0.18	$0.06	131K	European servers
4	Qwen3 8B	Alibaba	$0.20	$0.05	41K	Ultra-lightweight
5	Qwen3.5-Flash	Alibaba	$0.26	$0.07	1M	1M context
6	Qwen3-30B-A3B	Alibaba	$0.28	$0.08	41K	Lightweight MoE
7	Devstral Small 1.1	Mistral	$0.28	$0.07	131K	Coding-focused
8	Mistral Small 3.1	Mistral	$0.30	$0.10	128K	General-purpose
9	GPT-5 Nano	OpenAI	$0.40	$0.05	128K	Cheapest OpenAI model
10	GPT-4.1 Nano	OpenAI	$0.40	$0.10	1M	1M context
11	Gemini 2.5 Flash-Lite	Google	$0.40	$0.10	1M	1M context
12	GLM-4.7-FlashX	Zhipu	$0.40	$0.07	128K	High-speed GLM
13	DeepSeek V3.2	DeepSeek	$0.42	$0.28	128K	Best value, chat + reasoning
14	Grok 4.1 Fast	xAI	$0.50	$0.20	2M	Largest context (2M)
15	Qwen-Plus	Alibaba	$0.78	$0.26	1M	General-purpose
16	Qwen3 Coder	Alibaba	$0.90	$0.22	262K	Coding-only MoE
17	Devstral 2	Mistral	$0.90	$0.40	262K	SWE-bench 72.2%
18	MiniMax M2.5	MiniMax	$0.95	$0.25	197K	Agent-compatible
19	MiniMax-01	MiniMax	$1.10	$0.20	1M	1M context
20	Mistral Large 3	Mistral	$1.50	$0.50	262K	Large general-purpose
21	Qwen3.5 Plus	Alibaba	$1.56	$0.26	1M	1M context, latest
22	MiniMax M1	MiniMax	$1.76	$0.40	1M	Legacy
23	Kimi K2 0905	Moonshot	$1.90	$0.39	262K	Stable across 200+ tool calls (section 8 lists $0.60 / $2.50)
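
The ranking is a filter on output price followed by an ascending sort, which is easy to reproduce when prices change. The sketch below runs that procedure over a handful of rows copied from the provider tables. The `Model` tuple and `rank_by_output` function are hypothetical names, and tool calling support is not encoded because the tables do not record it per model.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    provider: str
    input_per_m: float
    output_per_m: float


# A small sample of rows from the provider tables (USD per 1M tokens)
MODELS = [
    Model("Claude Haiku 4.5", "Anthropic", 1.00, 5.00),
    Model("DeepSeek V3.2", "DeepSeek", 0.28, 0.42),
    Model("GLM-4-32B-0414", "Zhipu", 0.10, 0.10),
    Model("GPT-5 Nano", "OpenAI", 0.05, 0.40),
    Model("Grok 4.1 Fast", "xAI", 0.20, 0.50),
    Model("Qwen3.5-9B", "Alibaba", 0.05, 0.15),
]


def rank_by_output(models: list[Model], max_output_per_m: float = 2.00) -> list[Model]:
    """Keep models at or under the output price cap, cheapest output first."""
    eligible = [m for m in models if m.output_per_m <= max_output_per_m]
    # Ties on output price are broken by input price
    return sorted(eligible, key=lambda m: (m.output_per_m, m.input_per_m))


ranked = rank_by_output(MODELS)
for position, model in enumerate(ranked, start=1):
    print(f"{position}\t{model.name}\t${model.output_per_m:.2f}\t${model.input_per_m:.2f}")
```

Claude Haiku 4.5 drops out at the $2.00 cap, and the remaining five come back in the same relative order as in the table above.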

Free Models

Model	Provider	Context	Status
GLM-4.7-Flash	Zhipu	128K	Current
GLM-4.5-Flash	Zhipu	128K	Current
Devstral Small 2505	Mistral	128K	Current
Llama 4	Meta (via providers)	200K	Current

12. Recommendations by REODE Use Case

Use case	Recommended model	Output $/M	Rationale
Main agent	Claude Opus 4.6	$25.00	Top performance, already integrated
Coding subagent	Qwen3 Coder ($0.90) or Devstral 2 ($0.90)	$0.90	Coding-focused, tool calling
Lightweight subagent	DeepSeek V3.2 ($0.42)	$0.42	Low-cost, practical-grade
Bulk parallel calls	Qwen3.5-9B ($0.15) or GLM-4-32B ($0.10)	$0.10-0.15	Ultra-low cost
Long context	Grok 4.1 Fast ($0.50, 2M) or GPT-4.1 Nano ($0.40, 1M)	$0.40-0.50	1M+ context
Free testing	GLM-4.7-Flash	Free	For development and testing
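
One way to put this table to work is a routing map from use case to an ordered list of candidate models, falling back to the second candidate when the first is unavailable. This is a sketch only: the keys, the `pick_model` function, and the fallback behavior are hypothetical and do not describe how REODE is actually configured.

```python
# Use case -> candidate models in order of preference (from the table above)
ROUTES: dict[str, list[str]] = {
    "main_agent": ["Claude Opus 4.6"],
    "coding_subagent": ["Qwen3 Coder", "Devstral 2"],
    "lightweight_subagent": ["DeepSeek V3.2"],
    "bulk_parallel": ["Qwen3.5-9B", "GLM-4-32B"],
    "long_context": ["Grok 4.1 Fast", "GPT-4.1 Nano"],
    "free_testing": ["GLM-4.7-Flash"],
}


def pick_model(use_case: str, unavailable: frozenset[str] = frozenset()) -> str:
    """Return the first candidate for a use case that is not marked unavailable."""
    try:
        candidates = ROUTES[use_case]
    except KeyError:
        raise ValueError(f"unknown use case: {use_case!r}") from None
    for model in candidates:
        if model not in unavailable:
            return model
    raise LookupError(f"no available model for {use_case!r}")


print(pick_model("coding_subagent"))
print(pick_model("coding_subagent", unavailable=frozenset({"Qwen3 Coder"})))
```

Keeping the candidates as an ordered list keeps the fallback policy in data rather than code, so swapping a model after a price change is a one-line edit.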

Sources (official and secondary)

Z.AI Pricing (GLM): extracted directly from the official pricing page
DeepSeek Pricing: extracted directly from the official API docs
MiniMax Pricing: cross-checked against pricepertoken.com
Qwen Pricing: cross-checked against pricepertoken.com and alibabacloud.com
Kimi Pricing: cross-checked against platform.moonshot.ai
Mistral Pricing: update of 2026-03-17 confirmed
TLDL LLM Pricing March 2026: OpenAI/Anthropic/Google/xAI
pricepertoken.com Cheapest: overall comparison
Awesome Agents Pricing: cross-check
