LLM Major Model Price Map (as of March 2026)
Harness/research, 2026-03-18 16:31
Date: 2026-03-18
Grounding: Extracted directly from official pricing pages and cross-checked against secondary sources.
Purpose: Reference material for the REODE Multi LLM Provider
Unit: USD per 1M tokens
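Because every price in this document is quoted per 1M tokens, the cost of one request is each token count divided by one million, multiplied by the listed price. The following is a minimal sketch of that arithmetic. The function name `request_cost` and the token counts are illustrative assumptions, not part of any provider SDK; the prices are the Claude Haiku 4.5 figures from section 2.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Return the USD cost of one request, given prices in USD per 1M tokens."""
    per_million = 1_000_000
    input_cost = (input_tokens / per_million) * input_price
    output_cost = (output_tokens / per_million) * output_price
    return input_cost + output_cost


# Hypothetical workload: 12,000 input tokens and 1,500 output tokens
# at the Claude Haiku 4.5 prices from section 2 ($1.00 in, $5.00 out).
cost = request_cost(12_000, 1_500, input_price=1.00, output_price=5.00)
print(f"${cost:.4f}")  # prints $0.0195
```

Output tokens dominate cost for most models in this document, which is why the ranking in section 11 sorts by output price.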
1. OpenAI
Source: https://www.tldl.io/resources/llm-api-pricing-2026 (updated in March)
| Model | Input | Output | Cache Input | Context | Status |
|---|---|---|---|---|---|
| GPT-5.2 Pro | $21.00 | $168.00 | $2.10 | 200K | Current |
| GPT-5.2 | $1.75 | $14.00 | $0.175 | 200K | Current |
| GPT-5 | $1.25 | $10.00 | $0.125 | 128K | Current |
| GPT-5 Mini | $0.25 | $2.00 | $0.025 | 200K | Current |
| GPT-5 Nano | $0.05 | $0.40 | $0.005 | 128K | Current |
| o4-mini | $1.10 | $4.40 | $0.275 | 200K | Current |
| o3 | $2.00 | $8.00 | $1.00 | 200K | Current |
| o3-pro | $20.00 | $80.00 | - | 200K | Current |
| GPT-4.1 | $2.00 | $8.00 | $0.20 | 1M | Current |
| GPT-4.1 Mini | $0.40 | $1.60 | $0.04 | 1M | Current |
| GPT-4.1 Nano | $0.10 | $0.40 | $0.01 | 1M | Current |
| o1 | $15.00 | $60.00 | $7.50 | 200K | Maintained |
| GPT-4o | - | - | - | - | Deprecated → GPT-4.1 |
| GPT-4o mini | - | - | - | - | Deprecated → GPT-4.1 Nano |

2. Anthropic
| Model | Input | Output | Cache Input | Context | Status |
|---|---|---|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 | $0.50 | 200K | Current |
| Claude Sonnet 4.6 | $3.00 | $15.00 | $0.30 | 200K | Current |
| Claude Haiku 4.5 | $1.00 | $5.00 | $0.10 | 200K | Current |
| Claude 3.5 Sonnet | - | - | - | - | Deprecated |
| Claude 3 Haiku | - | - | - | - | Deprecated |

3. Google Gemini
| Model | Input | Output | Cache Input | Context | Status |
|---|---|---|---|---|---|
| Gemini 3.1 Pro | $2-$4 | $12-$18 | - | 200K+ | Current |
| Gemini 3 Flash | $0.50 | $3.00 | - | - | Current |
| Gemini 2.5 Pro (≤200K) | $1.25 | $10.00 | $0.125 | 2M | Current |
| Gemini 2.5 Pro (>200K) | $2.50 | $15.00 | $0.25 | 2M | Current |
| Gemini 2.5 Flash | $0.30 | $2.50 | $0.03 | 1M | Current |
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | - | 1M | Current |
| Gemini 2.0 Flash | $0.10 | $0.40 | $0.025 | 1M | Maintained |

4. Zhipu (GLM)
| Model | Input | Output | Context | Status |
|---|---|---|---|---|
| GLM-5 | $1.00 | $3.20 | 200K | Current (2026-02) |
| GLM-5-Turbo | $1.20 | $4.00 | 200K | Current (2026-03-16) |
| GLM-5-Code | $1.20 | $5.00 | 200K | Current |
| GLM-4.7 | $0.60 | $2.20 | 128K | Maintained |
| GLM-4.7-FlashX | $0.07 | $0.40 | 128K | Current |
| GLM-4.7-Flash | Free | Free | 128K | Current |
| GLM-4.6 | $0.60 | $2.20 | 128K | Maintained |
| GLM-4.5 | $0.60 | $2.20 | 128K | Maintained |
| GLM-4.5-X | $2.20 | $8.90 | 128K | Maintained |
| GLM-4.5-Air | $0.20 | $1.10 | 128K | Maintained |
| GLM-4.5-AirX | $1.10 | $4.50 | 128K | Maintained |
| GLM-4.5-Flash | Free | Free | 128K | Current |
| GLM-4-32B-0414-128K | $0.10 | $0.10 | 128K | Maintained |
| GLM-4 | - | - | - | Legacy |
| GLM-3 | - | - | - | Deprecated |

Cache: cached input is discounted 80% on all models. Storage is free for a limited time.
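The 80% cached-input discount only helps in proportion to how much of the prompt is actually served from cache. As a rough sketch, the effective input price can be modeled as a weighted average of the cached and list prices. The function name `blended_input_price` and the 70% hit ratio are assumptions chosen for illustration; only the $1.00 GLM-5 input price and the 80% discount come from the table above.

```python
def blended_input_price(list_price: float, cache_discount: float,
                        cache_hit_ratio: float) -> float:
    """Effective input price (USD per 1M tokens) for a given cache hit ratio."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    cached_price = list_price * (1.0 - cache_discount)
    # Weighted average: cached tokens pay the discounted price,
    # the remaining tokens pay the list price.
    return cache_hit_ratio * cached_price + (1.0 - cache_hit_ratio) * list_price


# GLM-5 input is listed at $1.00 and cached input is discounted 80%.
# The 70% hit ratio is a hypothetical workload, not a measured figure.
effective = blended_input_price(1.00, cache_discount=0.80, cache_hit_ratio=0.70)
print(f"${effective:.2f} per 1M input tokens")  # prints $0.44 per 1M input tokens
```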
5. MiniMax
Source: https://pricepertoken.com/pricing-page/provider/minimax (cross-checked)
| Model | Input | Output | Context | Status | Released |
|---|---|---|---|---|---|
| MiniMax-M2.5 | $0.25 | $0.95 | 197K | Current | 2026-02 |
| MiniMax-M2 Her | $0.30 | $1.20 | 66K | Current | 2026-01 |
| MiniMax-M2.1 | $0.27 | $0.95 | 197K | Maintained | 2025-12 |
| MiniMax-M2 | $0.255 | $1.00 | 197K | Maintained | 2025-10 |
| MiniMax-M1 | $0.40 | $1.76 | 1M | Legacy | 2025-06 |
| MiniMax-01 | $0.20 | $1.10 | 1M | Maintained | - |
| abab6.5 | - | - | - | Deprecated | |

6. DeepSeek
| Model | Input | Output | Cache Input | Context | Status |
|---|---|---|---|---|---|
| deepseek-chat (V3.2 non-thinking) | $0.28 | $0.42 | $0.028 | 128K | Current |
| deepseek-reasoner (V3.2 thinking) | $0.28 | $0.42 | $0.028 | 128K | Current |
| DeepSeek V3.2 Speciale | $0.40 | $1.20 | - | 128K | Current |
| DeepSeek V3.1 | - | - | - | - | → Merged into V3.2 |
| DeepSeek V3 | - | - | - | - | → Merged into V3.2 |
| DeepSeek R1 | - | - | - | - | → Merged into V3.2 reasoning |

7. Qwen (Alibaba)
Source: https://pricepertoken.com/pricing-page/provider/qwen (cross-checked)
| Model | Input | Output | Context | Status |
|---|---|---|---|---|
| Qwen3.5-397B-A17B | $0.39 | $0.90 | 262K | Current (2026-02) |
| Qwen3.5-122B-A10B | $0.26 | $2.08 | 262K | Current |
| Qwen3.5 Plus | $0.26 | $1.56 | 1M | Current |
| Qwen3.5-Flash | $0.07 | $0.26 | 1M | Current |
| Qwen3.5-35B-A3B | $0.163 | $1.00 | 262K | Current |
| Qwen3.5-27B | $0.195 | $1.56 | 262K | Current |
| Qwen3.5-9B | $0.05 | $0.15 | 262K | Current |
| Qwen3 Max | $0.78 | $3.90 | 262K | Current |
| Qwen3-235B-A22B | $0.455 | $1.82 | - | Current |
| Qwen3-30B-A3B | $0.08 | $0.28 | 41K | Current |
| Qwen3 32B | $0.08 | $0.24 | 41K | Current |
| Qwen3 14B | $0.06 | $0.20 | 41K | Current |
| Qwen3 8B | $0.05 | $0.20 | 41K | Current |
| Qwen-Plus | $0.26 | $0.78 | 1M | Maintained |
| Qwen-Max (legacy) | $1.04 | $4.16 | 33K | Legacy |

Qwen3 Coder family (per pricepertoken):
| Model | Input | Output | Context | Status |
|---|---|---|---|---|
| Qwen3 Coder (480B-A35B) | $0.22 | $0.90 | 262K | Current |
| Qwen3 Coder-Next | $0.50 | $1.20 | 262K | Current |

8. Kimi (Moonshot)
Source: https://costgoat.com/pricing/kimi-api (cross-checked)
| Model | Input | Output | Context | Status |
|---|---|---|---|---|
| Kimi K2.5 | $0.60 | $3.00 | 262K | Current (2026-01) |
| Kimi K2 Thinking | $0.60 | $2.50 | 262K | Current |
| Kimi K2 0905 | $0.60 | $2.50 | 262K | Maintained |
| Kimi K2 0711 | $0.60 | $2.50 | 131K | Maintained |
| Kimi K2 Turbo | $1.15 | $8.00 | 262K | Current |
| Kimi K2 Thinking Turbo | $1.15 | $8.00 | 262K | Current |
| Moonshot V1 8K | $0.20 | $2.00 | 8K | Legacy |
| Moonshot V1 32K | $1.00 | $3.00 | 32K | Legacy |
| Moonshot V1 128K | $2.00 | $5.00 | 131K | Legacy |

Cache: $0.15/M (75% savings). Web search: $0.005/call.
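Kimi pricing mixes per-token rates with a flat per-call fee for web search, so a request total has four terms. The sketch below assumes the $0.15/M cached rate applies to Kimi K2.5 (it matches a 75% saving against the $0.60 input price); the `KimiRates` class, the `kimi_request_cost` function, and the workload figures are hypothetical names and numbers introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KimiRates:
    """Kimi K2.5 rates from the table above (USD per 1M tokens, USD per call)."""
    input_per_m: float = 0.60
    cached_input_per_m: float = 0.15  # assumed to apply to K2.5
    output_per_m: float = 3.00
    web_search_per_call: float = 0.005


def kimi_request_cost(fresh_input: int, cached_input: int, output: int,
                      web_searches: int, rates: KimiRates = KimiRates()) -> float:
    """Sum token costs plus the flat per-call web search fee."""
    per_million = 1_000_000
    return (fresh_input / per_million * rates.input_per_m
            + cached_input / per_million * rates.cached_input_per_m
            + output / per_million * rates.output_per_m
            + web_searches * rates.web_search_per_call)


# Sanity check on the "75% savings" figure: 1 - 0.15 / 0.60.
savings = 1 - KimiRates().cached_input_per_m / KimiRates().input_per_m

# Hypothetical agent turn: 20K fresh input, 80K cached input,
# 4K output tokens, and 3 web searches.
cost = kimi_request_cost(20_000, 80_000, 4_000, web_searches=3)
print(f"savings={savings:.0%}, cost=${cost:.3f}")  # prints savings=75%, cost=$0.051
```

In this example the three web searches cost more than any single token term, which is worth keeping in mind for search-heavy agents.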
9. Mistral (including coding models)
Source: https://pricepertoken.com/pricing-page/provider/mistral-ai (updated 2026-03-17)
| Model | Input | Output | Context | Status |
|---|---|---|---|---|
| Devstral Small 2505 | Free | Free | 128K | Current |
| Mistral Nemo | $0.02 | $0.04 | 131K | Current |
| Ministral 3B | $0.04 | $0.04 | 128K | Current |
| Mistral Small 3.2 24B | $0.06 | $0.18 | 131K | Current |
| Devstral Small 1.1 | $0.07 | $0.28 | 131K | Current |
| Mistral Small 3.1 24B | $0.10 | $0.30 | 128K | Current |
| Ministral 8B | $0.10 | $0.10 | 128K | Current |
| Codestral 2508 | $0.30 | $0.90 | 256K | Current |
| Devstral 2 2512 | $0.40 | $0.90 | 262K | Current |
| Devstral Medium | $0.40 | $2.00 | 131K | Current |
| Mistral Medium 3.1 | $0.40 | $2.00 | 131K | Current |
| Mistral Large 3 2512 | $0.50 | $1.50 | 262K | Current |
| Magistral Medium | $2.00 | $5.00 | 40K | Current |

10. xAI (Grok)
| Model | Input | Output | Context | Status |
|---|---|---|---|---|
| Grok 4 | $3.00 | $15.00 | 2M | Current |
| Grok 4.1 Fast | $0.20 | $0.50 | 2M | Current |
11. Cost-Efficiency Ranking (Output ≤ $2/M, Tool Calling Supported)
Sorted by output price, ascending:
| # | Model | Provider | Output $/M | Input $/M | Context | Notes |
|---|---|---|---|---|---|---|
| 1 | GLM-4-32B-0414 | Zhipu | $0.10 | $0.10 | 128K | Cheapest practical-grade |
| 2 | Qwen3.5-9B | Alibaba | $0.15 | $0.05 | 262K | Ultra-light, 262K |
| 3 | Mistral Small 3.2 | Mistral | $0.18 | $0.06 | 131K | European servers |
| 4 | Qwen3 8B | Alibaba | $0.20 | $0.05 | 41K | Ultra-light |
| 5 | Qwen3.5-Flash | Alibaba | $0.26 | $0.07 | 1M | 1M context |
| 6 | Qwen3-30B-A3B | Alibaba | $0.28 | $0.08 | 41K | Lightweight MoE |
| 7 | Devstral Small 1.1 | Mistral | $0.28 | $0.07 | 131K | Coding-specialized |
| 8 | Mistral Small 3.1 | Mistral | $0.30 | $0.10 | 128K | General-purpose |
| 9 | GPT-5 Nano | OpenAI | $0.40 | $0.05 | 128K | Cheapest OpenAI model |
| 10 | GPT-4.1 Nano | OpenAI | $0.40 | $0.10 | 1M | 1M context |
| 11 | Gemini 2.5 Flash-Lite | Google | $0.40 | $0.10 | 1M | 1M context |
| 12 | GLM-4.7-FlashX | Zhipu | $0.40 | $0.07 | 128K | Fast GLM |
| 13 | DeepSeek V3.2 | DeepSeek | $0.42 | $0.28 | 128K | Best value, chat + reasoning |
| 14 | Grok 4.1 Fast | xAI | $0.50 | $0.20 | 2M | Largest context (2M) |
| 15 | Qwen-Plus | Alibaba | $0.78 | $0.26 | 1M | General-purpose |
| 16 | Qwen3 Coder | Alibaba | $0.90 | $0.22 | 262K | Coding-only MoE |
| 17 | Devstral 2 | Mistral | $0.90 | $0.40 | 262K | SWE-bench 72.2% |
| 18 | MiniMax M2.5 | MiniMax | $0.95 | $0.25 | 197K | Agent-compatible |
| 19 | MiniMax-01 | MiniMax | $1.10 | $0.20 | 1M | 1M context |
| 20 | Mistral Large 3 | Mistral | $1.50 | $0.50 | 262K | Large general-purpose |
| 21 | Qwen3.5 Plus | Alibaba | $1.56 | $0.26 | 1M | 1M, latest |
| 22 | MiniMax M1 | MiniMax | $1.76 | $0.40 | 1M | Legacy |
| 23 | Kimi K2 0905 | Moonshot | $1.90 | $0.39 | 262K | Stable over 200+ tool calls |

Free models
| Model | Provider | Context | Status |
|---|---|---|---|
| GLM-4.7-Flash | Zhipu | 128K | Current |
| GLM-4.5-Flash | Zhipu | 128K | Current |
| Devstral Small 2505 | Mistral | 128K | Current |
| Llama 4 | Meta (via providers) | 200K | Current |
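The paid ranking in this section is a filter (output price at most $2/M) followed by an ascending sort on output price. A minimal sketch of that procedure is shown here on a handful of rows copied from the provider tables. The `Price` type and `rank_by_output` function are hypothetical names, and the tool-calling criterion is left out because the tables in this document do not record it per model.

```python
from typing import NamedTuple


class Price(NamedTuple):
    model: str
    provider: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# A small sample of rows from the provider tables, not the full list.
PRICES = [
    Price("Claude Haiku 4.5", "Anthropic", 1.00, 5.00),
    Price("GPT-5 Nano", "OpenAI", 0.05, 0.40),
    Price("DeepSeek V3.2", "DeepSeek", 0.28, 0.42),
    Price("Qwen3.5-9B", "Alibaba", 0.05, 0.15),
    Price("Grok 4.1 Fast", "xAI", 0.20, 0.50),
    Price("Kimi K2.5", "Moonshot", 0.60, 3.00),
]


def rank_by_output(prices: list[Price], max_output: float = 2.00) -> list[Price]:
    """Keep models at or under the output price cap, cheapest output first."""
    eligible = [p for p in prices if p.output_per_m <= max_output]
    # sorted() is stable, so models with equal output prices keep input order.
    return sorted(eligible, key=lambda p: p.output_per_m)


for rank, p in enumerate(rank_by_output(PRICES), start=1):
    print(f"{rank}. {p.model} ({p.provider}): "
          f"${p.output_per_m:.2f} out / ${p.input_per_m:.2f} in")
```

With this sample, Claude Haiku 4.5 and Kimi K2.5 drop out at the $2/M cap, and the remaining four models appear in the same relative order as in the ranking table.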
12. REODE Recommendations by Use Case
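| Use case | Recommended model | Output $/M | Reason |
|---|---|---|---|
| Main agent | Claude Opus 4.6 | $25.00 | Top performance, already integrated |
| Coding sub-agent | Qwen3 Coder ($0.90) or Devstral 2 ($0.90) | $0.90 | Coding-specialized, tool calling |
| Lightweight sub-agent | DeepSeek V3.2 ($0.42) | $0.42 | Cheapest practical-grade |
| High-volume parallel calls | Qwen3.5-9B ($0.15) or GLM-4-32B ($0.10) | $0.10-$0.15 | Ultra-low cost |
| Long context | Grok 4.1 Fast ($0.50, 2M) or GPT-4.1 Nano ($0.40, 1M) | $0.40-$0.50 | 1M+ context |
| Free testing | GLM-4.7-Flash | Free | For development and testing |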
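One way to carry these recommendations into a multi-provider setup is a plain lookup table keyed by use case, with the second recommendation acting as a fallback. This is only a sketch under stated assumptions: the use-case keys and the `pick_model` function are hypothetical, the strings are the display names from the table rather than real API model IDs, and treating "A or B" as an ordered fallback is an interpretation, not something the table specifies.

```python
# Use-case keys are hypothetical; model strings are display names from the
# table above, not provider API identifiers.
RECOMMENDATIONS: dict[str, list[str]] = {
    "main_agent": ["Claude Opus 4.6"],
    "coding_subagent": ["Qwen3 Coder", "Devstral 2"],
    "light_subagent": ["DeepSeek V3.2"],
    "bulk_parallel": ["Qwen3.5-9B", "GLM-4-32B"],
    "long_context": ["Grok 4.1 Fast", "GPT-4.1 Nano"],
    "free_testing": ["GLM-4.7-Flash"],
}


def pick_model(use_case: str, unavailable: frozenset[str] = frozenset()) -> str:
    """Return the first recommended model that is not marked unavailable."""
    if use_case not in RECOMMENDATIONS:
        raise KeyError(f"unknown use case: {use_case!r}")
    for model in RECOMMENDATIONS[use_case]:
        if model not in unavailable:
            return model
    raise LookupError(f"no available model for use case: {use_case!r}")


print(pick_model("coding_subagent"))
print(pick_model("coding_subagent", unavailable=frozenset({"Qwen3 Coder"})))
```

Keeping the mapping as data rather than branching logic makes it easy to update when prices or model status change.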
Sources (official primary sources)
- Z.AI Pricing (GLM): extracted directly from the official pricing page
- DeepSeek Pricing: extracted directly from the official API documentation
- MiniMax Pricing: cross-checked with pricepertoken.com
- Qwen Pricing: cross-checked with pricepertoken.com and alibabacloud.com
- Kimi Pricing: cross-checked with platform.moonshot.ai
- Mistral Pricing: confirmed updated 2026-03-17
- TLDL LLM Pricing March 2026: OpenAI/Anthropic/Google/xAI
- pricepertoken.com Cheapest: overall comparison
- Awesome Agents Pricing: cross-checked