  • Pricing Map of Major LLM Models (as of March 2026)
    Harness/research 2026. 3. 18. 16:31

    Date: 2026-03-18
    Grounding: Extracted directly from official pricing pages and cross-checked against secondary sources.
    Purpose: Reference material for the REODE Multi LLM Provider
    Unit: USD per 1M tokens
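
Because every price in this post is quoted per 1M tokens, converting a table row into a per-request cost is a single scaling step. The sketch below is a minimal illustration; the function name `request_cost_usd` and the token counts are hypothetical, and the prices are the Claude Opus 4.6 row from the Anthropic table.

```python
def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Cost in USD for one request, given prices quoted per 1M tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Claude Opus 4.6: $5.00 input, $25.00 output (per 1M tokens).
# The token counts are made-up example values.
cost = request_cost_usd(10_000, 2_000, 5.00, 25.00)
print(f"${cost:.4f}")
```

Cached input and tiered pricing are not handled here; they need the extra terms described in the provider sections.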


    1. OpenAI

    Source: https://www.tldl.io/resources/llm-api-pricing-2026 (updated in March)

    | Model | Input | Output | Cache Input | Context | Status |
    | --- | --- | --- | --- | --- | --- |
    | GPT-5.2 Pro | $21.00 | $168.00 | $2.10 | 200K | Current |
    | GPT-5.2 | $1.75 | $14.00 | $0.175 | 200K | Current |
    | GPT-5 | $1.25 | $10.00 | $0.125 | 128K | Current |
    | GPT-5 Mini | $0.25 | $2.00 | $0.025 | 200K | Current |
    | GPT-5 Nano | $0.05 | $0.40 | $0.005 | 128K | Current |
    | o4-mini | $1.10 | $4.40 | $0.275 | 200K | Current |
    | o3 | $2.00 | $8.00 | $1.00 | 200K | Current |
    | o3-pro | $20.00 | $80.00 | - | 200K | Current |
    | GPT-4.1 | $2.00 | $8.00 | $0.20 | 1M | Current |
    | GPT-4.1 Mini | $0.40 | $1.60 | $0.04 | 1M | Current |
    | GPT-4.1 Nano | $0.10 | $0.40 | $0.01 | 1M | Current |
    | o1 | $15.00 | $60.00 | $7.50 | 200K | Maintained |
    | GPT-4o | - | - | - | - | Deprecated → GPT-4.1 |
    | GPT-4o mini | - | - | - | - | Deprecated → GPT-4.1 Nano |
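
The two deprecation rows above name a replacement model, which a provider layer could apply automatically. The following is a minimal sketch under that assumption; `DEPRECATED_TO_REPLACEMENT` and `resolve_model` are hypothetical names, and the keys are the display names used in this post, not necessarily the exact API model identifiers.

```python
# Replacement targets copied from the OpenAI table above.
# Keys are this post's display names, not verified API identifiers.
DEPRECATED_TO_REPLACEMENT = {
    "GPT-4o": "GPT-4.1",
    "GPT-4o mini": "GPT-4.1 Nano",
}


def resolve_model(name: str) -> str:
    """Return the replacement for a deprecated model, or the name unchanged."""
    return DEPRECATED_TO_REPLACEMENT.get(name, name)


print(resolve_model("GPT-4o mini"))
print(resolve_model("GPT-5"))
```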

    2. Anthropic

    Source: https://www.tldl.io/resources/llm-api-pricing-2026

    | Model | Input | Output | Cache Input | Context | Status |
    | --- | --- | --- | --- | --- | --- |
    | Claude Opus 4.6 | $5.00 | $25.00 | $0.50 | 200K | Current |
    | Claude Sonnet 4.6 | $3.00 | $15.00 | $0.30 | 200K | Current |
    | Claude Haiku 4.5 | $1.00 | $5.00 | $0.10 | 200K | Current |
    | Claude 3.5 Sonnet | - | - | - | - | Deprecated |
    | Claude 3 Haiku | - | - | - | - | Deprecated |

    3. Google Gemini

    Source: https://www.tldl.io/resources/llm-api-pricing-2026

    | Model | Input | Output | Cache Input | Context | Status |
    | --- | --- | --- | --- | --- | --- |
    | Gemini 3.1 Pro | $2~4 | $12~18 | - | 200K+ | Current |
    | Gemini 3 Flash | $0.50 | $3.00 | - | - | Current |
    | Gemini 2.5 Pro (≤200K) | $1.25 | $10.00 | $0.125 | 2M | Current |
    | Gemini 2.5 Pro (>200K) | $2.50 | $15.00 | $0.25 | 2M | Current |
    | Gemini 2.5 Flash | $0.30 | $2.50 | $0.03 | 1M | Current |
    | Gemini 2.5 Flash-Lite | $0.10 | $0.40 | - | 1M | Current |
    | Gemini 2.0 Flash | $0.10 | $0.40 | $0.025 | 1M | Maintained |
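
Gemini 2.5 Pro is the one entry above with two price tiers, split at 200K. The sketch below shows one way to model that. It assumes the tier is selected by prompt length and then applies to the whole request, which the table itself does not state; the function name is hypothetical.

```python
TIER_THRESHOLD_TOKENS = 200_000


def gemini_25_pro_cost_usd(prompt_tokens: int, output_tokens: int) -> float:
    """Cost in USD using the two Gemini 2.5 Pro rows from the table.

    Assumption: the prompt length picks the tier, and that tier's prices
    apply to both the input and the output of the request.
    """
    if prompt_tokens <= TIER_THRESHOLD_TOKENS:
        input_price, output_price = 1.25, 10.00
    else:
        input_price, output_price = 2.50, 15.00
    total = prompt_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


print(gemini_25_pro_cost_usd(100_000, 1_000))
print(gemini_25_pro_cost_usd(300_000, 1_000))
```

Under this assumption, a prompt just past the threshold is billed at the higher rate for every token, so trimming context below 200K can matter more than the raw token count suggests.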

    4. Zhipu (GLM)

    Source: https://docs.z.ai/guides/overview/pricing (official)

    | Model | Input | Output | Context | Status |
    | --- | --- | --- | --- | --- |
    | GLM-5 | $1.00 | $3.20 | 200K | Current (2026-02) |
    | GLM-5-Turbo | $1.20 | $4.00 | 200K | Current (2026-03-16) |
    | GLM-5-Code | $1.20 | $5.00 | 200K | Current |
    | GLM-4.7 | $0.60 | $2.20 | 128K | Maintained |
    | GLM-4.7-FlashX | $0.07 | $0.40 | 128K | Current |
    | GLM-4.7-Flash | Free | Free | 128K | Current |
    | GLM-4.6 | $0.60 | $2.20 | 128K | Maintained |
    | GLM-4.5 | $0.60 | $2.20 | 128K | Maintained |
    | GLM-4.5-X | $2.20 | $8.90 | 128K | Maintained |
    | GLM-4.5-Air | $0.20 | $1.10 | 128K | Maintained |
    | GLM-4.5-AirX | $1.10 | $4.50 | 128K | Maintained |
    | GLM-4.5-Flash | Free | Free | 128K | Current |
    | GLM-4-32B-0414-128K | $0.10 | $0.10 | 128K | Maintained |
    | GLM-4 | - | - | - | Legacy |
    | GLM-3 | - | - | - | Deprecated |

    Cache: cached input is discounted by 80% on all models. Cache storage is free for a limited time.
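
The 80% cached-input discount changes the effective input price in proportion to the cache hit rate. As a rough sketch (the function name and token counts are hypothetical, and storage fees are ignored since the note above says storage is currently free):

```python
CACHE_DISCOUNT = 0.80  # cached input discount stated in the note above


def input_cost_with_cache_usd(
    total_input_tokens: int,
    cached_tokens: int,
    input_price_per_m: float,
    discount: float = CACHE_DISCOUNT,
) -> float:
    """Input-side cost in USD when part of the prompt is served from cache."""
    if not 0 <= cached_tokens <= total_input_tokens:
        raise ValueError("cached_tokens must be between 0 and total_input_tokens")
    fresh_tokens = total_input_tokens - cached_tokens
    cached_price_per_m = input_price_per_m * (1 - discount)
    total = fresh_tokens * input_price_per_m + cached_tokens * cached_price_per_m
    return total / 1_000_000


# GLM-5 input price is $1.00 per 1M tokens; 800K of 1M tokens hit the cache.
print(round(input_cost_with_cache_usd(1_000_000, 800_000, 1.00), 4))
```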

    5. MiniMax

    Source: https://pricepertoken.com/pricing-page/provider/minimax (cross-checked)

    | Model | Input | Output | Context | Status | Released |
    | --- | --- | --- | --- | --- | --- |
    | MiniMax-M2.5 | $0.25 | $0.95 | 197K | Current | 2026-02 |
    | MiniMax-M2 Her | $0.30 | $1.20 | 66K | Current | 2026-01 |
    | MiniMax-M2.1 | $0.27 | $0.95 | 197K | Maintained | 2025-12 |
    | MiniMax-M2 | $0.255 | $1.00 | 197K | Maintained | 2025-10 |
    | MiniMax-M1 | $0.40 | $1.76 | 1M | Legacy | 2025-06 |
    | MiniMax-01 | $0.20 | $1.10 | 1M | Maintained | - |
    | abab6.5 | - | - | - | Deprecated | - |

    6. DeepSeek

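    Source: https://api-docs.deepseek.com/quick_start/pricing (official)

    | Model | Input | Output | Cache Input | Context | Status |
    | --- | --- | --- | --- | --- | --- |
    | deepseek-chat (V3.2 non-thinking) | $0.28 | $0.42 | $0.028 | 128K | Current |
    | deepseek-reasoner (V3.2 thinking) | $0.28 | $0.42 | $0.028 | 128K | Current |
    | DeepSeek V3.2 Speciale | $0.40 | $1.20 | - | 128K | Current |
    | DeepSeek V3.1 | - | - | - | - | Merged into V3.2 |
    | DeepSeek V3 | - | - | - | - | Merged into V3.2 |
    | DeepSeek R1 | - | - | - | - | Merged into V3.2 reasoning |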

    7. Qwen (Alibaba)

    Source: https://pricepertoken.com/pricing-page/provider/qwen (cross-checked)

    | Model | Input | Output | Context | Status |
    | --- | --- | --- | --- | --- |
    | Qwen3.5-397B-A17B | $0.39 | $0.90 | 262K | Current (2026-02) |
    | Qwen3.5-122B-A10B | $0.26 | $2.08 | 262K | Current |
    | Qwen3.5 Plus | $0.26 | $1.56 | 1M | Current |
    | Qwen3.5-Flash | $0.07 | $0.26 | 1M | Current |
    | Qwen3.5-35B-A3B | $0.163 | $1.00 | 262K | Current |
    | Qwen3.5-27B | $0.195 | $1.56 | 262K | Current |
    | Qwen3.5-9B | $0.05 | $0.15 | 262K | Current |
    | Qwen3 Max | $0.78 | $3.90 | 262K | Current |
    | Qwen3-235B-A22B | $0.455 | $1.82 | - | Current |
    | Qwen3-30B-A3B | $0.08 | $0.28 | 41K | Current |
    | Qwen3 32B | $0.08 | $0.24 | 41K | Current |
    | Qwen3 14B | $0.06 | $0.20 | 41K | Current |
    | Qwen3 8B | $0.05 | $0.20 | 41K | Current |
    | Qwen-Plus | $0.26 | $0.78 | 1M | Maintained |
    | Qwen-Max (legacy) | $1.04 | $4.16 | 33K | Legacy |

    Qwen3 Coder family (per pricepertoken):

    | Model | Input | Output | Context | Status |
    | --- | --- | --- | --- | --- |
    | Qwen3 Coder (480B-A35B) | $0.22 | $0.90 | 262K | Current |
    | Qwen3 Coder-Next | $0.50 | $1.20 | 262K | Current |

    8. Kimi (Moonshot)

    Source: https://costgoat.com/pricing/kimi-api (cross-checked)

    | Model | Input | Output | Context | Status |
    | --- | --- | --- | --- | --- |
    | Kimi K2.5 | $0.60 | $3.00 | 262K | Current (2026-01) |
    | Kimi K2 Thinking | $0.60 | $2.50 | 262K | Current |
    | Kimi K2 0905 | $0.60 | $2.50 | 262K | Maintained |
    | Kimi K2 0711 | $0.60 | $2.50 | 131K | Maintained |
    | Kimi K2 Turbo | $1.15 | $8.00 | 262K | Current |
    | Kimi K2 Thinking Turbo | $1.15 | $8.00 | 262K | Current |
    | Moonshot V1 8K | $0.20 | $2.00 | 8K | Legacy |
    | Moonshot V1 32K | $1.00 | $3.00 | 32K | Legacy |
    | Moonshot V1 128K | $2.00 | $5.00 | 131K | Legacy |

    Cache: $0.15/M (75% savings). Web search: $0.005/call.
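
For agent workloads, the per-call web search fee adds a term that does not depend on tokens. The sketch below combines the Kimi K2.5 row with the cache and web search rates above. The function name and workload numbers are hypothetical, and it assumes the $0.15/M cache rate applies to K2.5, which matches the stated 75% saving on its $0.60 input price.

```python
KIMI_K25_INPUT_PER_M = 0.60
KIMI_K25_CACHED_INPUT_PER_M = 0.15
KIMI_K25_OUTPUT_PER_M = 3.00
WEB_SEARCH_PER_CALL = 0.005


def kimi_k25_session_cost_usd(
    fresh_input_tokens: int,
    cached_input_tokens: int,
    output_tokens: int,
    web_search_calls: int,
) -> float:
    """Session cost in USD: token charges plus a flat fee per web search call."""
    token_cost = (
        fresh_input_tokens * KIMI_K25_INPUT_PER_M
        + cached_input_tokens * KIMI_K25_CACHED_INPUT_PER_M
        + output_tokens * KIMI_K25_OUTPUT_PER_M
    ) / 1_000_000
    return token_cost + web_search_calls * WEB_SEARCH_PER_CALL


# Made-up workload: 200K fresh input, 800K cached input, 50K output, 10 searches.
print(round(kimi_k25_session_cost_usd(200_000, 800_000, 50_000, 10), 4))
```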

    9. Mistral (including coding models)

    Source: https://pricepertoken.com/pricing-page/provider/mistral-ai (updated 2026-03-17)

    | Model | Input | Output | Context | Status |
    | --- | --- | --- | --- | --- |
    | Devstral Small 2505 | Free | Free | 128K | Current |
    | Mistral Nemo | $0.02 | $0.04 | 131K | Current |
    | Ministral 3B | $0.04 | $0.04 | 128K | Current |
    | Mistral Small 3.2 24B | $0.06 | $0.18 | 131K | Current |
    | Devstral Small 1.1 | $0.07 | $0.28 | 131K | Current |
    | Mistral Small 3.1 24B | $0.10 | $0.30 | 128K | Current |
    | Ministral 8B | $0.10 | $0.10 | 128K | Current |
    | Codestral 2508 | $0.30 | $0.90 | 256K | Current |
    | Devstral 2 2512 | $0.40 | $0.90 | 262K | Current |
    | Devstral Medium | $0.40 | $2.00 | 131K | Current |
    | Mistral Medium 3.1 | $0.40 | $2.00 | 131K | Current |
    | Mistral Large 3 2512 | $0.50 | $1.50 | 262K | Current |
    | Magistral Medium | $2.00 | $5.00 | 40K | Current |

    10. xAI (Grok)

    Source: https://www.tldl.io/resources/llm-api-pricing-2026

    | Model | Input | Output | Context | Status |
    | --- | --- | --- | --- | --- |
    | Grok 4 | $3.00 | $15.00 | 2M | Current |
    | Grok 4.1 Fast | $0.20 | $0.50 | 2M | Current |

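    11. Cost-Efficiency Ranking (Output ≤ $2/M, tool calling supported)

    Sorted by output price, ascending:

    | # | Model | Provider | Output $/M | Input $/M | Context | Notes |
    | --- | --- | --- | --- | --- | --- | --- |
    | 1 | GLM-4-32B-0414 | Zhipu | $0.10 | $0.10 | 128K | Cheapest practical-grade model |
    | 2 | Qwen3.5-9B | Alibaba | $0.15 | $0.05 | 262K | Ultra-light, 262K context |
    | 3 | Mistral Small 3.2 | Mistral | $0.18 | $0.06 | 131K | EU-hosted servers |
    | 4 | Qwen3 8B | Alibaba | $0.20 | $0.05 | 41K | Ultra-light |
    | 5 | Qwen3.5-Flash | Alibaba | $0.26 | $0.07 | 1M | 1M context |
    | 6 | Qwen3-30B-A3B | Alibaba | $0.28 | $0.08 | 41K | Lightweight MoE |
    | 7 | Devstral Small 1.1 | Mistral | $0.28 | $0.07 | 131K | Coding-specialized |
    | 8 | Mistral Small 3.1 | Mistral | $0.30 | $0.10 | 128K | General-purpose |
    | 9 | GPT-5 Nano | OpenAI | $0.40 | $0.05 | 128K | Cheapest OpenAI model |
    | 10 | GPT-4.1 Nano | OpenAI | $0.40 | $0.10 | 1M | 1M context |
    | 11 | Gemini 2.5 Flash-Lite | Google | $0.40 | $0.10 | 1M | 1M context |
    | 12 | GLM-4.7-FlashX | Zhipu | $0.40 | $0.07 | 128K | Fast GLM variant |
    | 13 | DeepSeek V3.2 | DeepSeek | $0.42 | $0.28 | 128K | Best value; chat + reasoning |
    | 14 | Grok 4.1 Fast | xAI | $0.50 | $0.20 | 2M | Largest context (2M) |
    | 15 | Qwen-Plus | Alibaba | $0.78 | $0.26 | 1M | General-purpose |
    | 16 | Qwen3 Coder | Alibaba | $0.90 | $0.22 | 262K | Coding-only MoE |
    | 17 | Devstral 2 | Mistral | $0.90 | $0.40 | 262K | SWE-bench 72.2% |
    | 18 | MiniMax-M2.5 | MiniMax | $0.95 | $0.25 | 197K | Agent-compatible |
    | 19 | MiniMax-01 | MiniMax | $1.10 | $0.20 | 1M | 1M context |
    | 20 | Mistral Large 3 | Mistral | $1.50 | $0.50 | 262K | Large general-purpose |
    | 21 | Qwen3.5 Plus | Alibaba | $1.56 | $0.26 | 1M | 1M context, latest |
    | 22 | MiniMax-M1 | MiniMax | $1.76 | $0.40 | 1M | Legacy |
    | 23 | Kimi K2 0905 | Moonshot | $1.90 | $0.39 | 262K | Stable across 200+ tool calls |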
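
The ranking above amounts to a filter (output price at most $2/M, tool calling supported) followed by a sort on output price. A minimal sketch of that procedure follows. The `ModelPrice` type and the `tool_calling` field are hypothetical, since the per-provider tables carry no tool-calling column, and the catalog is a small sample of rows from this post plus Claude Haiku 4.5 to show a model being filtered out on price.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    name: str
    provider: str
    input_per_m: float
    output_per_m: float
    tool_calling: bool  # assumed flag; not a column in the tables above


# Sample rows only; prices copied from the tables in this post.
CATALOG = [
    ModelPrice("DeepSeek V3.2", "DeepSeek", 0.28, 0.42, True),
    ModelPrice("GLM-4-32B-0414", "Zhipu", 0.10, 0.10, True),
    ModelPrice("Claude Haiku 4.5", "Anthropic", 1.00, 5.00, True),
    ModelPrice("Qwen3.5-9B", "Alibaba", 0.05, 0.15, True),
    ModelPrice("Grok 4.1 Fast", "xAI", 0.20, 0.50, True),
]


def rank_by_output_price(catalog, max_output_per_m=2.00):
    """Keep tool-calling models at or under the output cap, cheapest first."""
    eligible = [
        m for m in catalog
        if m.tool_calling and m.output_per_m <= max_output_per_m
    ]
    # sorted() is stable, so models with equal output price keep catalog order.
    return sorted(eligible, key=lambda m: m.output_per_m)


for rank, m in enumerate(rank_by_output_price(CATALOG), start=1):
    print(rank, m.name, m.provider, f"${m.output_per_m:.2f}", f"${m.input_per_m:.2f}")
```

Ties are left in catalog order rather than broken by input price, which matches the table, where rows with equal output prices are not ordered by input price.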

    Free models

    | Model | Provider | Context | Status |
    | --- | --- | --- | --- |
    | GLM-4.7-Flash | Zhipu | 128K | Current |
    | GLM-4.5-Flash | Zhipu | 128K | Current |
    | Devstral Small 2505 | Mistral | 128K | Current |
    | Llama 4 | Meta (via providers) | 200K | Current |

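    12. Recommendations by REODE Use Case

    | Use case | Recommended model | Output $/M | Rationale |
    | --- | --- | --- | --- |
    | Main agent | Claude Opus 4.6 | $25.00 | Best performance; already integrated |
    | Coding sub-agent | Qwen3 Coder ($0.90) or Devstral 2 ($0.90) | $0.90 | Coding-specialized, tool calling |
    | Lightweight sub-agent | DeepSeek V3.2 ($0.42) | $0.42 | Practical-grade at low cost |
    | High-volume parallel calls | Qwen3.5-9B ($0.15) or GLM-4-32B ($0.10) | $0.10~0.15 | Ultra-low cost |
    | Long context | Grok 4.1 Fast ($0.50, 2M) or GPT-4.1 Nano ($0.40, 1M) | $0.40~0.50 | 1M+ context |
    | Free testing | GLM-4.7-Flash | Free | For development and testing |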
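
The recommendations above can be expressed as a small routing table. This is only a sketch of one possible shape, not REODE's actual configuration: the use-case keys, `ROUTES`, and `pick_model` are hypothetical names, and treating "A or B" as an ordered fallback list is an assumption.

```python
# Use case -> candidate models, in the order listed in the table above.
# Treating the second entry as a fallback is an assumption of this sketch.
ROUTES = {
    "main_agent": ["Claude Opus 4.6"],
    "coding_subagent": ["Qwen3 Coder", "Devstral 2"],
    "light_subagent": ["DeepSeek V3.2"],
    "bulk_parallel": ["Qwen3.5-9B", "GLM-4-32B"],
    "long_context": ["Grok 4.1 Fast", "GPT-4.1 Nano"],
    "free_testing": ["GLM-4.7-Flash"],
}


def pick_model(use_case: str, unavailable: frozenset = frozenset()) -> str:
    """Return the first candidate for a use case that is not marked unavailable."""
    for candidate in ROUTES[use_case]:
        if candidate not in unavailable:
            return candidate
    raise LookupError(f"no available model for use case {use_case!r}")


print(pick_model("coding_subagent"))
print(pick_model("coding_subagent", frozenset({"Qwen3 Coder"})))
```

An unknown use case raises `KeyError` from the dictionary lookup, which keeps configuration mistakes visible instead of silently falling back to an expensive default.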
