-
Scan API(LLM x2) Performance Test - VU 1000 (Tier 4)이코에코(Eco²)/Performance 2026. 1. 27. 15:28




Date: 2026-01-27 15:12 KST (06:12 UTC)
Test Type: k6 Scan Polling Test (Tier 4 Validation)
Target VUs: 1000
Purpose: OpenAI Tier 4 승격 후 Rate Limit 해소 검증
Snapshot: https://snapshots.raintank.io/dashboard/snapshot/QpoxrHswLzzh7CpkepGlP7IZkymTZTcXGrafana
snapshots.raintank.io
Executive Summary
OpenAI Tier 4 승격 + Auto-recharge 설정 후 VU 1000 재테스트 결과, 96.6% 성공률을 달성하여 이전 테스트(92.5%) 대비 4.1%p 개선되었습니다. Rate Limit(TPM) 에러와 Quota 소진 에러가 완전히 해소되었으며, 53건의 실패는 Worker 재시작 및 Polling Timeout으로 인한 것으로 분석됩니다.
지표 Tier 4 (현재) Tier 3 (이전) 개선 평가 성공률 96.6% 92.5% +4.1%p PASS 실패 건수 53 97 -45% 개선 HTTP 429 에러 0 1,017 -100% 해소 Quota 소진 0 166 -100% 해소 E2E P95 166.7s 175.4s -5% 개선 Throughput 378.7 req/m 301.1 req/m +26% 개선
1. Test Configuration
test_info: target_vus: 1000 test_script: k6-scan-polling-test.js endpoint: https://api.dev.growbin.app/api/v1/scan timestamp: 2026-01-27T06:12:42.942Z poll_timeout: 300s max_poll_attempts: 150 openai_tier: 4 # Upgraded from Tier 3 auto_recharge: enabledTest Timeline
Phase Time (UTC) Time (KST) Event Test Start 06:09:42 15:09:42 Ramp-up 시작 Worker Scale-up 06:09:50 15:09:50 1 → 3 replicas Steady State 06:10:12 15:10:12 VU 1000 도달 Worker Restart 06:10:53 15:10:53 Memory pressure Test End 06:12:42 15:12:42 JSON 결과 저장 Worker Scale-down 06:17:00 15:17:00 3 → 1 replicas
2. Test Results
2.1 Throughput & Success Rate
Metric Tier 4 Tier 3 (이전) Delta VU 600 Total Submitted 1,592 1,645 -3.2% 1,408 Total Completed 1,494 1,204 +24% 1,401 Total Failed 53 97 -45% 4 Success Rate 96.6% 92.5% +4.1%p 99.7% Reward Rate 0.0% 0.0% - 0.0% Throughput 378.7 req/m 301.1 req/m +26% 358.6 req/m 2.2 Latency Distribution
Metric Tier 4 Tier 3 (이전) Delta Target Status Scan Submit P95 863ms 787ms +10% < 500ms FAIL Poll P95 2,547ms 2,772ms -8% < 500ms FAIL E2E P95 166.7s 175.4s -5% < 30s FAIL E2E Average 117.6s 121.4s -3% < 20s FAIL 2.3 Polling Statistics
Metric Tier 4 Tier 3 (이전) Delta Total Poll Requests 73,070 74,922 -2.5% Avg Polls per Task 46.2 44.9 +3% Poll Interval ~2s ~2s -
3. Rate Limit Analysis (Tier 4 vs Tier 3)
3.1 OpenAI API Limits Comparison
Limit Type Tier 3 Tier 4 증가율 TPM (Tokens Per Minute) 2,000,000 4,000,000 +100% RPM (Requests Per Minute) 10,000 10,000 0% 3.2 Error Comparison
Error Type Tier 4 (현재) Tier 3 (이전) 개선 HTTP 429 Total 0 1,017 -100% Rate Limit (TPM) 0 361 -100% Quota Exhausted 0 166 -100% Answer Failed 0 225 -100% Vision Failed 0 19 -100% 3.3 TPM Usage Analysis
Tier 4 Test TPM Estimation: ├─ Completed tasks: 1,494 ├─ Estimated tokens/task: ~5,000 ├─ Total tokens (3min): ~7,470,000 ├─ TPM Average: ~2,490,000 tokens/min ├─ TPM Limit: 4,000,000 └─ Usage: 62% of limit (Safe Zone)
4. Failure Analysis (53건)
4.1 Failure Breakdown
원인 건수 비율 설명 Worker 재시작 ~30 57% 06:10:53 UTC 재시작으로 인한 Task 손실 Polling Timeout ~20 38% E2E > 300s 기타 ~3 5% 네트워크, Scale-down 시점 Total 53 100% - 4.2 Worker Restart Analysis
┌─────────────────────────────────────────────────────────────────────────┐ │ Worker Restart Timeline │ ├─────────────────────────────────────────────────────────────────────────┤ │ │ │ 06:09:42 ─── Test Start │ │ │ │ │ 06:09:50 ─── KEDA Scale-up (1 → 3) │ │ │ Workers: wtxmw, nt895, rbkz6 │ │ │ │ │ 06:10:53 ─── Worker Restart (Memory Pressure) │ │ │ wtxmw restarted │ │ │ → In-flight tasks lost (~30) │ │ │ │ │ 06:12:42 ─── Test End │ │ │ │ │ 06:17:00 ─── KEDA Scale-down (3 → 1) │ │ "All metrics below target" │ │ │ └─────────────────────────────────────────────────────────────────────────┘4.3 Rate Limit vs Worker Restart: 실패 원인 비교
테스트 Rate Limit 실패 Worker 재시작 실패 Timeout 실패 Tier 3 VU 1000 80건 (82%) 5건 (5%) 12건 (12%) Tier 4 VU 1000 0건 (0%) 30건 (57%) 20건 (38%)
5. Infrastructure Metrics
5.1 KEDA Scaling Events
Time (UTC) Component Replicas Trigger 06:09:50 scan-worker 1 → 3 scan.vision queue > 10 06:09:55 scan-api 1 → 3 CPU threshold 06:17:00 scan-worker 3 → 1 All metrics below target 5.2 RabbitMQ Queue Depth
Queue Peak Tier 3 Peak Delta scan.vision 532 795 -33% scan.rule 560 509 +10% scan.answer 221 312 -29% scan.reward 138 164 -16% 분석: Queue depth 감소 → Rate Limit 미발생으로 처리 속도 향상
5.3 Worker Resource Usage
Pod CPU Peak Memory Peak Note scan-worker-wtxmw 0.88 cores 1,026MB Original (restarted) scan-worker-nt895 0.89 cores 553MB Scale-up scan-worker-rbkz6 0.89 cores 550MB Scale-up scan-worker-canary 0.89 cores 1,047MB Canary 5.4 Redis Memory Usage
Instance Peak Tier 3 Peak Delta rfr-cache 204MB 167MB* +22% rfr-streams 59MB 58MB +2% rfr-pubsub 4MB 2MB - *Tier 3 테스트는 VU 600 기준
6. Prometheus Queries Used
6.1 CPU & Memory
# Worker CPU Usage sum(rate(container_cpu_usage_seconds_total{namespace="scan",pod=~"scan-worker.*",container="scan-worker"}[1m])) by (pod) # Worker Memory Usage sum(container_memory_working_set_bytes{namespace="scan",pod=~"scan-worker.*",container="scan-worker"}) by (pod) / 1024 / 10246.2 Queue & Scaling
# RabbitMQ Queue Depth rabbitmq_queue_messages{queue=~"scan.*"} # HPA Replica Count kube_horizontalpodautoscaler_status_current_replicas{horizontalpodautoscaler=~".*scan.*"}6.3 Test Time Range
Start: 2026-01-27T06:09:42Z (epoch: 1769494182) End: 2026-01-27T06:12:42Z (epoch: 1769494362)
7. Tier 4 Upgrade Impact Summary
7.1 Before vs After
지표 Tier 3 VU 1000 Tier 4 VU 1000 효과 성공률 92.5% 96.6% +4.1%p Rate Limit 에러 1,017건 0건 해소 Quota 소진 166건 0건 해소 Throughput 301.1 req/m 378.7 req/m +26% E2E P95 175.4s 166.7s -5% 7.2 Key Findings
- Rate Limit 완전 해소: TPM 4M 한도로 VU 1000 부하 수용
- Quota 소진 방지: Auto-recharge로 잔액 자동 충전
- 처리량 26% 향상: Rate Limit 대기 없이 연속 처리
- Queue Depth 감소: 평균 25% 감소로 처리 효율 향상
- 잔여 실패 원인: Worker 재시작(57%) + Polling Timeout(38%)
8. Appendix
8.1 Raw Test Output
{ "test_info": { "target_vus": 1000, "duration_seconds": 1769494126.207721 }, "results": { "total_submitted": 1592, "total_completed": 1494, "total_failed": 53, "success_rate": "96.6%", "reward_rate": "0.0%" }, "latency": { "scan_submit_p95": "863ms", "poll_p95": "2547ms", "e2e_p95": "166.7s", "e2e_avg": "117.6s" }, "polling": { "total_poll_requests": 73070, "avg_polls_per_task": "46.2" }, "throughput": { "requests_per_minute": "378.7 req/m" } }8.2 Worker Log Summary (Current Container)
Metric Main Worker Canary Total scan.vision succeeded 174 205 379 scan.rule succeeded 214 - 214+ scan.answer succeeded 250 272 522 scan.reward succeeded 302 - 302+ HTTP 429 errors 0 0 0 Rate limit retries 0 0 0 8.3 Related Files
- Test Script:
e2e-tests/performance/k6-scan-polling-test.js - Result JSON:
k6-scan-polling-vu1000-2026-01-27T06-12-42-942Z.json - Previous Test Report:
docs/blogs/tests/2026-01-27-scan-load-test-vu1000.md - Error Analysis:
docs/blogs/tests/2026-01-27-error-analysis.md
9. Conclusion
Tier 4 승격 효과 확정
항목 결과 Rate Limit 해소 TPM 4M으로 VU 1000 수용 Quota 소진 방지 Auto-recharge 정상 작동 성공률 개선 92.5% → 96.6% (+4.1%p) 처리량 향상 301.1 → 378.7 req/m (+26%) 잔여 과제
- Worker 안정성: Memory limit 증가로 재시작 방지 필요
- Polling Timeout: E2E P95 166.7s로 SLA(30s) 초과 지속
- 95% SLA 달성: Worker 재시작 방지 시 달성 가능
'이코에코(Eco²) > Performance' 카테고리의 다른 글
Scan API(LLM x2) Performance Test - VU 800 (0) 2026.01.27 Scan API(LLM x2) Performance Test - VU 700 (0) 2026.01.27 Scan API(LLM x2) Performance Test - VU 1000 (Tier 3) (0) 2026.01.27 Scan API(LLM x2) Performance Test - VU 600 (0) 2026.01.27 Scan API(LLM x2) Performance Test - VU 500 (0) 2026.01.27