-
Scan API(LLM x2) Performance Test - VU 900이코에코(Eco²)/Performance 2026. 1. 27. 16:59





Date: 2026-01-27 16:43 KST (07:43 UTC)
Test Type: k6 Scan Polling Test
Target VUs: 900
OpenAI Tier: 4 (TPM 4,000,000)
Snapshot: https://snapshots.raintank.io/dashboard/snapshot/WBoZVhb5eXwfaFcphMAubQwrEMXJrPFpGrafana
snapshots.raintank.io
Executive Summary
VU 900 테스트에서 99.7% 성공률을 달성하였습니다. VU 800과 동일한 성공률을 유지하면서 처리량이 10% 향상되었습니다.
Rate Limit 에러 없이 안정적으로 처리되었으며, 총 요청수 증가 대비 시스템 추가 부하가 적어 RPM이 개선된 점이 인상적입니다.
지표 값 VU 800 대비 평가 성공률 99.7% 0% PASS 실패 건수 4 0 PASS HTTP 429 에러 0 - 정상 E2E P95 149.6s +3% FAIL (SLA) Throughput 405.5 req/m +10% 개선
1. Test Configuration
test_info: target_vus: 900 test_script: k6-scan-polling-test.js endpoint: https://api.dev.growbin.app/api/v1/scan timestamp: 2026-01-27T07:43:04.402Z poll_timeout: 300s max_poll_attempts: 150 openai_tier: 4Test Timeline
Phase Time (UTC) Time (KST) Event Test Start 07:40:04 16:40:04 Ramp-up 시작 Worker Scale-up 07:40:16 16:40:16 1 → 3 replicas API Scale-up 07:40:26 16:40:26 1 → 2 replicas API Scale-up 07:40:56 16:40:56 2 → 3 replicas Steady State 07:40:34 16:40:34 VU 900 도달 Test End 07:43:04 16:43:04 JSON 결과 저장
2. Test Results
2.1 Throughput & Success Rate
Metric VU 900 VU 800 VU 1000 Total Submitted 1,540 1,386 1,592 Total Completed 1,532 1,378 1,494 Total Failed 4 4 53 Success Rate 99.7% 99.7% 96.6% Reward Rate 0.0% 0.0% 0.0% Throughput 405.5 req/m 367.3 req/m 378.7 req/m 2.2 Latency Distribution
Metric Value VU 800 Target Status Scan Submit P95 635ms 734ms < 500ms FAIL Poll P95 2,494ms 2,110ms < 500ms FAIL E2E P95 149.6s 144.6s < 30s FAIL E2E Average 102.2s 97.9s < 20s FAIL 2.3 Polling Statistics
Metric Value VU 800 Total Poll Requests 63,499 52,845 Avg Polls per Task 41.2 38.1 Poll Interval ~2s ~2s
3. Error Analysis
3.1 Error Summary
Error Type Count Status HTTP 429 (Rate Limit) 0 정상 Rate Limit Retries 0 정상 Quota Exhausted 0 정상 Answer Failed 0 정상 Vision Failed 0 정상 3.2 Failure Breakdown (4건)
원인 건수 비율 설명 Polling Timeout ~4 100% E2E > 300s Rate Limit 0 0% Tier 4 TPM 4M 내 Quota 소진 0 0% Auto-recharge Total 4 100% - 3.3 TPM Usage Analysis
VU 900 TPM Estimation: ├─ Completed tasks: 1,532 ├─ Estimated tokens/task: ~5,000 ├─ Total tokens (3min): ~7,660,000 ├─ TPM Average: ~2,553,000 tokens/min ├─ TPM Limit: 4,000,000 (Tier 4) └─ Usage: 64% of limit (Yellow Zone)
4. Component Logs
4.1 Scan Worker Logs
Worker Task Processing (07:42-07:43 UTC): ├─ scan.answer succeeded: ~9-40s per task ├─ scan.reward succeeded: ~0.03-0.5s per task ├─ Rate Limit Errors: 0 └─ All tasks completed successfullyRate Limit Check per Pod:
Pod Rate Limit Errors scan-worker-78f8ccdc9-5srpg 0 scan-worker-78f8ccdc9-9w4gm 0 scan-worker-78f8ccdc9-ffqdc 0 scan-worker-canary 0 Total 0 4.2 Event Router Logs
INFO: GET /ready HTTP/1.1 200 OK INFO: GET /health HTTP/1.1 200 OK INFO: GET /metrics HTTP/1.1 200 OKStatus: 정상 (Health/Ready probe 모두 성공)
4.3 SSE Gateway Logs
INFO: GET /ready HTTP/1.1 200 OK INFO: GET /health HTTP/1.1 200 OK INFO: GET /metrics HTTP/1.1 200 OKStatus: 정상 (Health/Ready probe 모두 성공)
4.4 Redis State (Post-Test)
Instance Current Peak Max Status rfr-cache 306.30MB 512.26MB 512MB 정상 (누적) rfr-streams 70.35MB 75.48MB - 정상 rfr-pubsub 2.07MB 9.01MB - 정상 Peak 512.26MB는 VU 500~1000 연속 테스트의 누적 결과이며, VU 900 단독 테스트의 메모리 사용량은 아닙니다.
5. Infrastructure Metrics
5.1 KEDA Scaling Events
Time Component Replicas Trigger 07:40:16 scan-worker 1 → 3 scan.vision queue > 10 07:40:26 scan-api 1 → 2 CPU threshold 07:40:56 scan-api 2 → 3 CPU threshold 5.2 Worker Resource Usage (Post-Test)
Pod CPU Memory Node scan-worker-78f8ccdc9-5srpg 136m 491Mi k8s-worker-ai scan-worker-78f8ccdc9-9w4gm 229m 601Mi k8s-worker-ai-2 scan-worker-78f8ccdc9-ffqdc 139m 564Mi k8s-worker-ai-2 scan-worker-canary 98m 446Mi k8s-worker-ai 5.3 API Resource Usage (Post-Test)
Pod CPU Memory Node scan-api-589c5d4c69-6phnw 5m 207Mi k8s-api-scan scan-api-589c5d4c69-95cm8 5m 144Mi k8s-api-scan scan-api-589c5d4c69-jkjq5 5m 135Mi k8s-api-scan scan-api-canary 5m 104Mi k8s-api-scan 5.4 Worker Restart History
Pod Restarts Note scan-worker-78f8ccdc9-5srpg 2 테스트 중 재시작 scan-worker-canary 7 누적 재시작 (4일간)
6. VU Progression Summary
6.1 VU별 성능 비교 (Tier 4)
VU 요청 수 완료 실패 성공률 E2E P95 Throughput Rate Limit 600 1,408 1,401 4 99.7% 108.3s 358.6 req/m 0 700 1,496 1,313 11 99.2% 122.3s 329.1 req/m 0 800 1,386 1,378 4 99.7% 144.6s 367.3 req/m 0 900 1,540 1,532 4 99.7% 149.6s 405.5 req/m 0 1000 1,592 1,494 53 96.6% 166.7s 378.7 req/m 0 6.2 운영 권장 범위 (Tier 4 기준) - 업데이트
범위 VU 성공률 E2E P95 TPM 사용률 Green Zone 50-400 99.9%+ < 65s < 40% Yellow Zone 400-600 99.5%+ 65-110s 40-55% Orange Zone 600-800 99%+ 110-145s 55-60% Red Zone 800-900 99%+ 145-150s 60-65% Critical Zone 900-1000 96%+ 150-170s 65-70%
7. Key Findings
7.1 성능 안정성
항목 VU 800 VU 900 변화 성공률 99.7% 99.7% 유지 실패 건수 4 4 유지 Rate Limit 0 0 유지 7.2 처리량 향상
VU 900에서 VU 800 대비 처리량 10% 향상 (367.3 → 405.5 req/m)
- Workers 3개가 이미 Warm 상태로 유지
- KEDA Scale-up 지연 없이 즉시 처리
7.3 Redis Cache 메모리 현황
┌─────────────────────────────────────────────────────────────────────────┐ │ Redis Cache Memory Status │ ├─────────────────────────────────────────────────────────────────────────┤ │ │ │ Current: 306.30MB ████████████████████░░░░░░░░░░░░░░░░ 60% │ │ Peak: 512.26MB (VU 500~1000 연속 테스트 누적) │ │ Max: 512.00MB ─────────────────────────────────── LIMIT │ │ │ │ ✓ VU 900 단독 테스트 시 정상 범위 내 동작 │ │ │ └─────────────────────────────────────────────────────────────────────────┘
8. Recommendations
8.1 현재 상태 평가
항목 상태 비고 Rate Limit 정상 TPM 64% 사용 Quota 정상 Auto-recharge 작동 성공률 양호 99.7% > 95% SLA 레이턴시 초과 E2E P95 149.6s > 30s Redis Cache 정상 306.30MB (Peak는 연속 테스트 누적) 8.2 권장 조치
Action Priority Expected Impact minReplicaCount 1 → 2 HIGH Cold Start 방지 maxReplicaCount 3 → 5 MEDIUM 더 높은 VU 지원
9. Appendix
9.1 Raw Test Output
{ "test_info": { "target_vus": 900, "duration_seconds": 1769499557.739096 }, "results": { "total_submitted": 1540, "total_completed": 1532, "total_failed": 4, "success_rate": "99.7%", "reward_rate": "0.0%" }, "latency": { "scan_submit_p95": "635ms", "poll_p95": "2494ms", "e2e_p95": "149.6s", "e2e_avg": "102.2s" }, "polling": { "total_poll_requests": 63499, "avg_polls_per_task": "41.2" }, "throughput": { "requests_per_minute": "405.5 req/m" } }9.2 Prometheus Query Reference
# Time Range start: 2026-01-27T07:40:04Z end: 2026-01-27T07:43:04Z # Worker CPU sum(rate(container_cpu_usage_seconds_total{namespace="scan",pod=~"scan-worker.*"}[1m])) by (pod) # Worker Memory sum(container_memory_working_set_bytes{namespace="scan",pod=~"scan-worker.*"}) by (pod) / 1024 / 1024 # Queue Depth rabbitmq_queue_messages{queue=~"scan.*"} # Redis Memory redis_memory_used_bytes{namespace="redis"}9.3 Related Files
- Test Script:
e2e-tests/performance/k6-scan-polling-test.js - Result JSON:
k6-scan-polling-vu900-2026-01-27T07-43-04-402Z.json - VU 800 Report:
docs/blogs/tests/2026-01-27-scan-load-test-vu800.md - VU 1000 Tier 4 Report:
docs/blogs/tests/2026-01-27-scan-load-test-vu1000-tier4.md
10. Conclusion
VU 900 테스트 결과 요약
항목 결과 성공률 99.7% (PASS) Rate Limit 0건 (Tier 4 정상) 실패 원인 Polling Timeout (4건) TPM 사용률 ~64% (Yellow Zone) Redis Cache 306.30MB (정상) - VU 900 안정적 처리: 99.7% 성공률 유지 (VU 800과 동일)
- 처리량 향상: VU 800 대비 10% 증가 (405.5 req/m)
- Rate Limit 없음: Tier 4 TPM 64% 사용으로 안전
- 성능 안정성: VU 700~900 구간에서 일관된 99%+ 성공률
'이코에코(Eco²) > Performance' 카테고리의 다른 글
Scan API(LLM x2) Performance Test - VU 1000 [minReplica:2, maxReplica:5] (0) 2026.01.28 Scan API(LLM x2) Performance Test - VU 800 (0) 2026.01.27 Scan API(LLM x2) Performance Test - VU 700 (0) 2026.01.27 Scan API(LLM x2) Performance Test - VU 1000 (Tier 4) (0) 2026.01.27 Scan API(LLM x2) Performance Test - VU 1000 (Tier 3) (0) 2026.01.27