Scan API(LLM x2) Performance Test - VU 500
이코에코(Eco²)/Performance · 2026. 1. 27. 13:19






Date: 2026-01-27 04:00 KST
Test Type: k6 Scan Polling Test
Target VUs: 500
Grafana Snapshot: Dashboard Link (snapshots.raintank.io)
Executive Summary
| Metric | Value | Assessment |
|---|---|---|
| Success Rate | 100% | PASS |
| E2E P95 | 83.3s | WARN |
1. Test Configuration
test_info:
  target_vus: 500
  test_script: k6-scan-polling-test.js
  endpoint: https://api.dev.growbin.app/api/v1/scan
  timestamp: 2026-01-27T04:00:09.595Z

Test Timeline
| Phase | Time (UTC) | Time (KST) | Duration |
|---|---|---|---|
| Ramp-up Start | 03:57:10 | 12:57:10 | - |
| Steady State | 03:57:40 | 12:57:40 | 2m |
| Ramp-down Start | 03:59:40 | 12:59:40 | - |
| Test End | 04:01:13 | 13:01:13 | - |
| Total Duration | - | - | ~4m |

Grafana Dashboard
Snapshot URL: View Dashboard
Key panels:
- Scan API Request Rate
- Worker CPU/Memory Usage
- RabbitMQ Queue Depth
- E2E Latency Distribution
- KEDA Scaling Events
2. Test Results
2.1 Throughput & Success Rate
| Metric | Value |
|---|---|
| Total Submitted | 1,336 |
| Total Completed | 1,336 |
| Total Failed | 0 |
| Success Rate | 100.0% |
| Reward Rate | 0.0% |
| Throughput | 367.9 req/m |

2.2 Latency Distribution
| Metric | Value | Target | Status |
|---|---|---|---|
| Scan Submit P95 | 232ms | < 500ms | PASS |
| Poll P95 | 349ms | < 500ms | PASS |
| E2E P95 | 83.3s | < 30s | FAIL |
| E2E Average | 59.2s | < 20s | FAIL |

2.3 Polling Statistics
| Metric | Value |
|---|---|
| Total Poll Requests | 38,860 |
| Avg Polls per Task | 29.1 |
| Poll Interval | ~2s (estimated) |
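The estimated poll interval can be cross-checked from the measured totals alone. A minimal sketch in plain arithmetic (no project code assumed):

```python
# Cross-check the "~2s (estimated)" poll interval from the measured totals.
total_poll_requests = 38_860
completed_tasks = 1_336
e2e_avg_s = 59.2

avg_polls_per_task = total_poll_requests / completed_tasks  # ≈ 29.1
poll_interval_s = e2e_avg_s / avg_polls_per_task            # ≈ 2.0s

print(f"{avg_polls_per_task:.1f} polls/task, ~{poll_interval_s:.1f}s interval")
```

The numbers are self-consistent: 29.1 polls over a 59.2s average E2E implies a poll roughly every 2 seconds.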
3. Infrastructure Metrics
3.1 Cluster Overview
Test Time Snapshot (04:00 KST)
Scan Namespace Pods
| Pod | Status | Restarts | Node | CPU | Memory |
|---|---|---|---|---|---|
| scan-worker-78f8ccdc9-wtxmw | Running | 1 | k8s-worker-ai | 25m | 287Mi |
| scan-worker-canary-ccd48f886-9df4l | Running | 1 | k8s-worker-ai | 24m | 310Mi |
| scan-api-589c5d4c69-g857f | Running | 0 | k8s-api-scan | 6m | 149Mi |
| scan-api-canary-76b4c99cb8-hk9fv | Running | 0 | k8s-api-scan | 5m | 113Mi |
| celery-beat-75c84f45fd-5sdxp | Running | 0 | k8s-worker-ai-2 | 4m | 133Mi |

Worker Node Resource (k8s-worker-ai)
| Resource | Usage | Allocation |
|---|---|---|
| CPU | 389m (19%) | 1305m requests / 8750m limits (437% overcommit) |
| Memory | 2.6Gi (69%) | 1976Mi requests / 7296Mi limits |

3.2 Scan Worker Configuration
deployment:
  replicas: 1 (+ 1 canary)
container:
  image: mng990/eco2:scan-worker-dev-latest
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
    limits:
      cpu: 1
      memory: 1Gi
celery:
  pool: gevent
  concurrency: 100
  prefetch_multiplier: 1
  queues:
    - scan.vision
    - scan.rule
    - scan.answer
    - scan.reward

3.3 KEDA ScaledObject & HPA
ScaledObject Configuration
scan-worker-scaledobject:
  scaleTargetRef: Deployment/scan-worker
  minReplicaCount: 1
  maxReplicaCount: 3
  pollingInterval: 30s
  cooldownPeriod: 300s
  fallback:
    failureThreshold: 3
    replicas: 2
  triggers:
    - type: rabbitmq
      queueName: scan.vision
      value: "10"   # Scale when queue > 10
    - type: rabbitmq
      queueName: scan.answer
      value: "10"
    - type: rabbitmq
      queueName: scan.rule
      value: "20"
  behavior:
    scaleUp:
      stabilizationWindow: 0s
      policies:
        - type: Pods
          value: 2
          period: 30s
    scaleDown:
      stabilizationWindow: 300s
      policies:
        - type: Percent
          value: 50
          period: 60s

KEDA Health Status (Post-Test)
| Metric | Status | Failures |
|---|---|---|
| s0-rabbitmq-scan-vision | Happy | 0 |
| s1-rabbitmq-scan-answer | Happy | 0 |
| s2-rabbitmq-scan-rule | Happy | 0 |

Scale Events During Test
| Time | Event | Reason |
|---|---|---|
| 04:00:10 UTC | Last Active | Triggers became active |
| ~03:59 UTC | Scale to 3 | scan.vision queue above target (10) |

HPA Event Log:
SuccessfulRescale: New size: 3; reason: external metric s0-rabbitmq-scan-vision above target

3.4 RabbitMQ Status
| Metric | Value |
|---|---|
| Pod | eco2-rabbitmq-server-0 |
| CPU | 45m |
| Memory | 554Mi |
| Queue Status | Empty (post-test) |

3.5 Redis Cluster Status
Redis Pods (Post-Test)
| Pod | Purpose | CPU | Memory | Node |
|---|---|---|---|---|
| rfr-streams-redis-0 | Event Streams | 10m | 123Mi | k8s-redis-streams |
| rfr-pubsub-redis-0 | Pub/Sub (Leader) | 7m | 70Mi | k8s-redis-pubsub |
| rfr-pubsub-redis-1 | Pub/Sub (Replica) | 8m | 72Mi | k8s-redis-pubsub |
| rfr-pubsub-redis-2 | Pub/Sub (Replica) | 9m | 67Mi | k8s-redis-pubsub |
| rfr-cache-redis-0 | Cache | 8m | 218Mi | k8s-redis-cache |
| rfr-auth-redis-0 | Auth Sessions | 7m | 66Mi | k8s-redis-auth |

Redis Sentinel Pods
| Pod | CPU | Memory |
|---|---|---|
| rfs-streams-redis | 8m | 55Mi |
| rfs-pubsub-redis (x3) | 7-8m | 51-52Mi |
| rfs-cache-redis | 8m | 69Mi |
| rfs-auth-redis | 8m | 51Mi |

Observation: The Redis cluster remained stable throughout the test. Cache Redis memory usage (218Mi) is noticeably higher than the other instances.
3.6 All Cluster Nodes Resource
| Node | CPU | CPU% | Memory | Mem% | Role |
|---|---|---|---|---|---|
| k8s-worker-ai | 389m | 19% | 2.6Gi | 69% | AI Workers |
| k8s-worker-ai-2 | 76m | 3% | 2.1Gi | 56% | Celery Beat |
| k8s-api-scan | 69m | 3% | 2.4Gi | 65% | Scan API |
| k8s-rabbitmq | 119m | 5% | 2.0Gi | 55% | Message Queue |
| k8s-redis-streams | 108m | 5% | 1.2Gi | 66% | Redis Streams |
| k8s-redis-pubsub | 172m | 8% | 1.5Gi | 86% | Redis Pub/Sub |
| k8s-logging | 250m | 12% | 7.2Gi | 94% | ELK Stack |

Warning:
The k8s-logging node is at 94% memory usage - the ELK stack's resource allocation needs review.
4. Prometheus Metrics (Test Window: 03:57:10 ~ 04:01:13 UTC)
4.1 Scan Worker CPU Usage
| Pod | Start | Peak | End | Trend |
|---|---|---|---|---|
| scan-worker-78f8ccdc9-wtxmw | 1.53 | 1.72 | 0.23 | ↗↘ |
| scan-worker-78f8ccdc9-jhrgg | - | 1.83 | 0.34 | ↗↘ (scale-up) |
| scan-worker-78f8ccdc9-sqcbr | - | 1.84 | 0.36 | ↗↘ (scale-up) |
| scan-worker-canary | 1.46 | 1.74 | 0.35 | ↗↘ |

Query:
sum(rate(container_cpu_usage_seconds_total{namespace="scan",pod=~"scan-worker.*"}[1m])) by (pod)

4.2 Scan Worker Memory Usage
| Pod | Start | Peak | End |
|---|---|---|---|
| scan-worker-wtxmw | 750MB | 1.30GB | 1.17GB |
| scan-worker-jhrgg | 337MB | 1.18GB | 1.12GB |
| scan-worker-sqcbr | 481MB | 1.24GB | 1.14GB |
| scan-worker-canary | 1.42GB | 1.62GB | 1.55GB |

Query:
sum(container_memory_working_set_bytes{namespace="scan",pod=~"scan-worker.*"}) by (pod)

4.3 HPA Replica Count
| Time (UTC) | scan-worker | scan-api |
|---|---|---|
| 03:57:10 | 3 | 1 |
| 03:57:40 | 3 | 2 |
| 03:58:10 | 3 | 3 |
| 04:01:13 | 3 | 3 |

Query:
kube_horizontalpodautoscaler_status_current_replicas{horizontalpodautoscaler=~".*scan.*"}

Observation: scan-worker had already scaled up to 3 replicas before the test started (carried over from earlier load).
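The observed replica counts match KEDA's standard queue-length scaling behavior, which targets ceil(queueLength / value) replicas, clamped to the min/max bounds configured in 3.3. A sketch of that arithmetic (not project code):

```python
import math

def desired_replicas(queue_len: int, target: int,
                     min_replicas: int = 1, max_replicas: int = 3) -> int:
    """KEDA queue-length trigger: ceil(queue / target), clamped to [min, max]."""
    return max(min_replicas, min(math.ceil(queue_len / target), max_replicas))

# scan.vision entered the test window with 352 messages against a target of 10,
# so the trigger pins scan-worker at maxReplicaCount for the whole test.
print(desired_replicas(352, 10))  # 3
print(desired_replicas(0, 10))    # 1 (back to min after the queues drain)
```

This is why the pre-test backlog alone kept the deployment at 3 replicas from the first polling interval onward.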
4.4 RabbitMQ Queue Depth
| Queue | Start | Peak | End | Peak Time |
|---|---|---|---|---|
| scan.vision | 352 | 352 | 0 | 03:57:10 |
| scan.answer | 14 | 258 | 0 | 03:58:10 |
| scan.rule | 73 | 268 | 0 | 03:57:40 |
| scan.reward | 2 | 155 | 0 | 03:58:40 |

Query:
rabbitmq_queue_messages{queue=~"scan.*"}

Observation: All queues drained to 0 by the end of the test (all messages processed normally).
4.5 RabbitMQ Consumer Count
| Queue | Min | Max | Steady |
|---|---|---|---|
| scan.vision | 2 | 4 | 4 |
| scan.answer | 2 | 4 | 4 |
| scan.rule | 2 | 4 | 4 |
| scan.reward | 2 | 4 | 4 |

Query:
rabbitmq_queue_consumers{queue=~"scan.*"}

4.6 Scan API CPU Usage
| Pod | Start | Peak | End | Note |
|---|---|---|---|---|
| scan-api-g857f | 0.47 | 1.30 | 0.01 | Original |
| scan-api-vqwzw | - | 0.74 | 0.01 | Scale-up +1 |
| scan-api-pzppj | - | 0.63 | 0.01 | Scale-up +2 |
| scan-api-canary | 0.01 | 0.01 | 0.01 | Idle |

Query:
sum(rate(container_cpu_usage_seconds_total{namespace="scan",pod=~"scan-api.*"}[1m])) by (pod)

4.7 Redis Memory Usage
| Instance | Start | Peak | End | Delta |
|---|---|---|---|---|
| rfr-streams | 42.7MB | 51.6MB | 51.6MB | +8.9MB |
| rfr-cache | 26.0MB | 102MB | 102MB | +76MB |
| rfr-pubsub-0 | 2.1MB | 2.2MB | 2.1MB | - |
| rfr-auth | 1.0MB | 1.0MB | 1.0MB | - |

Query:
redis_memory_used_bytes{namespace="redis"}

Observation: rfr-cache memory grew roughly 4x during the test (LLM response caching).
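Dividing the cache growth by the number of completed tasks gives a rough per-response footprint. This is back-of-the-envelope only, and assumes the growth is dominated by cached LLM responses as the observation suggests:

```python
# Rough per-task cache footprint during the test window.
cache_growth_mb = 102 - 26.0   # rfr-cache grew ~76MB over the test
completed_tasks = 1_336

per_task_kb = cache_growth_mb * 1024 / completed_tasks
print(f"~{per_task_kb:.0f} KB cached per scan")  # roughly 58 KB
```

At that rate the 1Gi-class cache instance has ample headroom for this load level, though sustained traffic would warrant an eviction-policy check.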
4.8 All Workers Status
| Namespace | Worker | Replicas | Node | Status |
|---|---|---|---|---|
| scan | scan-worker | 1 | k8s-worker-ai | Running |
| scan | scan-worker-canary | 1 | k8s-worker-ai | Running |
| chat | chat-worker | 1 | k8s-worker-ai | Running |
| character | character-match-worker | 2 | k8s-worker-storage, storage-2 | Running |
| character | character-worker | 1 (+1 canary) | k8s-worker-storage-2, storage | Running |
| auth | auth-worker | 1 (+1 canary) | k8s-worker-storage-2, storage | Running |
| users | users-worker | 1 (+1 canary) | k8s-worker-storage, storage-2 | Running |
| info | info-worker | 1 | k8s-api-info | Running |

Total Workers: 13 pods across 4 nodes
4.9 Event Router Configuration
redis:
  streams_url: redis://rfr-streams-redis.redis.svc.cluster.local:6379/0
  pubsub_url: redis://pubsub-redis-master.redis.svc.cluster.local:6379/0
5. Worker Log Analysis
5.1 Task Processing Times (Sample from Logs)
| Task Type | Min | Max | Avg | Note |
|---|---|---|---|---|
| scan.vision | 4.46s | 6.21s | 4.5s | OpenAI Vision API |
| scan.rule | 0.04s | 0.18s | 0.04s | Local lookup (Redis) |
| scan.answer | 4.85s | 8.92s | 6.7s | OpenAI Chat Completion |
| scan.reward | 0.03s | 0.05s | 0.05s | Character match + event |

Detailed Sample (03:59:38 - 03:59:52 KST)
03:59:38.311 scan.answer 4.847s task=fb2e58f9
03:59:38.469 scan.reward 0.035s task=bf38ff01
03:59:39.002 scan.rule 0.038s task=deeb6139
03:59:39.071 scan.reward 0.045s task=9a113200
03:59:39.570 scan.answer 6.685s task=b2c6ed05
03:59:39.901 scan.reward 0.048s task=66f5daa8
03:59:40.017 scan.vision 4.564s task=c5956559 (duration_vision_ms: 4466)
03:59:41.025 scan.reward 0.038s task=8b47d5bf
03:59:41.061 scan.answer 7.637s task=3037e333
03:59:42.502 scan.rule 0.042s task=236ccd0b (duration_rule_ms: 0.175)
03:59:44.630 scan.answer 6.899s task=d6f35d05
03:59:45.010 scan.answer 8.916s task=d64ee5b2
03:59:47.353 scan.answer 5.809s task=fd18b15e
03:59:52.583 scan.reward 0.055s task=4c96d003

Task Statistics (from logs):
- Tasks processed in last hour: 574 succeeded
- Tasks processed in last 2 hours: 1,148 total (vision + rule + answer + reward)
- Estimated tasks during test window (4m): ~300 complete scan flows
5.2 Task Flow
scan.vision (4.5s avg)
  ↓
scan.rule (0.04s)
  ↓
scan.answer (6.7s avg)
  ↓
scan.reward (0.05s)
────────────────────
Total: ~11-12s per task (sequential)

5.3 Observations from Logs
- LLM API Calls: all OpenAI API calls succeeded (HTTP 200)
- Reward Processing: reward_character_event_dispatched events fired as expected
- Character Match: "Character match completed" entries confirmed in the logs
- Worker Sync: intermittent heartbeat misses observed (users-worker-canary)
6. Bottleneck Analysis
6.1 E2E Latency Breakdown
┌─────────────────────────────────────────────────────────────┐
│ E2E Latency: 59.2s (avg)                                    │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│ [Queue Wait]  [Vision]  [Rule]   [Answer]  [Reward]         │
│ ~~~~~~~~~~~~  ~~~~~~~~  ~~~~~~   ~~~~~~~~  ~~~~~~~~         │
│     ~47s        4.5s     0.04s     6.7s      0.05s          │
│     (80%)       (8%)     (0%)      (11%)     (0%)           │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Primary Bottleneck: Queue Wait Time (~47s)
- Worker concurrency: 100 (gevent)
- 2 workers total (stable + canary)
- Max concurrent tasks: 200
- VU 500 with burst → Queue backlog
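A Little's-law check ties these numbers together: the expected backlog is arrival rate times wait time, which lands squarely in the range of the peak queue depths measured in 4.4 (155-352 messages). A sketch:

```python
# Little's law sanity check: L = lambda * W
arrival_rate = 367.9 / 60   # ≈ 6.1 tasks/s (measured throughput)
queue_wait_s = 47           # queue-wait share of the 59.2s avg E2E

backlog = arrival_rate * queue_wait_s
print(f"~{backlog:.0f} queued messages expected")  # ≈ 288
```

The predicted ~288-message backlog is consistent with the observed peaks (scan.vision 352, scan.answer 258), supporting queue wait as the dominant latency component.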
6.2 Why Queue Wait is High
- HPA Scale-up Delay: lag between load detection and new pods becoming ready
- OpenAI Rate Limits: possible concurrent-request throttling on the Vision API
- gevent Greenlet Scheduling: context-switch overhead while greenlets wait on I/O
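For an I/O-bound gevent pool, nominal steady-state capacity is roughly concurrency divided by per-task service time. The sketch below (a simplification that treats every greenlet as fully usable and ignores rate limits and scheduling overhead) suggests the two workers available at test start were not capacity-limited:

```python
# Nominal worker capacity vs. measured arrival rate.
workers = 2                # stable + canary, before the scale-up to 3
concurrency = 100          # gevent greenlets per worker
service_time_s = 11.3      # vision 4.5 + rule 0.04 + answer 6.7 + reward 0.05

capacity = workers * concurrency / service_time_s  # ≈ 17.7 tasks/s
arrival = 367.9 / 60                               # ≈ 6.1 tasks/s

# Capacity comfortably exceeds arrival, pointing at the pre-loaded backlog
# (352 messages already in scan.vision at t0) and scale-up lag, rather than
# raw worker throughput, as the source of the ~47s queue wait.
print(capacity > arrival)  # True
```

Under this model, the bulleted causes above (scale-up delay, external rate limits) matter more than greenlet count for reducing queue wait.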
7. Comparison with Previous Tests
7.1 Today's Tests (2026-01-27)
| Metric | VU 500 (UTC 04:00) | Change |
|---|---|---|
| Submitted | 1,336 | +426% |
| Completed | 1,336 | - |
| Failed | 0 | -100% |
| Success Rate | 100% | FIXED |
| E2E P95 | 83.3s | N/A* |

7.2 vs Blog VU 500 Test (Reference: Scan API Load Test (300-500 VU))
| Metric | VU 500 (25.12.29) | VU 500 (Latest) | Delta | Note |
|---|---|---|---|---|
| Requests | 1,990 | 1,336 | -33% | Different test durations |
| Completion Rate | 94.0% | 100% | +6% | Improved |
| Success Rate | 99.7% | 100% | +0.3% | - |
| E2E P95 | 76.4s | 83.3s | +9% | Regressed |
| E2E Avg | - | 59.2s | - | - |
| Throughput | 438 req/m | 367.9 req/m | -16% | - |
| Scan API P95 | 154ms | 232ms | +51% | - |

- Key changes since the reference run: Pub/Sub - SSE GW connection optimization, +1 worker-ai node
- Completion rate improved: 94% → 100% (+6%) - system stabilized
- E2E P95 regressed: 76.4s → 83.3s (+9%) - longer queue wait
- Throughput decreased: 438 → 367.9 req/m (-16%) - fewer requests submitted
7.3 Trend by VU Level
| VU | Requests | Completion Rate | E2E P95 | Throughput | Assessment |
|---|---|---|---|---|---|
| 50 | 685 | 99.7% | 17.7s | 198 req/m | Baseline |
| 200 | 1,649 | 99.8% | 33.2s | 367 req/m | Normal |
| 250 | 1,754 | 99.9% | 40.5s | 417.6 req/m | Optimal |
| 300 | 1,732 | 99.9% | 48.5s | 402 req/m | Saturation begins |
| 400 | 1,901 | 98.9% | 62.2s | 422.2 req/m | Near limit |
| 500 | 1,990 | 94.0% | 76.4s | 438 req/m | SLA exceeded |
| 500 (Latest) | 1,336 | 100% | 83.3s | 367.9 req/m | Stabilized |

Conclusion:
- Stability: today's run is the more stable of the two, with a 100% completion rate
- Performance: E2E latency increased slightly (+9%)
- Recommended VU: 250-300 VU against the SLA (E2E < 60s)
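The 250-300 VU recommendation can be sanity-checked by interpolating linearly between the VU 300 and VU 400 rows above, which puts the E2E P95 crossing of the 60s SLA at roughly VU 384 - so the recommendation leaves deliberate headroom below the crossing point. A sketch (linear interpolation is an assumption; latency growth is likely superlinear near saturation):

```python
# Linear interpolation between measured points (300 VU, 48.5s) and (400 VU, 62.2s).
vu_lo, p95_lo = 300, 48.5
vu_hi, p95_hi = 400, 62.2
sla_s = 60.0

crossing_vu = vu_lo + (sla_s - p95_lo) / (p95_hi - p95_lo) * (vu_hi - vu_lo)
print(f"SLA boundary ≈ {crossing_vu:.0f} VU")
```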
8. Appendix
8.1 Prometheus Query Reference
The following queries were used to collect metrics for the test window.
| Metric | PromQL Query |
|---|---|
| Worker CPU | sum(rate(container_cpu_usage_seconds_total{namespace="scan",pod=~"scan-worker.*"}[1m])) by (pod) |
| Worker Memory | sum(container_memory_working_set_bytes{namespace="scan",pod=~"scan-worker.*"}) by (pod) |
| API CPU | sum(rate(container_cpu_usage_seconds_total{namespace="scan",pod=~"scan-api.*"}[1m])) by (pod) |
| HPA Replicas | kube_horizontalpodautoscaler_status_current_replicas{horizontalpodautoscaler=~".*scan.*"} |
| RabbitMQ Queue Depth | rabbitmq_queue_messages{queue=~"scan.*"} |
| RabbitMQ Consumers | rabbitmq_queue_consumers{queue=~"scan.*"} |
| Redis Memory | redis_memory_used_bytes{namespace="redis"} |

Time Range Parameters:
start=1769486230  # 2026-01-27T03:57:10Z
end=1769486473    # 2026-01-27T04:01:13Z
step=30s

8.2 Raw Test Output
{
  "test_info": {
    "target_vus": 500,
    "duration_seconds": 1769486191.701171
  },
  "results": {
    "total_submitted": 1336,
    "total_completed": 1336,
    "total_failed": 0,
    "success_rate": "100.0%",
    "reward_rate": "0.0%"
  },
  "latency": {
    "scan_submit_p95": "232ms",
    "poll_p95": "349ms",
    "e2e_p95": "83.3s",
    "e2e_avg": "59.2s"
  },
  "polling": {
    "total_poll_requests": 38860,
    "avg_polls_per_task": "29.1"
  },
  "throughput": {
    "requests_per_minute": "367.9 req/m"
  }
}

8.3 Test Environment
| Component | Version/Config |
|---|---|
| k6 | v0.49+ |
| Kubernetes | Self-managed (kubeadm, 24-node) |
| Istio | 1.24.1 |
| RabbitMQ | 3.x (Cluster Operator) |
| Celery | 5.x (gevent pool) |
| OpenAI Model | gpt-5.2 |
| Prometheus | kube-prometheus-stack |

8.4 Related Files
- Test Script: e2e-tests/performance/k6-scan-polling-test.js
- Result JSON: k6-scan-polling-vu500-2026-01-27T04-00-09-595Z.json
- HPA Config: k8s/scaling/scan-worker-scaledobject.yaml
- Grafana Snapshot: Link