Performance

Peak-Traffic Performance Testing for Microservices

Extended and matured a performance testing platform so a microservices system could withstand high-traffic peak events.

Aspire IT Services

3,000 concurrent VUsers
M+ requests per hour
100s of endpoints tested

The Problem

The team anticipated high-traffic periods but lacked confidence in how the system would perform under load. Existing performance testing was limited, with no clear targets and little observability into system behavior under stress. The microservices were tightly coupled, so a single degraded service risked cascading failures. I was brought in to evolve and scale the performance testing program: strengthening coverage, defining benchmarks, and improving visibility ahead of peak periods.

Performance Testing Architecture

[Architecture diagram] Gatling scripts (Scala; ramp / spike / sustained load scenarios; 3,000 concurrent VUs; M+ requests per hour) drive the microservices under test (Auth Service, User Service, Product API, Search Service, Order / Checkout, Notification Service, and more). Grafana tracks CPU, memory, p95 latency, throughput, and error rate; New Relic APM provides distributed traces and service-level bottleneck analysis. Outcomes: bottlenecks found, baselines established, fixes prioritised, and reports auto-published as HTML to GitHub Pages.

Wrote Gatling scripts in Scala covering hundreds of endpoints across all microservices. Designed three scenario types: ramp-up (gradual user growth), sustained load (steady state at target TPS), and spike (sudden 3,000 VU burst). Monitored CPU, memory, p95 latency, throughput, and error rates simultaneously using Grafana dashboards and New Relic APM traces. Automated test report generation and publishing to GitHub Pages, so the team had full results immediately after every run without any manual steps.
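The three scenario types above can be sketched in Gatling's Scala DSL roughly as follows. This is a minimal illustration, not the project's actual suite: the base URL, endpoint paths, durations, and arrival rates are all placeholder values.

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class PeakTrafficSimulation extends Simulation {

  // Hypothetical base URL; the real tests targeted internal service endpoints.
  val httpProtocol = http.baseUrl("https://api.example.internal")

  // One representative user journey; the real scripts covered hundreds of endpoints.
  val browseAndCheckout = scenario("Browse and checkout")
    .exec(http("search").get("/search?q=widgets"))
    .pause(1)
    .exec(http("product").get("/products/123"))
    .pause(1)
    .exec(
      http("checkout")
        .post("/orders")
        .body(StringBody("""{"productId":123}"""))
        .asJson
    )

  setUp(
    browseAndCheckout.inject(
      rampUsers(3000).during(10.minutes),          // ramp-up: gradual user growth
      constantUsersPerSec(100).during(30.minutes), // sustained: steady-state load
      nothingFor(1.minute),
      atOnceUsers(3000)                            // spike: sudden 3,000 VU burst
    )
  ).protocols(httpProtocol)
}
```

Chaining injection steps on one scenario keeps all three load shapes in a single run; they can equally be split into separate simulations per scenario type.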

Business Impact

  • 3,000 concurrent virtual users generating millions of API requests per hour — revealing real bottlenecks before any traffic event hit production
  • Multiple microservice-level bottlenecks identified and fixed before peak periods — preventing cascading failures in production
  • System performance baselines established for the first time: latency targets, max throughput, memory ceilings per service
  • Automated report pipeline meant the entire team had access to test results instantly — no manual reporting overhead
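Once baselines exist, they can be enforced automatically on every run via Gatling's assertions DSL, failing the build when a threshold regresses. A minimal sketch, where the threshold numbers are illustrative stand-ins for the per-service baselines, not the team's actual targets:

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class BaselineGuardSimulation extends Simulation {

  // Hypothetical base URL and endpoint.
  val httpProtocol = http.baseUrl("https://api.example.internal")

  val steadyState = scenario("Baseline check")
    .exec(http("health").get("/health"))

  setUp(steadyState.inject(constantUsersPerSec(50).during(5.minutes)))
    .protocols(httpProtocol)
    .assertions(
      // Placeholder thresholds standing in for the established baselines:
      global.responseTime.percentile3.lt(500), // 3rd configured percentile (p95 by default) under 500 ms
      global.failedRequests.percent.lt(1.0),   // error rate below 1%
      global.requestsPerSec.gte(40)            // minimum sustained throughput
    )
}
```

Assertion results are included in the generated HTML report, so a threshold breach is visible in the same auto-published output the team already consumes.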
Stack: Gatling (Scala) · Grafana · New Relic · AWS · Docker · Jenkins · Cypress · Google Lighthouse