Remedy for Performance Issues

March 4th, 2020

The colleagues will talk about this topic at the Virtual Big Techday on May 8th, 2020.

Black Friday: every year this top-selling day poses a unique challenge to our rapidly growing customer in the online retail industry. The webshop must operate reliably under an unprecedented load. Despite a hardware upgrade, in 2019 there were already short service disruptions observed at peak times prior to Black Friday. Trouble was clearly imminent.

We identified and resolved multiple performance bottlenecks. The configuration of the application was then optimised in an iterative fashion. This was done using a JMeter test plan with request distribution modelled on customer behaviour. Executing it from the cloud directly against the production servers enabled us to test different load scenarios. At the same time, technical and organisational measures were taken to ensure that customer shopping remained undisturbed. Elastic Stack, Prometheus & Grafana served as the main tools to identify and analyse backend metrics.

Ultimately the improvements yielded the desired result. The throughput doubled, but the system withstood this crucial phase without any issues. The new monitoring processes allows future performance problems to be detected, understood, and rapidly solved.