Solving a huge site's downtime - Parapolitika.gr case study
Recently the maintainers of the big Greek news site Parapolitika, the guys from the Greek subsidiary of Tatchit, contacted us asking for our help: the site had been going down routinely since going live (it had been rewritten on Orchard from the legacy engine). The Orchard application was sometimes using up all of the server's CPU (despite it being a 24-core beast) and eventually crashing the IIS worker process. This needed an urgent fix, because websites tend to be worth something only while they're alive...
We immediately jumped into the task of getting the site stable! Neither the Orchard logs nor the Windows Event Log revealed anything interesting. However, we soon got to experience the phenomenon live: the worker process kept eating up memory until it reached around 3.8 GB, at which point the CPU started spinning like mad and finally the process died. The Event Log reported that ImageResizer.NET was running out of memory. Seriously? There are 32 GB of it, damn it!
The culprit was the worker process running in 32-bit mode, which made it unable to use more than a fraction of all that memory. While such high memory usage is not something Orchard does every day (a vanilla Orchard instance in a 32-bit worker process uses about 80 MB), lifting the 32-bit limit solved the immediate issue quickly. Together with some other tweaks to the server config, this got the site running stably, and it quickly reached new uptime records (although the previous uptimes weren't too hard to beat).
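To put numbers on the 32-bit ceiling described above, here is a minimal sketch in Python (not the tooling we used on the server; the function names are made up for illustration). It assumes the theoretical limit of 2^32 bytes of address space for a 32-bit process, which is in line with the roughly 3.8 GB at which the worker process gave up, and compares it with the 32 GB of RAM in the machine.

```python
import struct

GIB = 1024 ** 3  # bytes in one gibibyte


def pointer_bits() -> int:
    """Pointer width (32 or 64) of the process running this script."""
    return struct.calcsize("P") * 8


def address_space_bytes(bits: int) -> int:
    """Theoretical size of the virtual address space for a pointer width."""
    return 2 ** bits


def usable_fraction(bits: int, physical_ram_bytes: int) -> float:
    """Upper bound on the share of physical RAM one process can address."""
    return min(1.0, address_space_bytes(bits) / physical_ram_bytes)


if __name__ == "__main__":
    ram = 32 * GIB  # the server in this story had 32 GB of RAM
    print(f"This interpreter runs as a {pointer_bits()}-bit process")
    for bits in (32, 64):
        share = usable_fraction(bits, ram)
        print(f"{bits}-bit process: at most {share:.1%} of the RAM is addressable")
```

In other words, a single 32-bit process can address at most one eighth of such a machine's memory, no matter how much is installed. In IIS, whether a worker process runs as 32-bit on a 64-bit server is controlled by the application pool's "Enable 32-Bit Applications" setting.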
In the newly gained peace we finally upgraded the site from Orchard 1.6 to 1.7.1 (the new version not only brings many new features but also performs much better). Meanwhile we also fixed an issue that could cause OutputCache to serve expired content.
To quote Sotirios Roussos, CEO of urbanIT, with whom we worked closely on this emergency:
"After making some not demanding sites using Orchard, we decided to use it as CMS for creating the new parapolitika.gr, a really huge news site with more than 100.000 visitors daily and over 20 editors and a lot of content. It was a challenge for us and Orchard as well. Unfortunately, the first days were tough. Sudden breakdowns of site were appeared and the pressure was high. Orchard seemed to have limits, or maybe not? That's why we asked help for Lombiq, due to their experience into Orchard infrastructure. Fortunately, they did respond quick and spent hours and nights with us. Until we reach our goal. A stable and quick site.
And, we did it. Thanx Lombiq! Keep up the good work!"
It was a rush, but we're really glad to see a happy ending to this story!