Server load spiralling out of control


Starting a few days ago, we’ve run into some major issues with our forum and wiki. I’ll start at the beginning to give the necessary background.

The site is a fansite for a TV show (Game of Thrones) and its rising popularity has been straining the server for some time. However, last weekend (when the show premiered its second season) it ran reasonably well after some tweaks and upgrades. We had brief outages, but they were intermittent slowness rather than complete choking.

Around Friday, I think, things started going very badly. We were hit by a large number of requests from msnbot that took the site down for quite a while, forcing us to add an .htaccess block on msnbot. Our host also made some other tweaks.

Since then, the site has been performing extremely poorly under what we estimate to be traffic similar to the previous weekend, when it did all right. Most alarming is that at times the server load will just spiral completely out of control, going from an average of 5-10 to as high as 200. Today, we had the forum running just fine with almost 1700 users, plus I don’t know how many users on the wiki, but then all of a sudden the load shot up.

When this happens, it just keeps going up. 256 web requests are left waiting, 300 MySQL threads lock, and load climbs and climbs and climbs. Not having root access, we can’t stop and restart things, so our expedient is to rename the index files for the wiki and forum to kill hits to them… but load still climbs for a long while.

Given that the load keeps climbing after we’ve renamed those index files, we’re thinking it has to be a server configuration issue rather than an issue with MediaWiki or the IPB forum, but we have no clue where to start looking for the culprit. What information do we need to collect to be able to figure out what’s happening? As I said, we don’t have root access, and right now we don’t even know what questions to ask our host or what sort of information to request from them.

I assume the whole thing is running on MySQL? Do you know if you’re using MyISAM or InnoDB tables?
Is there search functionality on your website, and if so, is it used a lot?
How much data is added on a daily basis?
Is there any caching in place, like memcache or APC?
What hardware is all this running on?
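
On the MyISAM/InnoDB question: MyISAM locks whole tables on writes while InnoDB locks individual rows, so the engine mix may be relevant to the locked threads you describe. Here is a minimal sketch for tallying engines without root access. It assumes you can run the `mysql` command-line client with your normal database user; the function name `count_engines` and the script name in the usage comment are made up for illustration.

```python
from collections import Counter
import sys


def count_engines(batch_output: str) -> Counter:
    """Count tables per storage engine.

    Expects tab-separated text with a header row and two columns
    (table name, engine), as produced by `mysql --batch`.
    """
    lines = [ln for ln in batch_output.splitlines() if ln.strip()]
    counts = Counter()
    for line in lines[1:]:  # skip the header row
        fields = line.split("\t")
        if len(fields) >= 2:
            counts[fields[1]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical usage (replace yourdb with the real database name):
    #   mysql --batch -e "SELECT TABLE_NAME, ENGINE FROM information_schema.TABLES
    #                     WHERE TABLE_SCHEMA = DATABASE()" yourdb | python count_engines.py
    for engine, n in count_engines(sys.stdin.read()).most_common():
        print(f"{engine}\t{n}")
```

If most of the busy forum or wiki tables turn out to be MyISAM, that is worth mentioning to your host along with the lock counts you see during a spike.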

As an interim solution, you can ask your host whether they could install Varnish on your server. It is a reverse proxy that, when set up correctly, can drastically reduce load through caching.

Thank you for responding. :) What ended up happening during the weekend, as the problems just got worse and worse, is that our host switched the system to lighttpd instead, and this seems to be working better, though the real test will be whether it can stand up to next weekend.

We are looking at Varnish as well, though we were somewhat daunted by seeing a lot of people say it’s hard to set up correctly.

Yes, servers like lighttpd and nginx are better at handling high request volumes than Apache, especially when it comes to static files like images, CSS, etc.

Does any of the software support caching like Memcache? It’s worth looking at, since it’s easier to set up than Varnish, so it’s a quick win.
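
To show why an object cache helps, here is a rough sketch of the lookup pattern it enables: check the cache first, and only do the expensive work (a database query or page render) on a miss or after the entry expires. This is an in-process stand-in written in Python for illustration, not Memcache itself and not how your forum or wiki software implements it; `TTLCache` and `get_or_compute` are hypothetical names.

```python
import time


class TTLCache:
    """Tiny in-memory cache with per-entry expiry (illustration only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested
        self._store = {}    # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value, calling compute() only on a miss."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit: no expensive work
        value = compute()    # miss or expired: do the work once
        self._store[key] = (now + self.ttl, value)
        return value
```

With a 60-second lifetime, a page requested a thousand times a minute would hit the database roughly once a minute instead of a thousand times, at the cost of content being up to a minute stale.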

As an alternative to Varnish, you could also look at Squid.