"httpd failed" several times a day - How can I find out why?

My server goes down for a couple minutes several times a day.

Each time it happens I get an email like this:

httpd failed @ Thu Jul 29 06:07:57 2010. A restart was attempted automagically.
Service Check Method: [tcp connect]

Failure Reason: Unable to connect to port 80

I have no idea how to find the cause of this problem. Can someone point me in the right direction?

Try checking your Apache log files. On my Linux box they are under /var/log/httpd, so try running tail on the log files in that directory, or tail /var/log/httpd_log.

Without seeing any errors from /var/log/httpd/error_log we can’t know why the server might be failing - my guess would be a script or something causing Apache to crash?

I logged in over SSH as root and opened the file “/var/log/httpd/error_log”, but the file was completely empty (0 bytes).

That may not be its exact name; it depends on which distro you are using (it’ll be something along those lines, though).

It’s CentOS 5 (x86). According to a quick google search, that should be the right location for the error log.

Update: I found a massive error log file (/usr/local/apache/logs/error_log). It’s about 196 MB. Does this sound like the right log? If so, what should I be looking for in this file?

Have you compiled your own Apache? This isn’t the normal location for the pre-packaged version.

Look towards the end of the file after a crash has happened.
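With a log of nearly 200 MB, opening the whole file in an editor is painful. Here is a minimal Python sketch that keeps only the last few lines while streaming through the file. The function name `last_lines` is made up for illustration, and the path in the usage comment is the one mentioned in this thread.

```python
from collections import deque


def last_lines(lines, count=200):
    """Return the last `count` lines from any iterable of lines.

    An open file object works as the iterable, so the file is streamed
    rather than loaded into memory all at once.
    """
    # A bounded deque silently drops the oldest entry once it is full.
    return [line.rstrip("\n") for line in deque(lines, maxlen=count)]


# Usage on the server (path taken from the thread):
# with open("/usr/local/apache/logs/error_log", errors="replace") as log:
#     for line in last_lines(log, 50):
#         print(line)
```

Running `tail -n 50` on the file gives the same result; the script is only useful if you want to filter the lines further.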

Okay, I looked through the log, but the only errors that occur before everything goes offline are just “File does not exist” errors. Then several minutes later it just displays logs of httpd starting back up.


[Mon Aug 02 01:30:45 2010] [error] [client] File does not exist: /home/morthian/public_html/robots.txt
[Mon Aug 02 01:30:45 2010] [error] [client] File does not exist: /home/morthian/public_html/404.shtml
[Mon Aug 02 01:43:37 2010] [notice] suEXEC mechanism enabled (wrapper: /usr/local/apache/bin/suexec)
[Mon Aug 02 01:43:37 2010] [notice] ModSecurity for Apache/2.5.7 (http://www.modsecurity.org/) configured.
[Mon Aug 02 01:43:38 2010] [notice] Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8j DAV/2 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/ configured -- resuming normal operations

First off, if possible, I would suggest trying a stock Apache package from the CentOS repository rather than a custom build.

What about the general logs (/var/log/messages etc.): do they show anything around the time of the restart? Do you have any software on there that monitors services and restarts them if the load is deemed too high (like Monit)?

That’ll be it. Look at the times in the log when a restart was initiated.

(ah beaten to it)
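To line up the restart times with the alert emails, something like the following might help. It is only a sketch and assumes the error log uses the same timestamp and “resuming normal operations” format as the lines pasted earlier in this thread. `startup_times` is a hypothetical helper name. Note that a graceful restart (for example from log rotation) logs the same notice, so not every hit is a crash.

```python
import re
from datetime import datetime

# Matches lines such as:
# [Mon Aug 02 01:43:38 2010] [notice] Apache/2.2.11 (Unix) ... resuming normal operations
STARTUP = re.compile(r"^\[([^\]]+)\] \[notice\] .*resuming normal operations")


def startup_times(lines):
    """Return a datetime for every 'resuming normal operations' notice."""
    times = []
    for line in lines:
        match = STARTUP.match(line)
        if match:
            # Apache 2.2 style timestamp: weekday, month, day, time, year.
            stamp = datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y")
            times.append(stamp)
    return times


# Usage on the server (path taken from the thread):
# with open("/usr/local/apache/logs/error_log", errors="replace") as log:
#     for stamp in startup_times(log):
#         print(stamp)
```

Comparing these timestamps with entries in /var/log/messages from a few minutes earlier should show whether a monitor or the kernel stopped httpd.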

I’d probably install Munin or similar to monitor memory use. Nine times out of ten, Apache restarts are down to the OOM killer chopping out processes due to lack of memory.
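If you want to check for the OOM killer directly, a sketch like this can pull candidate lines out of /var/log/messages. The exact kernel wording varies between versions, so the pattern below is an assumption and deliberately loose. On some VPS setups the guest may not see these kernel messages at all. `oom_lines` is a made-up helper name.

```python
import re

# Loose pattern: kernel messages usually mention "Out of memory" or
# "oom-killer", but the exact wording depends on the kernel version.
OOM_HINT = re.compile(r"out of memory|oom-killer", re.IGNORECASE)


def oom_lines(lines):
    """Return the log lines that look like OOM killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_HINT.search(line)]


# Usage on the server:
# with open("/var/log/messages", errors="replace") as log:
#     for line in oom_lines(log):
#         print(line)
```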

There are many messages about high system memory load…

That does seem like something that might cause my server to go offline. (I have a VPS.)
If that is the case, how do I find out what is using so much memory?

Look at the output of ‘top’; you can order it by memory usage (press Shift+M while it is running).

If you are on a VPS, it could well be the limited resources causing issues there.
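top shows each process on its own row, which can hide the fact that many small httpd processes add up. A rough sketch that totals resident memory per command name from the output of `ps -eo rss,comm` is below. `rss_by_command` is a hypothetical helper. Summed RSS counts shared pages more than once, so treat the totals as an upper bound.

```python
from collections import defaultdict


def rss_by_command(ps_output):
    """Sum resident memory (KiB) per command from `ps -eo rss,comm` output.

    Returns (command, total_kib) pairs, largest first.
    """
    totals = defaultdict(int)
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # Skip the header row and any blank or malformed lines.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        totals[parts[1].strip()] += int(parts[0])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Usage on the server (Python 3.7+ for text=True):
# import subprocess
# output = subprocess.check_output(["ps", "-eo", "rss,comm"], text=True)
# for command, kib in rss_by_command(output)[:10]:
#     print(f"{kib / 1024:8.1f} MiB  {command}")
```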

I see user “mysql” using 18.6% mem from command “mysqld”.
Nothing else even comes close to that much usage.
Could this be the problem?

Is there a way to find out more specifically what is using so much memory?
(like a specific php script)

Edit: As I watch the output of top, I also notice the “mysqld” command using large amounts of CPU, sometimes exceeding 100%.

In which case, I would guess that you have a database query that needs some work (if it locks up at 100% for periods of time). Running SHOW PROCESSLIST in the mysql client will help you find any queries currently running.

MySQL does use quite a bit of memory with its query caches etc. You may need to look at the config (/etc/my.cnf) and reduce the memory allowance if it is taking too much.
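To make the SHOW PROCESSLIST suggestion above easier to act on, here is a small sketch that filters its rows down to queries that have been running for a while. It assumes you have already fetched the rows as dictionaries keyed by the column names (Id, User, Host, db, Command, Time, State, Info) using whatever MySQL client library you have; that part is not shown. `slow_queries` and the 5 second threshold are made up for illustration.

```python
def slow_queries(rows, min_seconds=5):
    """Return non-idle processlist rows running for at least `min_seconds`.

    Each row is a dict keyed by SHOW FULL PROCESSLIST column names.
    Idle connections (Command == "Sleep") and rows with no query text
    are ignored. Longest-running queries come first.
    """
    busy = [
        row for row in rows
        if row["Command"] != "Sleep"
        and row["Info"]
        and (row["Time"] or 0) >= min_seconds
    ]
    return sorted(busy, key=lambda row: row["Time"], reverse=True)
```

The db column of the rows that come back tells you which site's database the slow query belongs to, which narrows down the script responsible.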

One of my sites has a lengthy list of mysql processes with the command “Sleep”, and no status or query information. What are these “Sleep” command processes?

All other sites look fine.

Edit: I noticed the “sleep” processes suddenly all disappeared after several minutes, but mysqld is still using 18.6% mem.

Sleep means the connection is open but doing nothing. If these are persistent connections, you may want to look at disabling this feature in the site’s configuration. Persistent connections use more memory long term by keeping the connections open, which is especially bad on a VPS.

This is probably a topic now for the PHP section of the forums, but I really need to know which script is causing mysql to eat up all the resources like this.

Oh, also how do I prevent mysql from exceeding the CPU limit?
I have seen mysqld exceed 100% cpu several times now, sometimes going higher than 400%.

Maybe, maybe not. I’d check for persistent connections for a start, turning them off if they are on, or look into the scripts if they process for a long time.

I wouldn’t stress too much about the MySQL CPU use: it’ll use as much CPU as it can get to perform the query as fast as possible. It only becomes a concern if the 15-minute load average in top is very high (the third load average value). If you’re on a Xen VPS then you can’t affect your neighbours, and if you’re on a well set up OpenVZ VPS then unless you’re pinning the CPU with high long-term load averages the host is unlikely to be concerned.
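As a rough way to judge whether the 15-minute load average is “very high”, you can compare it with the number of CPU cores. The sketch below does that; the threshold of 1.0 per core is a common rule of thumb and an assumption on my part, not a hard limit. The helper names are made up. On some VPS types the reported core count may reflect the host rather than your allocation.

```python
import os


def load_per_core(load_average, cores):
    """Normalise a load average by the number of CPU cores."""
    return load_average / max(cores, 1)


def is_sustained_overload(load15, cores, threshold=1.0):
    """True when the 15-minute load per core exceeds the threshold."""
    return load_per_core(load15, cores) > threshold


def check_now():
    """Check the machine this runs on (Unix only)."""
    _, _, fifteen = os.getloadavg()
    return is_sustained_overload(fifteen, os.cpu_count() or 1)
```

A short burst of mysqld at 400% in top can still leave the 15-minute value low, which is why the longer average is the one to watch.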

Mysqld is always going to use the most memory for a single process on a typical LAMP setup (18% is fairly normal). What is often an issue, however, is the cumulative memory use of the Apache processes. If Apache is left at its default configuration, its maximum number of spawned processes multiplied by the typical memory use per process (with lots of modules enabled that are typically not needed) can easily exceed the available memory.