Need Strategy to handle Dead Links

I am forever changing my website, and always worry that I’ll have a dead link somewhere.

1.) Is there some way that I can find out if I have a “dead link(s)” on my website without having to click on every link on my site?

2.) Is there some automated way to gracefully handle “dead links” if I do mess up?

This wasn’t an issue when I only had 3-4 pages, but as my website expands, this is quickly getting out of control… :frowning:

Debbie

I know what you mean. I used to check each link from time to time and either remove or fix them as needed.

I wouldn’t call it “graceful”, but IMHO “automated” is much better.

Google Webmaster Tools shows broken links, depending on when the pages were last crawled.

Total Validator detects broken links on a per page basis.

So I guess Webmaster Tools is the better of those, but it still seems there should be an even better way.

I’ve used this handy little application for checking for dead links on a website:

Find broken links on your site with Xenu’s Link Sleuth ™

You can set up an error document for Apache in your .htaccess


ErrorDocument 404 /my404.php

It will then show that page instead of the normal 404. You can then customize that 404 to include some text like “Were you looking for this? Go here! Looking for that? Go there! Something else? Check our FAQ! Still not sure? Here’s our home!”, or include any search functionality you might have, etc.
A 404 with some nice text that tries to help and guide people where to go from there is better than the lame white default Apache 404.
By the way, IE browsers have a tendency to show their own 404 if the size of the markup is below some value (I think it was 4k or something). Just put a whole bunch of lorem ipsum in HTML comments to stop it from doing that and show your 404 instead of its own, which actually makes it look as if your website is totally broken (whoever came up with the idea that a browser-wide 404 would be a good idea …)

A very good idea. But be careful if you’re going to do auto-redirects. You don’t want to give them something like admin.php for free.
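
To illustrate the caution about auto-redirects, here is a minimal Python sketch (the thread itself uses PHP for my404.php, so treat this as a language-neutral idea). It only ever suggests pages from an explicit allowlist, so something like admin.php can never be handed out. The page names, the `suggest_page` helper and the 0.6 similarity cutoff are all made up for the example.

```python
import difflib

# Hypothetical allowlist: only pages that are safe to suggest to visitors.
PUBLIC_PAGES = ["/index.php", "/about.php", "/contact.php", "/faq.php"]


def suggest_page(requested_path, public_pages=PUBLIC_PAGES, cutoff=0.6):
    """Return the public page most similar to requested_path, or None."""
    matches = difflib.get_close_matches(
        requested_path, public_pages, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


if __name__ == "__main__":
    print(suggest_page("/abuot.php"))    # a typo of a public page
    print(suggest_page("/zzzzzzzzzzzz"))  # nothing similar enough
```

Because the candidates come from the allowlist and not from the file system, a request for /admin.php can at worst be pointed at some public page, never at the admin page itself.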

So if I put that code in my .htaccess - which is currently located in my Web Root - and I also have a corresponding “my404.php” in my Web Root as well, then that is guaranteed to always come up if I have a broken link?? :-/

(I’d probably use my website template and just put a message in the center pane so it still looks like they are on the website but the particular page just isn’t coming up. That should be more than 4k.)

Debbie

Guaranteed is such a strong word, but yeah, it will always show /my404.php when a requested URI wasn’t found.
By the way, it’s 512 bytes (not 4k) to get IE to show your message instead of its own. Mind you, that’s only the HTML; CSS, JS, images, etc. don’t count!
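
If you want to check the size rather than eyeball it, something along these lines would do. This is only a sketch in Python: the 512-byte figure is the one quoted in this thread (not something I have verified), and `pad_error_page` is a hypothetical helper name.

```python
IE_MIN_BYTES = 512  # threshold quoted in this thread; treat it as an assumption


def pad_error_page(html, min_bytes=IE_MIN_BYTES):
    """Return html, padded with an HTML comment if it is not larger than min_bytes."""
    size = len(html.encode("utf-8"))
    if size > min_bytes:
        return html  # already big enough, leave it alone
    # The filler alone pushes the total past the threshold; the comment
    # delimiters only add to it.
    filler = "x" * (min_bytes - size + 1)
    return html + "<!-- " + filler + " -->"


if __name__ == "__main__":
    page = pad_error_page("<html><body>Page not found</body></html>")
    print(len(page.encode("utf-8")))
```

A page built from a full site template will normally be past the threshold already, in which case the function returns it unchanged.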

I found this reference on 404 pages, worth a read IMO: User friendly 404 error messages reconsidered.

I would think that your suggestion would only work for the directory in which the .htaccess file exists, right?

Debbie

No, it works recursively over the complete documentroot.

/some/path/that/doesnt/exist will also trigger /my404.php

Cool. Something that makes sense and is easy for a change?! :cool:

Debbie

Link Checker Pro is the leading solution for website analysis and the detection of broken and other problem links.
Here, try this free trial version:
Link Checker Pro - The leading website analysis and link verification tool
I hope this will help …

Debbie, you can override it if you want

If you have an .htaccess in / with

ErrorDocument 404 /my404.php

and in /somefolder you have

ErrorDocument 404 /somefolder/my404.php

/my404.php will be served everywhere, except for /somefolder and all its subfolders; there /somefolder/my404.php will be served.
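
The rule is basically "deepest folder with its own setting wins". Here is a small Python model of that lookup, just to make the behaviour concrete. It is not how Apache is implemented, and it assumes the folders holding the .htaccess files actually exist; `resolve_error_document` and the `overrides` dictionary are invented for the illustration.

```python
def resolve_error_document(request_path, overrides):
    """Return the 404 page set by the deepest directory containing request_path.

    overrides maps a directory ("/" or "/somefolder") to the page named in
    that directory's ErrorDocument 404 line.
    """
    best_prefix, best_document = None, None
    for directory, document in overrides.items():
        # Normalise to a trailing slash so "/somefolder" does not match
        # "/somefolder2/...".
        prefix = directory.rstrip("/") + "/"
        if request_path.startswith(prefix):
            if best_prefix is None or len(prefix) > len(best_prefix):
                best_prefix, best_document = prefix, document
    return best_document


if __name__ == "__main__":
    overrides = {"/": "/my404.php", "/somefolder": "/somefolder/my404.php"}
    for path in ["/nope.html", "/somefolder/deep/nope.html"]:
        print(path, "->", resolve_error_document(path, overrides))
```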

Could be interesting if different subdirectories use different themes (colors, images, etc).

Just to be clear, we are talking about internal links ONLY with regard to .htaccess, right? If I link to sitepoint.com/gobbledegook, I’m going to get SP’s 404 regardless of what my .htaccess file says.

For internal 404’s, MODx users have the Notify404 extra, which will send you an email whenever a “not found” message is returned. I suppose other popular CMS’s have something comparable. (Personally I had to turn this off as myopic bots were triggering too many false positives.)

Yes, that’s exactly right :slight_smile:

Yes I had that too but quickly turned it off because it kinda flooded my inbox with false positives like you say.

You can use Xenu’s Link Sleuth to check for broken links, and then you should set up 301 redirects so that you still keep the link value.

Check your error logs, and use mod_rewrite to redirect dead links to their new page or a page with similar content.
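
A rough Python sketch of that workflow, with some assumptions spelled out: it reads access-log lines in Common Log Format (not the error log), and it prints mod_alias `Redirect 301` lines, which are a simpler stand-in for mod_rewrite rules. The sample paths and the function names are made up, and you still have to decide by hand where each dead URL should point.

```python
import re
from collections import Counter

# Pulls the request path and status code out of a Common Log Format line.
LOG_LINE = re.compile(r'"(?:GET|HEAD|POST) (\S+) [^"]*" (\d{3}) ')


def count_404s(log_lines):
    """Count how often each requested path returned a 404."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and match.group(2) == "404":
            hits[match.group(1)] += 1
    return hits


def redirect_rules(mapping):
    """Turn {old_path: new_path} into 'Redirect 301' lines for .htaccess."""
    return [f"Redirect 301 {old} {new}" for old, new in sorted(mapping.items())]


if __name__ == "__main__":
    sample = [
        '127.0.0.1 - - [10/Oct/2011:13:55:36 -0700] "GET /old-page.html HTTP/1.1" 404 209',
        '127.0.0.1 - - [10/Oct/2011:13:56:01 -0700] "GET /index.php HTTP/1.1" 200 5120',
    ]
    print(count_404s(sample).most_common())
    print("\n".join(redirect_rules({"/old-page.html": "/new-page.html"})))
```

Sorting by hit count shows which dead URLs are worth redirecting first; one-off hits from bots can usually be ignored.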

Allaire’s HomeSite used to do it across your pages. :cool: Man, I miss that program sometimes. (I know it got bundled in with Dreamweaver, but it just isn’t the same.)

The Microsoft SEO Toolkit is quite handy for this. For once, something useful and free to come out of MSFT: www.microsoft.com/web/seo

I have HomeSite installed on this very machine. :wink:

But like a few others, I use Xenu. It works a charm!

Any open-source solutions out there??

(I’m on a Mac and don’t do Windows or proprietary software.)

Debbie
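
If nothing off the shelf fits, a do-it-yourself checker is not much code. Below is a minimal sketch using only the Python standard library, so it should run on a Mac without installing anything extra. It checks the links of a single page, not a whole site, and every name in it (`LinkCollector`, `find_dead_links`, the example.com URLs) is made up for illustration. Some servers refuse HEAD requests, so expect the odd false positive.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects absolute http(s) URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                url = urljoin(self.base_url, value)
                # Skip mailto:, javascript: and other non-web schemes.
                if url.startswith(("http://", "https://")):
                    self.links.append(url)


def extract_links(html, base_url):
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def http_status(url, timeout=10):
    """Return the HTTP status for url, or None if the server was unreachable."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # 404, 500, ...
    except OSError:
        return None  # DNS failure, refused connection, timeout


def find_dead_links(html, base_url, get_status=http_status):
    """Return (url, status) pairs for links that fail or answer with 4xx/5xx."""
    dead = []
    for link in extract_links(html, base_url):
        status = get_status(link)
        if status is None or status >= 400:
            dead.append((link, status))
    return dead


if __name__ == "__main__":
    page = '<a href="/ok.html">ok</a> <a href="gone.html">gone</a>'
    fake_statuses = {"http://example.com/ok.html": 200,
                     "http://example.com/gone.html": 404}
    print(find_dead_links(page, "http://example.com/", fake_statuses.get))
```

The `get_status` parameter is there so the checker can be tried against a dictionary of canned answers, as in the demo at the bottom, before pointing it at a live site.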