JavaScript - By Craig Buckler

How to Stop Spam Harvesting With Email Obfuscation

The day I discovered the “mailto:” link was glorious. I could publish my address on a web page and anyone could email me with a single click. This was in the more innocent days of the web – before the spam harvesters took over. Use a “mailto:” today and your first Viagra message will appear 30 seconds later. So how can you publish an email address without attracting unwanted attention from spammers?

The most obvious solution is to use a machine-unreadable email in your HTML, e.g. “bob (at) bobsdomain dot com”. Whilst this makes it difficult for spammers, it also makes it difficult for your users.

Another option is to generate the email address using JavaScript, perhaps with a little string concatenation or encoding e.g.


<p>contact 
<script type="text/javascript">
document.write('<a href="mai'+"lto:"+"bob"+'@'+'bobsdomain.com">bob@'+"bobsdomain.com</a>");
</script>
</p>

This will stop most spammers, but anyone with JavaScript disabled will not see your address. (I would not recommend using document.write either.)

A better solution is to use a combination of techniques to thwart spammers without causing user difficulties. The first step is to use a human-readable but harvester-proof email address in our HTML. We will also make this a link to a contact page, e.g.


<p>Contact <a href="contact.html" class="email">bob (at) bobsdomain dot com</a></p>

Note that we have included a class of “email” so our link can be identified. The next step is to write a JavaScript function which searches your page for obfuscated emails and transforms them into real “mailto:” links. We will create an ‘email.js’ file and include it in our HTML:


<script type="text/javascript" src="email.js"></script>

The required code is short, so we do not need a JavaScript library:

Content of email.js:


function EmailUnobsfuscate() {
	
	// find all links in HTML
	var link = document.getElementsByTagName && document.getElementsByTagName("a");
	var email, e;
	
	// examine all links
	for (e = 0; link && e < link.length; e++) {
	
		// does the link use a class named "email"?
		if ((" "+link[e].className+" ").indexOf(" email ") >= 0) {
		
			// get the obfuscated email address
			email = ((link[e].firstChild && link[e].firstChild.nodeValue) || "").toLowerCase();
			
			// transform into real email address
			email = email.replace(/dot/ig, ".");
			email = email.replace(/\(at\)/ig, "@");
			email = email.replace(/\s/g, "");
			
			// is email valid?
			if (/^[^@]+@[a-z0-9]+([_.-]{0,1}[a-z0-9]+)*([.]{1}[a-z0-9]+)+$/.test(email)) {
			
				// change into a real mailto link
				link[e].href = "mailto:" + email;
				link[e].firstChild.nodeValue = email;
		
			}
		}
	}
}

An explanation of the code:

  1. Line 4 fetches every <a> link in our HTML page and line 8 loops through them.
  2. Line 11 checks the link for a class of “email”.
  3. Line 14 grabs the obfuscated email from the text content of the node.
  4. Lines 17 to 19 transform it to a real email address using regular expressions: “dot” is changed to a “.”, “(at)” is changed to “@”, and all spaces are removed.
  5. Line 22 checks the resulting email address is valid.
  6. Lines 25 and 26 then modify the DOM node and make it into a real “mailto:” link.
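
To see the string handling on its own, here is a minimal sketch that pulls the transformation and validation steps into a standalone function you can run outside the browser. The name deobfuscate is my own and is not part of email.js. Note that the literal parentheses in “(at)” and the whitespace class must be backslash-escaped in the regular expressions, otherwise they match the wrong characters.

```javascript
// Hypothetical helper (not part of email.js): the same string steps, minus the DOM.
function deobfuscate(text) {
  var email = String(text || "").toLowerCase();

  email = email.replace(/dot/ig, ".");     // "dot"  becomes "."
  email = email.replace(/\(at\)/ig, "@");  // "(at)" becomes "@"
  email = email.replace(/\s/g, "");        // strip all whitespace

  // same validity check as the article's function
  var valid = /^[^@]+@[a-z0-9]+([_.-]{0,1}[a-z0-9]+)*([.]{1}[a-z0-9]+)+$/.test(email);
  return valid ? email : null;
}

console.log(deobfuscate("bob (at) bobsdomain dot com")); // prints bob@bobsdomain.com
console.log(deobfuscate("not an address"));              // prints null
```

Returning null for anything that fails the check mirrors the original behaviour: the link is only rewritten when a valid address comes out of the transformation.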

Finally, we need to ensure the function runs on page load by adding a line to the bottom of email.js:


window.onload = EmailUnobsfuscate;

The result:

  • Our original HTML page contains no “mailto:” links and cannot be easily harvested by spammers.
  • The majority of users (those with JavaScript enabled) will see a standard email address and “mailto:” link.
  • Anyone not running JavaScript will see the readable “bob (at) bobsdomain dot com” address.

The intention of this article is to show the concept rather than real code. Although the example works, I suggest you:

  • Use your own obfuscated email format, e.g. “bob {@} bobsdomain -dot- com”. Spammers can read this article and transform encoded emails just as easily as you!
  • Use a different link identifier class – “email” is a little obvious!
  • Use a JavaScript library, such as jQuery, to make the function shorter. You should also ensure it copes with whitespace or other DOM nodes around the email address text (not handled in the code above).
  • Replace the window.onload with a more robust event handler.
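
As a rough illustration of those suggestions, the sketch below uses the “{@}” and “-dot-” format, a made-up class name of “reach”, and a DOMContentLoaded listener instead of window.onload. It uses plain DOM methods rather than jQuery, and it assumes a browser that supports querySelectorAll, textContent and addEventListener, and a script that loads before the page has finished parsing. The function names decodeAddress and upgradeLinks are hypothetical. Treat it as a starting point rather than finished code.

```javascript
// Hypothetical format: "bob {@} bobsdomain -dot- com"
function decodeAddress(text) {
  return String(text || "")
    .toLowerCase()
    .replace(/-dot-/g, ".")   // "-dot-" becomes "."
    .replace(/\{@\}/g, "@")   // "{@}"   becomes "@"
    .replace(/\s/g, "");      // strip all whitespace, including line breaks
}

// Hypothetical class name "reach" marks the links to convert.
function upgradeLinks(root) {
  var links = root.querySelectorAll("a.reach");
  for (var i = 0; i < links.length; i++) {
    // textContent flattens any nested nodes and keeps surrounding whitespace,
    // which decodeAddress then removes
    var email = decodeAddress(links[i].textContent);
    if (/^[^@]+@[a-z0-9]+([_.-]?[a-z0-9]+)*(\.[a-z0-9]+)+$/.test(email)) {
      links[i].href = "mailto:" + email;
      links[i].textContent = email;
    }
  }
}

// Run once the DOM is parsed, without overwriting other load handlers.
// The guard lets the file load outside a browser as well.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", function () {
    upgradeLinks(document);
  });
}
```

Because addEventListener adds a handler rather than replacing one, other scripts on the page can register their own load code without conflict.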

Best of luck.
