I previously wrote about using jQuery to Strip All HTML Tags From a Div. This time the goal is to remove all bad characters from an HTML string (which may have been provided by a $.getScript() call or similar).

This is how you can easily clean up your HTML and remove bad characters. It is useful when you get HTML from somewhere else and want to .match() for strings, but the .match() throws an error because of bad characters. We can do this with a regex and still retain our HTML tags like so:

//clean up string/HTML (remove bad chars but keep html tags)
//the ^ must come first inside the brackets so the class is negated
rawData = rawData.replace(/[^<>a-zA-Z 0-9]+/g,'');
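A quick way to see what a whitelist pattern of this kind does is to run it on a small polluted string. The sketch below is only illustrative: the names rawSample and cleanedSample and the sample input are assumptions, not part of the original post. The caret sits directly after the opening bracket, which negates the class so that everything not listed is removed.

```javascript
// Hypothetical input: markup polluted with a NUL byte, a BOM and a curly apostrophe.
var rawSample = '<p>Hello\u0000 wor\uFEFFld\u2019s 42</p>';

// [^...] is a negated class: every character NOT listed is matched and removed.
var cleanedSample = rawSample.replace(/[^<>a-zA-Z 0-9]+/g, '');

console.log(cleanedSample); // <p>Hello worlds 42<p>
```

Note that the slash of the closing tag is stripped too, because "/" is not in the whitelist. That is a good reason to extend the list of allowed characters when the markup structure matters.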

If we want to be more specific, we can extend the whitelist to keep other common characters that markup needs, such as slashes, quotes and equals signs:

//clean up HTML ready to be used with match() statement
//(the hyphen is escaped so it is a literal "-" and not a "+" to "<" range)
rawData = rawData.replace(/[^\/"_+\-<>=a-zA-Z 0-9]+/g,'');
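As a rough sketch of how an extended whitelist behaves before a .match() call, the example below cleans a small fragment and then extracts an attribute. The names rawFragment, matchReady and hrefMatch and the sample markup are hypothetical. The hyphen is escaped so that it is treated as a literal character rather than as a range operator between "+" and "<".

```javascript
// Hypothetical input: an anchor tag whose text contains an accented letter and an en dash.
var rawFragment = '<a href="/docs/page_1" class="nav-link">Caf\u00e9 \u2013 menu</a>';

// Keep slashes, quotes, underscores, plus, hyphen, angle brackets, equals,
// letters, spaces and digits; remove everything else.
var matchReady = rawFragment.replace(/[^\/"_+\-<>=a-zA-Z 0-9]+/g, '');

// The tags and attributes survive, so a simple match() can pull out the href.
var hrefMatch = matchReady.match(/href="([^"]+)"/);

console.log(matchReady);   // <a href="/docs/page_1" class="nav-link">Caf  menu</a>
console.log(hrefMatch[1]); // /docs/page_1
```

With the hyphen escaped, characters such as ".", ":" and "," are not in this whitelist, so absolute URLs would lose their dots and colons. Add them to the class if your attribute values need them.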

cleanUpHTML() Function

I wrote this little function to help with cleaning up HTML so it is ready for use with regex.

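/* clean up HTML for use with .match() statement or regex */
var JQUERY4U = {
	UTIL: {
		cleanUpHTML: function(html) {
			//global regex so that every single quote is converted, not just the first
			html = html.replace(/'/g,'"');
			//brackets and the hyphen are escaped so they are treated as literal characters
			html = html.replace(/[^\/"_+\-?!<>\[\]{}()=*.|a-zA-Z 0-9]+/g,'');
			return html;
		}
	}
};
var cleanedHTML = JQUERY4U.UTIL.cleanUpHTML(htmlString);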
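One detail is worth checking when normalising quotes: String.prototype.replace() with a plain string pattern only replaces the first occurrence, while a regex with the g flag replaces all of them. The short comparison below uses hypothetical names (attrs, firstOnly, everyOne) and a made-up input string.

```javascript
// Hypothetical input with four single quotes.
var attrs = "<input type='text' name='q'>";

// String pattern: only the first single quote is replaced.
var firstOnly = attrs.replace("'", '"');

// Regex with the g flag: every single quote is replaced.
var everyOne = attrs.replace(/'/g, '"');

console.log(firstOnly); // <input type="text' name='q'>
console.log(everyOne);  // <input type="text" name="q">
```

For a clean-up helper that runs before .match(), the global form is usually the one you want, since mixed quote styles make later patterns harder to write.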

Sam Deering is a Front-end Web Developer who specialises in JavaScript & jQuery. Sam is driven and passionate about sharing his knowledge to educate others.
