How do I remove a trailing quote from a URL using an .htaccess RewriteRule?

I don’t get tripped up all that often any more, but this has me stumped. Sometimes, because of bad links on other sites, requests hit my site with a trailing quote on the URL. For example:

http://example.com/somepage.hml"
http://example.com/somepage.hml%22

Because of the trailing quote mark, the user gets a 404 error. What I’d like to do is use an Apache .htaccess RewriteRule to strip the quote mark and redirect the user to the proper page.

Try as I might, I just can’t seem to lock on to the quote mark and strip it out.

Any help would be appreciated.

Ken,

There is a MAJOR problem dealing with double quotes: they are among the Excluded US-ASCII Characters, which are not allowed unescaped in a URI.

That said, so are spaces and there IS a way to deal with them:

regex = [\ ]

Yes, the space character, represented by %20 in a URI, can be matched by including an escaped space within a character range definition. I’d have to guess that you can treat the " (double quote) the same way. Therefore, try:

RewriteEngine on
RewriteRule ^(.*)[\"](.*)$ $1$2 [R=301,L]

Regards,

DK

I tried [\"] earlier in the day, before posting this message, and it did not work. I tried every way I could think of to pick off the quote via .htaccess and could not find any method that worked. I must have spent three or four hours googling this issue and found nothing of help. I finally gave up on the .htaccess route and now use a bit of PHP to strip quotes out of the URI and then send a 301 redirect to the proper page. It works, but I wish I had been able to find an .htaccess solution.

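$domain = "http://example.com";
$REQUEST_URI = $_SERVER['REQUEST_URI'];
// Act only when the URI contains a literal or percent-encoded double quote.
if (strpos($REQUEST_URI, '"') !== false || strpos($REQUEST_URI, '%22') !== false) {
	// Normalize %22 to a literal quote, then keep only what precedes the first quote.
	$REQUEST_URI = str_replace("%22", '"', $REQUEST_URI);
	$aURI = explode('"', $REQUEST_URI);
	$uri = trim($aURI[0]);
	// Send the visitor to the cleaned URL with a permanent redirect.
	header("HTTP/1.1 301 Moved Permanently");
	header("Location: $domain$uri");
	exit();
}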

The idea with the above code is that it strips off everything from the quote mark onward, including the quote itself. Webmasters often forget the first quote on a link but get the second, which causes much of the other link detritus to be passed through as part of the URL.
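
For anyone who still wants an .htaccess route, here is a minimal, untested sketch of the same truncate-at-the-quote idea. It assumes mod_rewrite is enabled, and it rests on the guess that matching against %{THE_REQUEST} (the raw, still-encoded request line) sidesteps whatever kept the plain RewriteRule pattern from matching. \x22 is the regex hex escape for a double quote, used so that no literal quote appears in the config line. Excluding ? from the character class is my own choice, so that only a quote in the path, before any query string, triggers the redirect.

```apache
RewriteEngine On
# THE_REQUEST looks like: GET /somepage.hml%22 HTTP/1.1
# Capture the path up to the first literal (\x22) or encoded (%22) quote.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s(/[^\s\x22?]*?)(?:\x22|%22)
# Redirect to the captured part; NE keeps the captured text from being escaped again.
RewriteRule ^ %1 [R=301,NE,L]
```

With a request for /somepage.hml%22 the captured group would be /somepage.hml, so the redirect should land on the clean URL and not loop, since the new request no longer contains a quote. Any query string on the original request would be carried over by default, so treat this as a starting point and test it against your own bad links.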

Now if there were just a way to pick off a lone % sign. A lone % sign, or a % sign followed by two characters that are not a valid hexadecimal character code, throws a 400 Bad Request error, and I can’t use a custom 400 error file to fix URIs with these issues.

It is amazing how many ways other webmasters can create broken or malformed links to other people’s sites. My goal is to fix as many of these bad links to my site as I can via 301 redirects to the proper page, both to save the link juice and to get users to the page they wanted to begin with.