var innerHTML = '<strong><span id="cke_newspan_1508682170249551" style="color:#F9E500">a</span></strong><br>';
I need to match the letter ‘a’ in the middle. The letter can sometimes be a number as well, and is sometimes followed by either a . or a ) and then more text. There can sometimes be multiple span tags or other kinds of tags wrapping this string as well.
For some reason a simple
/>([A-z])+\</.test( innerHTML );
won’t even return true. Any help would be appreciated. Thanks
OK, so after 4 hours of pulling my hair out I tried pasting the innerHTML string output from the Chrome console into Dreamweaver, and it wouldn’t save, saying some characters were not encoded. Strange, I thought. So I ran innerHTML through
var bytelike= unescape(encodeURIComponent(innerHTML));
var innerHTML= decodeURIComponent(escape(bytelike));
then logged innerHTML to the console again and, sure enough, there were lots of strange characters. I don’t know much at all about encoding, but I found
and this seems to have solved my problem; now the RegExps are working as they’re supposed to. The innerHTML output comes from a rich text editor called CKEDITOR, which I thought I had mostly figured out, but this was a real curveball. Thanks for your help though.
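For anyone hitting the same thing, here is a minimal sketch of one way to see which characters are causing trouble, by listing every character above x7F with its position and code point. The helper name findNonAscii and the sample string (with a zero-width space, U+200B, hidden before the letter) are assumptions for illustration only, not the actual editor output.

```javascript
// Hypothetical helper: report every character outside the ASCII range
// (x00 to x7F) together with its index and code point.
function findNonAscii(text) {
  var hits = [];
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);
    if (code > 0x7F) {
      var hex = code.toString(16).toUpperCase().padStart(4, "0");
      hits.push({ index: i, hex: "U+" + hex });
    }
  }
  return hits;
}

// Assumed sample: an invisible zero-width space sits right before the "a".
var sample = "<span>\u200Ba</span>";
console.log(findNonAscii(sample));
```

Logging the result this way shows hidden characters that the console would otherwise render as nothing at all.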
Well, hex x00 to x7F are the common ASCII characters,
and that replace regex is removing any characters beyond that range.
Thing is, I’m not seeing any characters in that string that aren’t single byte so I don’t understand why that regex would be needed.
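The exact snippet is not shown in the thread, so the following is only a sketch of the kind of replace being described. It assumes, purely for illustration, that the markup contains an invisible zero-width space (U+200B) before the letter; characters like that would not be visible in a pasted string. The variable names and the pattern are hypothetical, written to fit the question's description (a letter or number, optionally followed by . or ) and more text).

```javascript
// Assumed input: markup like the question's, with a hidden U+200B before the letter.
var dirty = '<strong><span style="color:#F9E500">\u200Ba</span></strong><br>';

// One or more letters or digits right after ">", optionally followed by
// "." or ")" and more text, up to the next "<".
var pattern = />([A-Za-z0-9]+)([.)][^<]*)?</;

console.log(pattern.test(dirty)); // false: the hidden character sits between ">" and "a"

// Remove everything outside the ASCII range x00 to x7F.
var clean = dirty.replace(/[^\x00-\x7F]/g, "");

console.log(pattern.test(clean)); // true
console.log(clean.match(pattern)[1]); // "a"
```

Note that this replace also throws away any legitimate non-ASCII text (accented letters, for example), so it only suits content that is expected to be plain ASCII.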
But it does sound like there is an encoding conflict somewhere. IMHO, the best way to avoid encoding problems is to make sure you have UTF-8 without BOM everywhere. (* assuming you don’t need higher)
That means your text editor, HTTP headers, meta tags, database character set and collation: essentially everywhere you can specify it.