I have a table with "news" in it. Some of the content contains links to other sites in the form of html markup. I have extracted the data into a text file, and I want to tidy it up so that it validates as xhtml strict.
I would like to run an SQL query to pull out the fldBody field, then loop through the recordset and use a regular expression to do some kind of find/replace routine to tidy up the HTML.
I know this is asking a lot!
So, for example, if I had this in the fldBody field:
Here is some news about <a href=http://www.google.com>google</a> which I read today
It'd be good to find a way to convert it to:
Here is some news about <a href="http://www.google.com">google</a> which I read today
I have no idea if this can be done, and I am totally useless at Regular Expressions. I've spent ages looking at info about them on the web, but cannot work out how they work.
You're pushing to achieve the impossible in ASP.
You would have to use components and have access to a dedicated server.
You can then use a combination of third party tools to tidy up things.
HTMLTidy is one of them and it's open source.
There are probably others.
If the problem is only links (i.e. <a href=http://noquotes.com> -> <a href="http://noquotes.com">),
it's easier to write down a set of rules.
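For the links-only case, such a set of rules might look like the sketch below. It is written in Python purely for illustration, not ASP, and the function name quote_href_values is made up for this example. It assumes an unquoted value ends at the first whitespace or ">", and it will also touch any "href=" that appears in ordinary text outside a tag.

```python
def quote_href_values(html: str) -> str:
    """Wrap unquoted href values in double quotes using plain string scanning."""
    marker = "href="
    out = []
    i = 0
    n = len(html)
    while i < n:
        # Copy characters through until we are sitting on "href=" (any case).
        if html[i:i + len(marker)].lower() != marker:
            out.append(html[i])
            i += 1
            continue
        out.append(html[i:i + len(marker)])
        i += len(marker)
        # Rule 1: a value that already starts with a quote is copied unchanged,
        # up to and including its closing quote.
        if i < n and html[i] in "\"'":
            close = html.find(html[i], i + 1)
            end = n if close == -1 else close + 1
            out.append(html[i:end])
            i = end
            continue
        # Rule 2: an unquoted value runs up to the next whitespace or '>'.
        end = i
        while end < n and not html[end].isspace() and html[end] != ">":
            end += 1
        # Rule 3: emit that value wrapped in double quotes.
        out.append('"' + html[i:end] + '"')
        i = end
    return "".join(out)


before = "Here is some news about <a href=http://www.google.com>google</a> which I read today"
print(quote_href_values(before))
```

The same three rules map onto InStr, Mid and string concatenation in VBScript, although that port is not shown or tested here.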
It is probably easier to do a series of tag inspections + VB search/replace
than to use regular expressions, which I am not sure exist in Classic ASP anyway.
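For what it's worth, VBScript 5 and later does ship a RegExp object (Pattern, Global, IgnoreCase, Replace), so a regular expression is an option in Classic ASP. Below is a rough sketch of the kind of pattern that would handle the example in the question. It is in Python for illustration only; the names UNQUOTED_HREF and quote_hrefs are made up here, and whether the pattern ports unchanged to VBScript would need checking. It only fixes href inside <a> tags and leaves other unquoted attributes alone.

```python
import re

# Group 1: "<a", any other attributes, then "href=".
# Group 2: an unquoted value (no whitespace, quotes or '>').
UNQUOTED_HREF = re.compile(r'(<a\b[^>]*?\bhref=)([^\s"\'>]+)', re.IGNORECASE)


def quote_hrefs(html: str) -> str:
    """Put double quotes around unquoted href values in <a> tags."""
    return UNQUOTED_HREF.sub(r'\1"\2"', html)


# Stand-in for looping over the recordset: each string plays the part of fldBody.
bodies = [
    "Here is some news about <a href=http://www.google.com>google</a> which I read today",
    'Already fine: <a href="http://www.google.com">google</a>',
]
for body in bodies:
    print(quote_hrefs(body))
```

Values that are already quoted do not match the second group, so running the routine twice over the same field should be harmless.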