  1. #1
    flyingpylon

    Converting Smart Quotes, etc.

    I've got a situation where I need to replace "smart quotes" and similar characters that are created (presumably) by Microsoft Word. If you don't already know, smart quotes are the "curly quotes" (“ ” and ‘ ’) that curl or angle in toward the text on each side.

    I thought I had just used a standard replace function in the past for these things, and that seemed to work when the text was coming from a form. However, now I am receiving them from a scraped web page, which I have retrieved using ServerXMLHTTP. In this case, a standard replace is not working.

    I've noticed that when I use Server.HTMLEncode on the text, left and right smart quotes, angled apostrophes, em dashes, etc. all get encoded as &#65535, which tells me something goofy is going on. I don't know whether this is a character encoding issue or something else. I could HTMLEncode the string and do a replace on &#65535, but that also encodes other things that I don't want encoded, and there appears to be no such thing as Server.HTMLDecode in classic ASP.

    Can anyone offer any advice?
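
The symptom in this post can be reproduced outside ASP. The sketch below is in Python rather than VBScript, and it assumes the scraped page sends Word-style punctuation as single Windows-1252 bytes (0x91 to 0x97). Decoding those bytes as UTF-8 turns every one of them into the same placeholder character, so a replace that searches for the real curly quote has nothing to match. Python substitutes U+FFFD (65533) where the post reports 65535; the exact placeholder depends on the platform, so treat this as an illustration of the likely cause, not a trace of what ServerXMLHTTP does internally.

```python
# Assumed sample: the bytes a Windows-1252 page would send for
# curly double quotes, an em dash, and a curly apostrophe.
raw = b"\x93smart\x94 quotes \x97 and it\x92s"

# Decoding with the wrong charset: 0x93, 0x94, 0x97 and 0x92 are not valid
# UTF-8 on their own, so each becomes the replacement character U+FFFD.
wrong = raw.decode("utf-8", errors="replace")

# A plain replace that looks for the real curly quotes finds nothing.
after_replace = wrong.replace("\u201c", '"').replace("\u201d", '"')

print(wrong.count("\ufffd"))   # number of bytes that were lost
print(after_replace == wrong)  # True when the replace changed nothing
```

Once the distinct characters have collapsed into one placeholder, the original punctuation cannot be recovered from the string, which is why the fix has to happen at decoding time.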

  2. #2
    flyingpylon
    After further investigation, it appears that the problem is indeed related to character encoding. The web page I'm grabbing is ISO-8859-1, while ServerXMLHTTP automatically uses UTF-8. It is possible to set the codepage used by ServerXMLHTTP, but it's not possible to set it to ISO-8859-1.

    So, I think I'm just stuck. I will probably have to code a lot of crazy workarounds to get the results I want.
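
One way around the charset mismatch is to work from the raw response bytes and decode them with an explicit charset before doing any replacement. The sketch below shows the idea in Python; the function name `decode_and_normalize` and the mapping table are hypothetical, chosen for illustration. It also assumes the page really contains Windows-1252 bytes: strict ISO-8859-1 has no curly quotes (0x80 to 0x9F are control codes there), and pages labelled ISO-8859-1 frequently carry Windows-1252 content. In classic ASP the commonly described equivalent is to read `responseBody` instead of `responseText` and pass it through an `ADODB.Stream` with its `Charset` property set; that route is not tested here.

```python
# Hypothetical mapping of Word-style punctuation to plain ASCII.
SMART_TO_ASCII = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # ellipsis
}

def decode_and_normalize(raw: bytes, charset: str = "cp1252") -> str:
    """Decode raw bytes with an explicit charset, then flatten smart punctuation."""
    text = raw.decode(charset)
    return text.translate(str.maketrans(SMART_TO_ASCII))

# Assumed sample bytes, as a Windows-1252 page would send them.
sample = b"\x93smart\x94 quotes \x97 it\x92s\x85"
print(decode_and_normalize(sample))
```

Decoding first means the replacement step sees real characters, so nothing else in the page has to be HTML-encoded to get at them.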

  3. #3
    greg.harvey
    Use UTF-8 for everything and it should be fine.
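
As a small illustration of that advice, here is a Python sketch showing that curly quotes survive a UTF-8 round trip unchanged, so no replacement is needed when every stage (source page, fetch, storage, output) agrees on UTF-8. It does not help when the source page is in another charset, as in this thread, unless the bytes are decoded from that charset first.

```python
# Curly double quotes around a word, written with escapes for clarity.
text = "\u201csmart\u201d"

# In UTF-8 each curly quote becomes a three-byte sequence.
encoded = text.encode("utf-8")

# Decoding with the same charset gives back exactly the original text.
round_tripped = encoded.decode("utf-8")

print(encoded)
print(round_tripped == text)
```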

