  1. #1
    SitePoint Enthusiast jcwacky's Avatar
    Join Date
    Sep 2007
    Location
    Edinburgh, Scotland
    Posts
    27
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    How do you all go about converting special characters to HTML?

    How do you all go about converting special characters such as '&' and '£' into their HTML entities (&amp;, &pound; etc.)?

    I code in Coda, and regularly copy and paste blocks of text that clients have provided from Word or an e-mail into the HTML. I then have to go through it and manually change each occurrence of & or £ into its entity. I know find and replace is an option, but surely there is some way to convert the characters automatically when copying the text?

    How do you all go about doing this?

    Many Thanks
    James

  2. #2
    Nicking the Bevel Highway Seven's Avatar
    Join Date
    Nov 2008
    Location
    The Open Road
    Posts
    350
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I just use &amp; or &#xxxx; (where xxxx is the character's decimal code).
    Daniel

  3. #3
    SitePoint Enthusiast jcwacky's Avatar
    Join Date
    Sep 2007
    Location
    Edinburgh, Scotland
    Posts
    27
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks, but when you have a large block of text which needs all the & and £ characters replacing with &amp; etc., how do you go about it?

  4. #4
    Nicking the Bevel Highway Seven's Avatar
    Join Date
    Nov 2008
    Location
    The Open Road
    Posts
    350
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by jcwacky View Post
    Thanks, but when you have a large block of text which needs all the & and £ characters replacing with &amp; etc., how do you go about it?
    Ah, you are talking about taking existing content and formatting it for the web.

    In the situations where I've had to do this, I just did it all manually. Not sure if there's an automated method.
    Daniel

  5. #5
    SitePoint Author silver trophybronze trophy

    Join Date
    Nov 2004
    Location
    Ankh-Morpork
    Posts
    12,158
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    If at all possible I try to use a character encoding that doesn't require any characters to be escaped. Like UTF-8.

    Otherwise I escape them as I type them in. I never have to copy text from somewhere else and escape characters afterwards, but if I did, I'd just use the replace feature in my editor (Vim).
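For pasted text, the find-and-replace step can also be scripted. Below is a minimal Python sketch, not anything a poster in this thread used: the function name `to_entities` is made up for illustration, and it assumes the input is plain text that has not been escaped already (an existing &amp; would be escaped a second time).

```python
import html
from html.entities import codepoint2name


def to_entities(text: str) -> str:
    """Escape &, <, > and replace every non-ASCII character with an entity.

    Hypothetical helper for illustration; assumes `text` is not already escaped.
    """
    # html.escape turns & into &amp;, < into &lt; and > into &gt;
    escaped = html.escape(text, quote=False)
    parts = []
    for ch in escaped:
        code = ord(ch)
        if code < 128:
            # Plain ASCII is left as it is
            parts.append(ch)
        elif code in codepoint2name:
            # Use the named entity when HTML 4 defines one (e.g. &pound;)
            parts.append(f"&{codepoint2name[code]};")
        else:
            # Otherwise fall back to a decimal numeric character reference
            parts.append(f"&#{code};")
    return "".join(parts)


print(to_entities("Fish & chips: £5"))
```

Run over a block of client copy, this would replace the manual pass. If the page is served as UTF-8, only the `html.escape` step is really needed.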
    Birnam wood is come to Dunsinane

  6. #6
    In memoriam gold trophysilver trophybronze trophy Dan Schulz's Avatar
    Join Date
    May 2006
    Location
    Aurora, Illinois
    Posts
    15,476
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    It's best to learn them (and like Tommy, I also use UTF-8 almost exclusively when possible); the best way to learn is to actually use it while having a reference like this one handy: http://leftlogic.com/lounge/articles/entity-lookup/

    Edit:

    Oops, wrong link, but still pretty darn handy. Back to my mountain of bookmarks to dig out the right one I go...

  7. #7
    Nicking the Bevel Highway Seven's Avatar
    Join Date
    Nov 2008
    Location
    The Open Road
    Posts
    350
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Tommy and Dan — what's the benefit of using UTF-8?
    Daniel

  8. #8
    SitePoint Author silver trophybronze trophy

    Join Date
    Nov 2004
    Location
    Ankh-Morpork
    Posts
    12,158
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Highway Seven View Post
    Tommy and Dan — what's the benefit of using UTF-8?
    US-ASCII can only represent 95 printable characters.
    ISO 8859-1, Windows-1252 and similar add a few, but the number is still less than 250.
    UTF-8 can represent any character in the ISO/IEC 10646 repertoire (basically equivalent to Unicode).

    If you use US-ASCII or ISO 8859-1 you can't represent, e.g., the typographically correct quotation marks literally. You'll have to use entity references (&ldquo;, &rdquo;) or numeric character references (&#8220;, &#8221;). With UTF-8 you can just enter those characters as-is (“...”), which saves a bit of bloat, but is also far less error prone.
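The difference can be checked with a short Python sketch. The helper name `can_encode` is an assumption made for this example; the encodings are the ones named above.

```python
LEFT_QUOTE = "\u201c"  # LEFT DOUBLE QUOTATION MARK


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in `text` exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Only UTF-8 of these three can store the curly quote literally
for encoding in ("ascii", "iso-8859-1", "utf-8"):
    print(encoding, can_encode(LEFT_QUOTE, encoding))

# Size of the literal character in UTF-8 versus the two escaped forms
print(len(LEFT_QUOTE.encode("utf-8")))   # bytes for the literal character
print(len("&ldquo;"), len("&#8220;"))    # characters (and bytes) per escape
```

The literal character takes 3 bytes in UTF-8, while either escape takes 7, which is the "bloat" being referred to.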
    Birnam wood is come to Dunsinane

  9. #9
    Nicking the Bevel Highway Seven's Avatar
    Join Date
    Nov 2008
    Location
    The Open Road
    Posts
    350
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by AutisticCuckoo View Post
    US-ASCII can only represent 95 printable characters.
    ISO 8859-1, Windows-1252 and similar add a few, but the number is still less than 250.
    UTF-8 can represent any character in the ISO/IEC 10646 repertoire (basically equivalent to Unicode).

    If you use US-ASCII or ISO 8859-1 you can't represent, e.g., the typographically correct quotation marks literally. You'll have to use entity references (&ldquo;, &rdquo;) or numeric character references (&#8220;, &#8221;). With UTF-8 you can just enter those characters as-is (“...”), which saves a bit of bloat, but is also far less error prone.
    Thanks Tommy, but I'm not sure if I fully understand what you're saying (though I'd like to).

    How are you bypassing typing the UTF-8 code and just putting the quotation marks (or other symbol) directly in the HTML, as is?
    Daniel

  10. #10
    SitePoint Author silver trophybronze trophy

    Join Date
    Nov 2004
    Location
    Ankh-Morpork
    Posts
    12,158
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Highway Seven View Post
    Thanks Tommy, but I'm not sure if I fully understand what you're saying (though I'd like to).

    How are you bypassing typing the UTF-8 code and just putting the quotation marks (or other symbol) directly in the HTML, as is?
    Just type them in, exactly like any other character. Of course, your editor must provide a means to input those characters (e.g., by entering the code position in decimal or hexadecimal). You can also copy the characters from somewhere else and paste them in.
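To illustrate what "code position in decimal or hexadecimal" means, here is a small Python sketch. It only shows that the different notations name the same character; how an editor accepts the number is editor-specific and not covered here.

```python
import html

# The same code position written in hexadecimal and in decimal
left_quote = chr(0x201C)
assert left_quote == chr(8220)

# A browser resolves all three escaped spellings to that same character
for escaped in ("&ldquo;", "&#8220;", "&#x201C;"):
    assert html.unescape(escaped) == left_quote

# In a UTF-8 file the character is simply stored as these bytes
print(left_quote.encode("utf-8"))
```

So typing the character directly and saving the file as UTF-8 gives the reader the same result as any of the escapes.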
    Birnam wood is come to Dunsinane

  11. #11
    In memoriam gold trophysilver trophybronze trophy Dan Schulz's Avatar
    Join Date
    May 2006
    Location
    Aurora, Illinois
    Posts
    15,476
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    This article will tell you. (I'm not going to let Tommy re-tell the story he already told in a featured article.)

    http://www.sitepoint.com/article/gui...cter-encoding/

  12. #12
    Nicking the Bevel Highway Seven's Avatar
    Join Date
    Nov 2008
    Location
    The Open Road
    Posts
    350
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Dan Schulz View Post
    This article will tell you. (I'm not going to let Tommy re-tell the story he already told in a featured article.)

    http://www.sitepoint.com/article/gui...cter-encoding/
    I guess I should have clarified my question: what is the benefit of using UTF-8 over ASCII coding (&#xxxx;)? I didn't see any argument supporting UTF-8 as the preferred method in the article; it seemed to support both methods equally.
    Daniel

  13. #13
    SitePoint Enthusiast
    Join Date
    Nov 2008
    Posts
    65
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Highway Seven View Post
    what is the benefit of using UTF-8 over ASCII coding (&#xxxx;)? I didn't see any argument supporting UTF-8 as the preferred method in the article; it seemed to support both methods equally.
    If you end up with a lot of escapes in your code, your source can become more difficult to read and edit. It's much easier to edit when you can directly see all the characters.

    On the other hand, if you don't use any escapes, you'll need to find a convenient way to insert characters you can't type with a keyboard.

    I also encode my documents in UTF-8, but if I only need to insert a character or two that is outside of ASCII, I may just use an escape because it's very easy to type in and will not muck up my source.
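The readability point is easy to see by escaping a short sentence. This is a sketch using Python's built-in `xmlcharrefreplace` error handler; the function name `escape_non_ascii` and the sample sentence are made up for the example.

```python
def escape_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with a decimal character reference."""
    # The "xmlcharrefreplace" handler emits &#NNNN; for anything ASCII lacks
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


literal = "“Café” costs £5"
print(literal)                    # what you would see in a UTF-8 source file
print(escape_non_ascii(literal))  # the same text with every escape spelled out
```

One or two escapes are harmless, but a paragraph full of them is much harder to proofread than the literal UTF-8 text.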

  14. #14
    SitePoint Guru glenngould's Avatar
    Join Date
    Nov 2005
    Posts
    661
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    On a Mac, it's very handy to be able to type “ ” © etc. directly from the keyboard. Is there a way to do this in Windows that I'm not aware of (a custom keyboard or utility)?

