  1. #1
    SitePoint Wizard DoubleDee's Avatar
    Join Date
    Aug 2010
    Location
    Arizona
    Posts
    3,529
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)

    What happens to the web page's text??

    I listen to the radio show "In The Studio" and I like to save the "Bio" that they post each week for that week's show.

    The weird thing is that if I select the Bio text and paste it into a text editor the text gets all messed up - particularly the periods and commas?!

    Here is this week's show...

    In The Studio

    And here is what the text looks like after I pasted it into TextEdit on my MacBook...

    In The Studio With
    James Gang-Rides Again

    It was the reason I hitched a ride with friends in 1971 to some small Central Ohio college to sit in the dirt infield of the indoor track fieldhouse : to see & hear Cleveland/Akron band The James Gang on a low riser stage , the spotlight reflecting blindingly off the guitar of singer Joe Walsh . Up until then , we had heard radio ads on Akron station WHLO most weekends inviting the public to see the band at an Akron-area high school dance for 50 cents . Precious few outside the Northeast Ohio Cleveland-Youngstown-Akron triangle had purchased their first album , but their follow-up Rides Again was both a critical and popular success . Sure , radio stations then and now play “Funk #49″ ( yep , there’s a “Funk #48″ on their debut , Yer Album ) , but songs like ” Woman” ” and “The Bomber” influenced American hard rock well into the 1980s , and “Tend My Garden” , “There I Go Again” , and the melancholy “Ashes , the Rain, and I “ are all surprisingly timeless four decades later . Joe Walsh is my guest for the hour .

    -Redbeard
    Notice how a space gets added on each side of commas and periods, as well as parentheses and sometimes quotes.

    This may seem like an insignificant thing, but it drives me CRAZY because I have to go edit all of the text each week so it is readable!

    What is going on?! (I do this all the time all over the Internet and have never seen such peculiar behavior!)

    Thanks,


    Debbie

  2. #2
    SitePoint Addict AllanP's Avatar
    Join Date
    Sep 2010
    Location
    Australia
    Posts
    286
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    It appears that someone has used unicode characters to generate this text. If you copy it into notepad and try to save it, it warns you that unicode is present. I saved two copies of this code; one in its unicode format and another in ANSI format. I then examined the hex dump of each copy and found that the unicode version has character 00 between each of the alpha characters. When the same text is saved in ANSI format the 00 characters disappear, except around the commas.

    I have attached the hex dump for each version for you to have a look at (check1.jpg=unicode; check2.jpg=ANSI).
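For readers who cannot see the attached images, here is a minimal sketch of how a similar comparison could be reproduced. It assumes that Notepad's "Unicode" option means UTF-16 little-endian and that "ANSI" means Windows-1252, which is the usual case on Western-language Windows systems. The `hex_dump` helper and the sample string are made up for illustration and are not taken from the attachments.

```python
def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


# Hypothetical sample: a short fragment with the odd "space comma" spacing.
sample = "Again ,"

# Assumption: Notepad's "Unicode" is UTF-16 LE (the byte-order mark is omitted here).
as_utf16 = sample.encode("utf-16-le")
# Assumption: Notepad's "ANSI" is Windows-1252.
as_ansi = sample.encode("cp1252")

print("UTF-16 LE:", hex_dump(as_utf16))
print("ANSI     :", hex_dump(as_ansi))
```

In UTF-16 every plain ASCII letter occupies two bytes, the second of which is 00. The 00 bytes are therefore a normal feature of that encoding and not, by themselves, evidence of a fault on the page.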

    The quickest way I can see of altering the spacing is doing a global search and replace for a >space comma< combination and replacing it with
    >no-space comma<.
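The search-and-replace idea can be sketched as a small script. This is only an illustration under stated assumptions: the function name `tidy_punctuation` is invented here, it treats ordinary spaces, tabs, and no-break spaces as the stray whitespace, and it deliberately leaves quotation marks alone because it cannot tell an opening quote from a closing one.

```python
import re

# Whitespace assumed to be "stray": space, tab, or no-break space (U+00A0).
_STRAY = "[ \\t\\u00a0]+"


def tidy_punctuation(text: str) -> str:
    """Remove stray whitespace before closing punctuation and after '('."""
    # "stage , the" -> "stage, the" and "there ) ." -> "there)."
    text = re.sub(_STRAY + r"([,.;:!?)])", r"\1", text)
    # "( yep" -> "(yep"
    text = re.sub(r"\(" + _STRAY, "(", text)
    return text


print(tidy_punctuation("a low riser stage , the spotlight ( yep , there ) ."))
```

A rule like this would also close up text where a space before punctuation is intended, such as a decimal written as " .5", so the result still deserves a quick read-through.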

  3. #3
    SitePoint Wizard DoubleDee's Avatar
    Join Date
    Aug 2010
    Location
    Arizona
    Posts
    3,529
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by AllanP View Post
    It appears that someone has used unicode characters to generate this text.
    How did they do that?

    Is that right or wrong?

    Is there something wrong with the PHP, HTML, or whatever they used to build the web page?

    Could it be an XML or database issue?


    If you copy it into notepad and try to save it, it warns you that unicode is present.
    Well, I am on a Mac and I save things in TextEdit as Unicode (UTF-8) by default.

    I don't see that error. (Maybe you are on Windows?)


    I saved two copies of this code; one in its unicode format and another in ANSI format. I then examined the hex dump of each copy and found that the unicode version has character 00 between each of the alpha characters.
    Those are pretty cool screen shots you've attached!

    How in the world did you do that?!

    You must be really smart...


    When the same text is saved in ANSI format the 00 characters disappear, except around the commas.

    I have attached the hex dump for each version for you to have a look at (check1.jpg=unicode; check2.jpg=ANSI).
    I tried saving a TextEdit file - with the text - as both "Western (Windows Latin 1)" and "Western (ASCII)", but OS X won't let me do that for some reason?
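One possible explanation, offered as a guess and not a confirmed diagnosis, is that the text contains at least one character that the chosen encoding cannot represent, in which case an editor has to refuse the save or lose data. The sketch below shows how that could be checked. The helper name `first_unencodable` is invented for illustration, and Python's `cp1252` codec is assumed to correspond to "Western (Windows Latin 1)".

```python
from typing import Optional


def first_unencodable(text: str, encoding: str) -> Optional[str]:
    """Return the first character that `encoding` cannot represent, or None."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError as err:
        # err.start is the index of the first offending character.
        return text[err.start]
    return None


# Fragment modelled on the pasted bio: a curly opening quote and a double prime.
fragment = "\u201cFunk #49\u2033"

print(repr(first_unencodable(fragment, "ascii")))
print(repr(first_unencodable(fragment, "cp1252")))
print(repr(first_unencodable(fragment, "utf-8")))
```

Curly quotes exist in Windows-1252 but not in ASCII, while a double prime character exists in neither, so a single such character would be enough to block both save options.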

    (I want to stress again, that this is the ONLY website I have ever had issues copying and pasting text from a web page into a text file. Furthermore, the text in said text files has never lost any formatting.)


    The quickest way I can see of altering the spacing is doing a global search and replace for a >space comma< combination and replacing it with
    >no-space comma<.
    That's too much work, and the text is getting screwed up in a non-standard way so that it really requires going over every sentence.

    I'm trying to trouble-shoot this so I can contact the webmaster and tell them how to fix their dumb website!



    Debbie

  4. #4
    SitePoint Addict AllanP's Avatar
    Join Date
    Sep 2010
    Location
    Australia
    Posts
    286
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Perhaps it's as simple as the person typing in the text typed in the space before the comma!

  5. #5
    SitePoint Wizard DoubleDee's Avatar
    Join Date
    Aug 2010
    Location
    Arizona
    Posts
    3,529
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by AllanP View Post
    Perhaps it's as simple as the person typing in the text typed in the space before the comma!
    No, if you look at the whole bio it is clear there is some computer issue. (No one is that retarded that they would type like that!)


    Debbie

  6. #6
    SitePoint Guru team1504's Avatar
    Join Date
    May 2010
    Location
    Okemos, Michigan, USA
    Posts
    732
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    I would say that is due to unknown characters or a different character encoding type
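To test the "unknown characters" idea, the pasted text can be scanned for anything outside plain ASCII and each hit reported with its official Unicode name. This is a rough sketch; the function name `unusual_characters` and the sample string are assumptions made for illustration.

```python
import unicodedata


def unusual_characters(text: str) -> list[tuple[int, str, str]]:
    """List (index, code point, Unicode name) for every non-ASCII character."""
    found = []
    for index, char in enumerate(text):
        if ord(char) > 127:
            # Fall back to "UNKNOWN" for characters that have no name.
            name = unicodedata.name(char, "UNKNOWN")
            found.append((index, f"U+{ord(char):04X}", name))
    return found


# Hypothetical sample with a curly quote, a double prime, and a no-break space.
pasted = "\u201cFunk #49\u2033\u00a0, and more"
for entry in unusual_characters(pasted):
    print(entry)
```

Running something like this on the real paste would show whether the gaps around the punctuation are ordinary spaces (which would not be listed) or some other character such as a no-break space.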

  7. #7
    SitePoint Wizard DoubleDee's Avatar
    Join Date
    Aug 2010
    Location
    Arizona
    Posts
    3,529
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by team1504 View Post
    I would say that is due to unknown characters or a different character encoding type
    Agreed.

    So if you were going to contact the webmaster, what IT insight would you add in?

    (Sometimes people aren't too swift. I mean who would have a production website with such sloppy errors and not even catch it?!)


    Debbie

  8. #8
    Robert Wellock silver trophybronze trophy xhtmlcoder's Avatar
    Join Date
    Apr 2002
    Location
    A Maze of Twisty Little Passages
    Posts
    6,316
    Mentioned
    60 Post(s)
    Tagged
    0 Thread(s)
    They'll possibly ignore you, since it is obvious that they either don't care or haven't noticed the fault themselves, and if it is a CMS, it might not be the original person doing the editing.

    Therefore you'd probably need a screenshot as supporting evidence, although it is quite obvious: view the page in a modern web browser or do a copy-and-paste and you'll see the slipshod grammar, formatting, etc.

  9. #9
    SitePoint Wizard DoubleDee's Avatar
    Join Date
    Aug 2010
    Location
    Arizona
    Posts
    3,529
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by xhtmlcoder View Post
    They'll possibly ignore you, since it is obvious that they either don't care or haven't noticed the fault themselves, and if it is a CMS, it might not be the original person doing the editing.

    Therefore you'd probably need a screenshot as supporting evidence, although it is quite obvious: view the page in a modern web browser or do a copy-and-paste and you'll see the slipshod grammar, formatting, etc.
    More crap on the Internet (and people miraculously with jobs creating more crap for the Internet)?!

    Well, at least it's nothing that "I" did?!


    Debbie

  10. #10
    Resident curmudgeon bronze trophy gary.turner's Avatar
    Join Date
    Jan 2009
    Location
    Dallas
    Posts
    990
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    No, nothing you did. It's in the source that way. View source, and it becomes obvious.

    Off Topic:

    Ah, Redbeard, a name from out of the past. In the early to mid eighties, he took over as program director at KTXQ, a really great album-rock FM station with high ratings here in Dallas; one that played old as well as new artists' albums, as long as they were somehow special. He quickly turned the station's genre into pabulum rock that drew mediocre ratings, with an overflow of name dropping during his own show.


    cheers,

    gary
    Anyone can build a usable website. It takes a graphic
    designer to make it slow, confusing, and painful to use.

    Simple minded html & css demos and tutorials

  11. #11
    SitePoint Wizard DoubleDee's Avatar
    Join Date
    Aug 2010
    Location
    Arizona
    Posts
    3,529
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by gary.turner View Post
    No, nothing you did. It's in the source that way. View source, and it becomes obvious.
    Yeah. I e-mailed Redbeard (the website), so we'll see if they care enough to respond (and fix it)?!


    Off Topic:

    Ah, Redbeard, a name from out of the past. In the early to mid eighties, he took over as program director at KTXQ, a really great album-rock FM station with high ratings here in Dallas; one that played old as well as new artists' albums, as long as they were somehow special. He quickly turned the station's genre into pabulum rock that drew mediocre ratings, with an overflow of name dropping during his own show.


    Yeah, I heard about that. Seems like in this modern world that whenever a DJ does something that makes sense and people like, the execs yank it off.

    Apparently "thinking music" and "diverse music" lost their appeal 20+ years ago... *sigh*

    Since Redbeard is back on the Internet, I record as much of his stuff as possible before he disappears forever!

    Rush is on this week!!!

    If you want times and places to catch the show, just PM me.

    cheers,

    gary

    Thanks,



    Debbie

  12. #12
    Resident curmudgeon bronze trophy gary.turner's Avatar
    Join Date
    Jan 2009
    Location
    Dallas
    Posts
    990
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Off Topic:

    I think you misunderstood my point. Redbeard took over a station with an excellent playlist and a number one ranking, and dragged it into mediocrity. I am not a fan.

    I was a big fan of KXOL-AM in Fort Worth in '59-60. They had an evening DJ named George Carlin. Now that was an experience.

    cheers,

    gary

