What happens to the web page's text?

I listen to the radio show “In The Studio” and I like to save the “Bio” that they post each week for that week’s show.

The weird thing is that if I select the Bio text and paste it into a text editor the text gets all messed up - particularly the periods and commas?!

Here is this week’s show…

In The Studio

And here is what the text looks like after I pasted it into TextEdit on my MacBook…

In The Studio With
James Gang-Rides Again

It was the reason I hitched a ride with friends in 1971 to some small Central Ohio college to sit in the dirt infield of the indoor track fieldhouse : to see & hear Cleveland/Akron band The James Gang on a low riser stage , the spotlight reflecting blindingly off the guitar of singer Joe Walsh . Up until then , we had heard radio ads on Akron station WHLO most weekends inviting the public to see the band at an Akron-area high school dance for 50 cents . Precious few outside the Northeast Ohio Cleveland-Youngstown-Akron triangle had purchased their first album , but their follow-up Rides Again was both a critical and popular success . Sure , radio stations then and now play “Funk #49″ ( yep , there’s a “Funk #48″ on their debut , Yer Album ) , but songs like ” Woman” ” and “The Bomber” influenced American hard rock well into the 1980s , and “Tend My Garden” , “There I Go Again” , and the melancholy “Ashes , the Rain, and I “ are all surprisingly timeless four decades later . Joe Walsh is my guest for the hour .

-Redbeard

Notice how a space gets added on each side of commas and periods, as well as parentheses and sometimes quotes.

This may seem like an insignificant thing, but it drives me CRAZY because I have to go edit all of the text each week so it is readable! :mad:

What is going on?! (I do this all the time all over the Internet and have never seen such peculiar behavior?!) :-/

Thanks,

Debbie

It appears that someone has used unicode characters to generate this text. If you copy it into notepad and try to save it it warns you that unicode is present. I saved two copies of this code; one in its unicode format and another in ANSI format. I then examined the hex dump of each copy and found that the unicode version has character 00 between each of the alpha characters. When the same text is saved in ANSI format the 00 characters disappear, except around the commas.

I have attached the hex dump for each version for you to have a look at (check1.jpg=unicode; check2.jpg=ANSI).
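For anyone who wants to reproduce the dumps without Notepad, here is a minimal Python sketch. It assumes that Notepad's "Unicode" option means UTF-16 little-endian and that "ANSI" means Windows-1252 (true on Western-locale Windows, but an assumption here). The sample string and the `hex_dump` helper are made up for illustration. It only shows where the 00 bytes between the letters come from; it does not explain the spaces around the commas.

```python
# Hypothetical sample: a short piece of the bio, spaces before punctuation included.
text = "Sure , radio"

# Assumption: Notepad "Unicode" = UTF-16 little-endian, "ANSI" = Windows-1252.
utf16 = text.encode("utf-16-le")
ansi = text.encode("cp1252")

def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)

print("UTF-16:", hex_dump(utf16))  # every ASCII character is followed by a 00 byte
print("ANSI:  ", hex_dump(ansi))   # one byte per character, no 00 bytes
```

In UTF-16 every plain ASCII character takes two bytes, and the second one is 00, so a 00 after each letter is normal for that encoding rather than a sign of damaged text.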

The quickest way I can see of altering the spacing is doing a global search and replace for a >space comma< combination and replacing it with >no-space comma<.
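If doing that by hand every week is too tedious, the same search and replace can be scripted. This is only a rough sketch, assuming the stray characters are ordinary spaces or tabs; the function name `tidy_punctuation` and the sample sentence are made up for illustration. It deliberately leaves quotation marks alone, because a space next to a quote can be correct or wrong depending on whether the quote opens or closes.

```python
import re

def tidy_punctuation(text: str) -> str:
    """Remove stray spaces next to common punctuation marks."""
    # Drop spaces or tabs that come before , . ; : ! ? or a closing parenthesis.
    text = re.sub(r"[ \t]+([,.;:!?)])", r"\1", text)
    # Drop spaces or tabs that come right after an opening parenthesis.
    text = re.sub(r"\([ \t]+", "(", text)
    return text

# Hypothetical sample written in the same style as the pasted bio.
sample = "Sure , radio stations ( yep , really ) play it . Joe Walsh is my guest for the hour ."
print(tidy_punctuation(sample))
```

One caveat: the first pattern also pulls in anything that starts with a colon or period after a space, so an emoticon or a leading ellipsis would end up glued to the previous word. It is worth reading the result over once before saving it.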

How did they do that?

Is that right or wrong?

Is there something with the PHP, HTML, whatever they used to build the web page?

Could it be an XML or database issue?

If you copy it into notepad and try to save it it warns you that unicode is present.

Well, I am on a Mac and I save things in TextEdit as Unicode (UTF-8) by default.

I don’t see that error. (Maybe you are on Windows?)

I saved two copies of this code; one in its unicode format and another in ANSI format. I then examined the hex dump of each copy and found that the unicode version has character 00 between each of the alpha characters.

Those are pretty cool screen shots you’ve attached! :smiley:

How in the world did you do that?!

You must be really smart…

When the same text is saved in ANSI format the 00 characters disappear, except around the commas.

I have attached the hex dump for each version for you to have a look at (check1.jpg=unicode; check2.jpg=ANSI).

I tried saving a TextEdit file - with the text - as both “Western (Windows Latin 1)” and “Western (ASCII)”, but OS X won’t let me do that for some reason? :-/

(I want to stress again, that this is the ONLY website I have ever had issues copying and pasting text from a web page into a text file. Furthermore, the text in said text files has never lost any formatting.)

The quickest way I can see of altering the spacing is doing a global search and replace for a >space comma< combination and replacing it with >no-space comma<.

That’s too much work, and the text is getting screwed up in a non-standard way so that it really requires going over every sentence.

I’m trying to trouble-shoot this so I can contact the webmaster and tell them how to fix their dumb website! :mad:

Debbie

Perhaps it’s as simple as the person typing in the text typed in the space before the comma!

I would say that is due to unknown characters or a different character encoding type

No, if you look at the whole bio it is clear there is some computer issue. (No one is that retarded that they would type like that!)

Debbie

Agreed.

So if you were going to contact the webmaster, what IT insight would you add in?

(Sometimes people aren’t too swift. I mean who would have a production website with such sloppy errors and not even catch it?!)

Debbie

They’ll possibly ignore you, since it is obvious that they either don’t care or haven’t noticed the fault themselves, and if it is a CMS it might not be the original person doing the editing.

Therefore you’d probably need a screenshot as supporting evidence, although the slipshod grammar and formatting are quite obvious if you view the page in a modern web browser or do a copy-and-paste.

More crap on the Internet (and people miraculously with jobs creating more crap for the Internet)?! :rolleyes:

Well, at least it’s nothing that “I” did?!

Debbie

No, nothing you did. It’s in the source that way. View source, and it becomes obvious.

Off Topic:

Ah, Redbeard, a name from out of the past. In the early to mid eighties, he took over as program director at KTXQ, a really great album-rock FM station with high ratings here in Dallas; one that played old as well as new artist’s albums, as long as they were somehow special. He quickly turned the station’s genre into pabulum rock that drew mediocre ratings, with an overflow of name dropping during his own show.

cheers,

gary

Yeah. I e-mailed Redbeard (the website), so we’ll see if they care enough to respond (and fix it)?! :-/

[ot]

Off Topic:

Ah, Redbeard, a name from out of the past. In the early to mid eighties, he took over as program director at KTXQ, a really great album-rock FM station with high ratings here in Dallas; one that played old as well as new artist’s albums, as long as they were somehow special. He quickly turned the station’s genre into pabulum rock that drew mediocre ratings, with an overflow of name dropping during his own show.

Yeah, I heard about that. Seems like in this modern world that whenever a DJ does something that makes sense and people like, the execs yank it off.

Apparently “thinking music” and “diverse music” lost their appeal 20+ years ago… sigh

Since Redbeard is back on the Internet, I record as much of his stuff as possible before he disappears forever?! :eek:

Rush is on this week!!! :weee:

If you want the times and places to catch the show, just PM me.[/ot]

cheers,

gary

Thanks,

Debbie

[ot]I think you misunderstood my point. Redbeard took over a station with an excellent playlist and a number one ranking, and dragged it into mediocrity. I am not a fan.

I was a big fan of KXOL-AM in Fort Worth in '59-60. They had an evening DJ named George Carlin. Now that was an experience.[/ot]
cheers,

gary

Off Topic:

Well, I like his interviews of Classic Rock artists. Sorry you don’t like him.

Debbie

Well, I finally figured out what was wrong with the “In The Studio” website.

Here is a response (via e-mail) from the ever-famous “Redbeard” himself…

I’m sure that’s nothing to do with computers , & everything to do with me as a “hunt-and-peck” two finger typist !
Boy , I was worried there for a day . Sorry , I never took typing & have had years to regret it . My wife will have a hoot over this !
Thanks for caring enough to write , sorry to cause you such grammatical distress .

The “real” Redbeard

That is TOO FUNNY!! :lol:

Sincerely,

Debbie

I notice one of my bosses always adds spaces between .'s and !'s (but not commas) so the news articles he writes always look a little retarded on our sites.

Nobody else seems bothered, so it’s just me being anal. But boy does it bug me!

Amen to that, sister!!! :lol:

Debbie

I’m the same … I am so used to going through and correcting my bosses’ punctuation before publishing documents, and I don’t mind that … but what irritates the wossnames out of me is when they bypass me completely and send stuff out riddled with punctuation mistakes.