I am working on a site that gets its content from a CMS that in some cases outputs an empty P tag. Since it is hard to predict when such stray tags will occur, I thought I would do a simple str_replace() on the output string before echoing it. This should have been a no-sweat task, but maybe I am just cold and tired at the moment.
$output=str_replace("<p> </p>","",$output); fails to find any matches. At first I thought that my needle was wrong, but I also tried "<p></p>", and I looked at the output source code for the page and copied "<p> </p>" directly from it.
Testing to see that I didn't make a typo somewhere else in the code:
$output=str_replace("e","xx",$output); changes EVERY "e" in the output to "xx"…
The only thing I can figure is that the string I am looking for is "<p> </p>" and that blank space is doing something to my search…
But I can't figure out how to fix that in a str_replace.
Any suggestions would be greatly appreciated.
I suspect you're assuming that the opening and closing P tags have nothing between them. If they do have anything between them, str_replace will FAIL.
It will only find EXACTLY what you tell it. It will not look for <p> </p> and ignore everything in between, so <p> </p> is NOT the same as <p>some content</p> and the search will fail.
You have 2 options:
- regular expressions - to use pattern matching and replacing
- Two separate str_replace operations - one to replace <p> and one to replace </p>.
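The regular-expression option could look something like the sketch below. It is only an illustration, not something posted in the thread: the function name strip_empty_paragraphs is made up, and the pattern assumes the "empty" paragraphs contain nothing but ASCII whitespace, the &nbsp; entity, or a UTF-8 encoded non-breaking space. If the CMS emits a different filler character, the pattern would need adjusting.

```php
<?php
// Hypothetical helper (not from the thread): removes <p> elements that
// contain nothing but whitespace-like filler.
function strip_empty_paragraphs(string $html): string
{
    // Between the tags, allow any run (including none) of:
    //   \s         ASCII whitespace
    //   &nbsp;     the HTML entity for a non-breaking space
    //   \xC2\xA0   the UTF-8 bytes of a non-breaking space (assumed encoding)
    $pattern = '~<p>(?:\s|&nbsp;|\xC2\xA0)*</p>~i';

    $result = preg_replace($pattern, '', $html);

    // preg_replace() returns null on error; fall back to the untouched input.
    return $result ?? $html;
}

$output = "<p>Keep me</p><p> </p><p>&nbsp;</p><p>\xC2\xA0</p><p></p>";
echo strip_empty_paragraphs($output); // prints: <p>Keep me</p>
```

Unlike two separate str_replace calls, this leaves paragraphs that have real content alone, because the pattern only matches when everything between the tags is filler.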
Well, looking at the HTML output, the empty tags are this: "<p> </p>", so if it's empty it's a blank or white space between them. I copied my needle string EXACTLY from the HTML output, which is why this is baffling me…
I am afraid that replacing the <p> and </p> separately will also remove <p> tags that have content…
You are right about str_replace finding EXACTLY what it's told to look for. If I go into the text editor and mark all the blank Ps with a dash (<p>-</p>), they are removed. So my guess is I am looking at a blank character that is counted as content, but that shows up as a single space (" ") in the source code…
Ok, try another angle of attack… use single quotes in your str_replace for '<p> </p>' instead of doubles. It's possible PHP is being a bit odd with doubles and ignoring the space.
Thanks for your help. I tried your suggestion but it didn't work. I could not see why it, or anything else I have tried, would not work… so I did a wild experiment… str_replace(" "…)
Of course everything in the page became one BIG LONG unbroken word… but looking at the source code… guess what wasn't changed! The "<p> </p>" tags. They stuck out like sore thumbs. My conclusion is that TinyMCE is putting some character other than " " between those <p> tags.
Now I just have to hunt down exactly what that character is.
Load the file into a hex viewer to see exactly what's in there. If the content of the <p> tag was imported from a UTF file you could have UTF BOMs in there.
How are you copying your HTML output? If you are viewing the source using your browser, be aware that some browsers will rewrite the ACTUAL HTML content when you view source - most notably Firefox.
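If a hex viewer isn't handy, a similar check can be done from PHP itself with bin2hex(). The following is a rough sketch only; the helper name hex_of_short_paragraphs and the 8-byte cut-off are assumptions made for the example.

```php
<?php
// Hypothetical helper: returns the raw bytes, as hex, found inside every
// short <p>...</p> pair so invisible characters become visible.
function hex_of_short_paragraphs(string $html, int $maxBytes = 8): array
{
    $hex = [];
    // [^<]{0,N} keeps each match inside one tag pair. No /u modifier is
    // used, so the pattern works on raw bytes rather than decoded characters.
    $pattern = '~<p>([^<]{0,' . $maxBytes . '})</p>~i';
    if (preg_match_all($pattern, $html, $matches)) {
        foreach ($matches[1] as $inner) {
            $hex[] = bin2hex($inner);
        }
    }
    return $hex;
}

// Sample input: a plain space, a UTF-8 non-breaking space, and real content.
print_r(hex_of_short_paragraphs("<p> </p><p>\xC2\xA0</p><p>Long enough text</p>"));
// The first two paragraphs are reported as "20" and "c2a0";
// the third is longer than 8 bytes and is skipped.
```

A result of 20 is an ordinary space. Anything else (for instance c2a0 for a UTF-8 non-breaking space, or efbbbf for a UTF-8 BOM) means the needle has to contain those same bytes.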
At first I copied the output from TinyMCE (on FF), then, when that didn't work, from the FF view source… so I will try other browsers. Still, what's odd is that when I did a str_replace for simply " " it affected all of the other " " on the page EXCEPT the one in the blank Ps, so I am still wondering if some other character out there LOOKS like a single " " but is not…
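Several characters do render like a single space (or like nothing at all) in a source view. As a rough way to test that suspicion, the sketch below counts a few well-known look-alikes in a string. The table name SPACE_LOOKALIKES and the function find_lookalikes are invented for this example, the list is far from complete, and it assumes the page is UTF-8 encoded.

```php
<?php
// Hypothetical lookup table: byte sequences that look like a space (or are
// invisible) when viewed in a browser. UTF-8 encoding is assumed.
const SPACE_LOOKALIKES = [
    "\xC2\xA0"     => 'NO-BREAK SPACE (U+00A0)',
    "\xE2\x80\x82" => 'EN SPACE (U+2002)',
    "\xE2\x80\x83" => 'EM SPACE (U+2003)',
    "\xE2\x80\x89" => 'THIN SPACE (U+2009)',
    "\xE2\x80\x8B" => 'ZERO WIDTH SPACE (U+200B)',
    "\xEF\xBB\xBF" => 'BYTE ORDER MARK (U+FEFF)',
    '&nbsp;'       => 'HTML entity &nbsp;',
];

// Returns [name => number of occurrences] for each look-alike that is present.
function find_lookalikes(string $text): array
{
    $found = [];
    foreach (SPACE_LOOKALIKES as $bytes => $name) {
        $count = substr_count($text, $bytes);
        if ($count > 0) {
            $found[$name] = $count;
        }
    }
    return $found;
}

print_r(find_lookalikes("<p>\xC2\xA0</p><p>real text</p>"));
// Reports one NO-BREAK SPACE (U+00A0) and nothing else.
```

Once the character is known, it can go straight into the needle using a double-quoted string with byte escapes, for example str_replace("<p>\xC2\xA0</p>", '', $output) if it does turn out to be a UTF-8 non-breaking space.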