Thread: regexp question
-
Apr 15, 2009, 06:26 #1
- Join Date
- Jan 2005
- Location
- blahblahblah
- Posts
- 1,447
regexp question
Hi,
I'd like to do the following: split ("explode") an HTML document on every occurrence of a given tag. If [tag] appears 12 times, the array should contain 12 elements.
Here's what I'm trying to do:
PHP Code:
$blocks = preg_split("/<$tag(.)+>/", $html);
Regards,
-jj.
-
Apr 15, 2009, 06:29 #2
-
Apr 15, 2009, 06:35 #3
- Join Date
- Jan 2005
- Location
- blahblahblah
- Posts
- 1,447
It doesn't take into account that there may be attributes such as 'class="blah"' and 'id="foo"' within the HTML tag.
-
Apr 15, 2009, 06:54 #4
- Join Date
- May 2006
- Location
- Lancaster University, UK
- Posts
- 7,062
Have you tried:
PHP Code:
$blocks = preg_split("/<{$tag}([^>]*)>/", $html);
Jake Arkinstall
"Sometimes you don't need to reinvent the wheel;
Sometimes it's enough to make that wheel more rounded"-Molona
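For anyone comparing the two patterns, here is a minimal sketch of how the `[^>]*` version behaves. The original `(.)+` is greedy, so on a single line it can run from the first tag through to the last `>`, whereas `[^>]*` stops at the end of the opening tag. The sample `$html` is made up for illustration, and the `preg_quote()` call, the `\b` boundary and the `PREG_SPLIT_NO_EMPTY` flag are additions that were not in the post above.

```php
<?php
// Hypothetical sample input; $tag and $html are the variable names used in the thread.
$tag  = 'div';
$html = '<div class="blah">One</div><div id="foo">Two</div>';

// preg_quote() escapes any regex metacharacters in the tag name; \b stops
// "div" from also matching a longer tag name that merely starts with "div".
$pattern = '/<' . preg_quote($tag, '/') . '\b[^>]*>/';

// PREG_SPLIT_NO_EMPTY drops the empty string that precedes the first tag.
$blocks = preg_split($pattern, $html, -1, PREG_SPLIT_NO_EMPTY);

print_r($blocks);
```

Note that the closing tags stay inside each block, because only the opening tag is used as the delimiter.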
-
Apr 15, 2009, 09:11 #5
- Join Date
- Jan 2005
- Location
- blahblahblah
- Posts
- 1,447
It didn't work...
I changed my approach a little. I'm trying to find a regular expression that could do the following (please consider this string):
HTML Code:
id="container">
-
Apr 15, 2009, 09:35 #6
- Join Date
- May 2006
- Location
- Lancaster University, UK
- Posts
- 7,062
A piece of advice that SilverBullet always gives people doing the same thing as you: if you want to find elements etc. in HTML or XML, utilise the DOM!
I really need to delve into programming with the DOM sometime soon...
-
Apr 15, 2009, 09:42 #7
- Join Date
- Apr 2008
- Location
- North-East, UK.
- Posts
- 6,111
*Runs in, sticks chest out and looks to the sky in a Superman-esque way*
Someone mention my name and DOM?
@AnthonySterling: I'm a PHP developer, a consultant for oopnorth.com and the organiser of @phpne, a PHP User Group covering the North-East of England.
-
Apr 15, 2009, 09:50 #8
- Join Date
- Jul 2008
- Posts
- 5,757
rofl
-
Apr 15, 2009, 09:55 #9
- Join Date
- Apr 2008
- Location
- North-East, UK.
- Posts
- 6,111
PHP Code:
<?php
// Sample document: <tag> elements with varying attributes, some of them nested.
$sSomeDocument = '
<rootNode>
    <tag class="" attrib="">One</tag>
    <tag attrib="">Two</tag>
    <tag id="" attrib="">Three</tag>
    <tag id="" class="" attrib="">Four</tag>
    <tag>Five</tag>
    <nested>
        <tag>Six</tag>
        <tag id="" class="" attrib="">Seven</tag>
        <tag id="" attrib="">Eight</tag>
        <tag id="" class="" attrib="">Nine</tag>
        <tag id="">Ten</tag>
    </nested>
</rootNode>
';

// Parse the string as XML, then visit every <tag> element in document order,
// however deeply it is nested and whatever attributes it carries.
$oDOM = new DOMDocument();
$oDOM->loadXML($sSomeDocument);
foreach ($oDOM->getElementsByTagName('tag') as $oNode)
{
    echo $oNode->nodeValue . '<br />';
}
/*
One<br />
Two<br />
Three<br />
Four<br />
Five<br />
Six<br />
Seven<br />
Eight<br />
Nine<br />
Ten<br />
*/
?>
-
Apr 16, 2009, 05:31 #10
- Join Date
- Jan 2005
- Location
- blahblahblah
- Posts
- 1,447
Cool! Sounds great.
However, I get this error:
Warning: DOMDocument::loadHTML() [function.DOMDocument-loadHTML]: Input is not proper UTF-8, indicate encoding !
-
Apr 16, 2009, 08:13 #11
- Join Date
- Apr 2008
- Location
- North-East, UK.
- Posts
- 6,111
Do you have any incorrectly encoded characters? Euro sign, Pound sign etc...?
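A rough sketch of one way to check for such characters and convert them before the string reaches the DOM. It assumes the mbstring extension is available and that the badly encoded input is really ISO-8859-1; both are assumptions, and `ensureUtf8()` is a hypothetical helper name, not something from this thread.

```php
<?php
// Hypothetical helper, not from the thread: returns $html as valid UTF-8.
// $assumedSource is only a guess at the original encoding; substitute the real one.
function ensureUtf8($html, $assumedSource = 'ISO-8859-1')
{
    if (mb_check_encoding($html, 'UTF-8')) {
        return $html; // already valid UTF-8, leave it untouched
    }
    return mb_convert_encoding($html, 'UTF-8', $assumedSource);
}

// A pound sign as the single ISO-8859-1 byte 0xA3, which is not valid UTF-8 on its own.
$html = ensureUtf8("<p>\xA3 sign</p>");

// After conversion the pound sign is the two-byte UTF-8 sequence 0xC2 0xA3.
echo bin2hex($html);
```

If the real source encoding is something else (Windows-1252 is common for Euro signs), the second argument would need to change accordingly.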
-
Apr 17, 2009, 03:30 #12
- Join Date
- Jan 2005
- Location
- blahblahblah
- Posts
- 1,447
I had. I have now fixed it by making sure my $html string is UTF-8 encoded.
However, I am now facing another problem. Please consider the following html code:
PHP Code:
<div id="some-id">
<div>
Welcome
<div>
</div>
Also, if I call print_r($oDOM->getElementsByTagName('div')), no array is displayed. How could I get one?
Regards,
-jj.
Last edited by jjshell; Apr 17, 2009 at 04:16.
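On the print_r() part of the question, a possible approach, offered as a sketch rather than the definitive answer: getElementsByTagName() returns a DOMNodeList object, not an array, so print_r() tends to show little that is useful. Copying the nodes into a plain array first should help. The sample markup below is a made-up, well-formed stand-in, not the snippet from the post, and `$aNodes` and `$aDivs` are hypothetical names.

```php
<?php
$oDOM = new DOMDocument();
// Hypothetical well-formed sample, not the markup from the post above.
$oDOM->loadXML('<root><div id="some-id">Welcome</div><div>Bye</div></root>');

// A DOMNodeList is traversable but is not an array, so copy the nodes into one.
$aNodes = iterator_to_array($oDOM->getElementsByTagName('div'));

// Reduce each DOMElement to plain data that print_r() can display.
$aDivs = array();
foreach ($aNodes as $oNode) {
    $aDivs[] = array(
        'id'   => $oNode->getAttribute('id'), // empty string when the attribute is absent
        'text' => $oNode->nodeValue,
    );
}

print_r($aDivs);
```

Which fields to copy out of each node (here the id attribute and the text content) is a design choice; anything else the DOMElement exposes could be added in the same loop.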