Please ignore the fact that the output is not an actual HTML page (no <html>, <body> etc.) and that the actual coding is pretty shoddy; this is a night's worth of having a shot at it without using regex ...
it only grabs whatever is in between each < and > and outputs each one on a new line, indicating whether it is an opening or closing tag
known bugs:
- anything not inside <> does not get displayed at all
- tags that don't need a closing tag are listed as opening tags regardless
- closing tags that have whitespace between the < and / are listed as opening tags (something I'll need a regexp for..?)
- if there is a stray < or >, then it may falter and show everything up to the next > as part of a tag
==> a possible workaround/fix would be to keep a list of recognised tags and display unknown code under a "code" heading
- not really a bug, but the attributes of the tags are included with the tag name
- this isn't really a bug either, but some of the variable names aren't particularly descriptive - I was just trying to bash something out quickly.. comments were added afterwards
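A rough sketch of how the closing-tag whitespace, no-closing-tag and attribute items in the list above might be handled without regex. The function name classify_tag and the $void_tags list are made up for illustration and are not part of the script further down; the list of tags is an assumption and is not complete.

```php
<?php
// Hypothetical helper: classifies the text found between one '<' and the next '>'.
// Returns array('marker' => 'O'|'C'|'S', 'name' => tag name, 'attributes' => the rest).
function classify_tag($inner)
{
    // Assumed (incomplete) list of tags that never take a closing tag.
    $void_tags = array('area', 'base', 'br', 'col', 'hr', 'img', 'input', 'link', 'meta', 'param');

    // Drop whitespace after the '<' so that "< /p" and "</p" are treated alike.
    $inner = ltrim($inner);

    // A leading '/' marks a closing tag; remove it before reading the name.
    $is_closing = (substr($inner, 0, 1) === '/');
    if ($is_closing) {
        $inner = ltrim(substr($inner, 1));
    }

    // The tag name runs up to the first whitespace character or '/'.
    $name_length = strcspn($inner, " \t\r\n/");
    $name = strtolower(substr($inner, 0, $name_length));

    // Whatever follows the name is treated as the attribute string.
    $rest = trim(substr($inner, $name_length));

    // A trailing '/' (as in "br /") marks a self-closing tag.
    $has_trailing_slash = (substr($rest, -1) === '/');
    if ($has_trailing_slash) {
        $rest = rtrim(substr($rest, 0, -1));
    }

    if ($is_closing) {
        $marker = 'C';
    } elseif ($has_trailing_slash || in_array($name, $void_tags)) {
        $marker = 'S';
    } else {
        $marker = 'O';
    }

    return array('marker' => $marker, 'name' => $name, 'attributes' => $rest);
}
```

Each chunk between a < and a > would be passed through this before being echoed. Comments and doctype declarations are not treated specially here and would come out as opening tags.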
there are a few other problems that I can't remember offhand, but does it give you an idea of what I want to do?
and this is swinging between being a "training exercise", as you put it, and a potential implementation.. a contact of mine set up a (worldwide) team to do this approx. 2 months ago, but other commitments got in the way and we've since disbanded. I want to continue with it, though ...
... so here i am =]
anyways.. this is what I've got so far .. needs a lot of work ..
oh, and as for PHP 5, I don't know if I'll be able to ... I chose badly when choosing a reseller account, and I'm just contacting them now to request an upgrade, quoting from php.net
we'll see..
regards,
kwah
PHP Code:
<?php
// the location of the file being edited
$source = file_get_contents("./source.html");
// stop early if the file could not be read
if ($source === false) {
    die("Could not read ./source.html");
}
// un-needed but will be useful in future when applying char encoding etc
$mystring = $source;
// used when displaying the content on page
$mystring2 = htmlentities($mystring, ENT_QUOTES);
// display the source code being edited - converted to entities so it is safe to show
echo "<pre>",$mystring2,"</pre><br><br>\n\n";
// search terms .. may be added to later..
$findme = '<';
$findme2 = '>';
echo "<br><br>\n\n";
echo "The text between these tags is:<br>\n";
// indicator of what type of tag is being used - opening, closing, self-closing or n/a
$marker_opening = "O";
$marker_closing = "C";
//// TODO: add checks for if it is a self-closing tag (img for example)
$marker_self = "S";
//// .... or doesn't need one (text for example)
$marker_na = "N";
$i = 0;
$tyui = 1;
// count number of <'s as guide to number of tags to expect
$findme_count = substr_count($mystring,$findme);
$max = $findme_count;
// explode the source into sections based on < (done once, outside the loop)
//// NOTE: the explode 'term' / delimiter is removed
//// NOTE: $content[0] is whatever came before the first <, so counting starts at 1
$content = explode($findme,$mystring);
// loop --> search for a <, search for the next > and display whatever is inbetween
// (a while loop rather than do-while, so a source with no < at all prints nothing)
while ($tyui<=$max) {
    echo "<br>\n\n";
    //////////////////// currently unused ///////////////////////
    // search for the position of the next (first) <
    $pos = strpos($mystring, $findme,$i);
    // search for the position of the next (first) >, if a < was found
    $pos2 = ($pos === false) ? false : strpos($mystring, $findme2, $pos);
    ////////////////////////// end //////////////////////////////
    // explode the current section into further breakdowns based on >
    $content2 = explode($findme2,$content[$tyui]);
    //// TODO: show instances of text outside of a tag as 'Text'
    //// TODO: list included attributes beneath each tag name
    // displayed snippets will be surrounded by <pre></pre> tags
    echo "<pre>";
    // check if the 'grabbed' code is a closing tag - ie, if it contains a / directly after the < then it will be considered as a closing tag
    //// TODO: use regex (?) to force it to work regardless of whitespace
    $isclosingtagcheck=substr($content2[0], 0, 1);
    // perform v.simple check and display the coded version accordingly
    // (output is converted to entities, same as the source listing above)
    if($isclosingtagcheck=='/'){
        // if it is a closing tag, start from char # 1 (counting starts @ 0)
        echo $marker_closing,":",htmlentities(substr($content2[0], 1), ENT_QUOTES);
    }else{
        echo $marker_opening,":",htmlentities($content2[0], ENT_QUOTES);
    }
    echo "</pre>";
    // make the next loop start checking where the previous tag supposedly finished
    // (only when a > was actually found, otherwise $i would reset to the start)
    if ($pos2 !== false) {
        $i=$pos2;
    }
    // use $tyui as a counter, comparing the number of loops done with the expected total of tags
    $tyui++;
}
?>
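For the "show instances of text outside of a tag" TODO, one possible direction is to walk the string with strpos offsets instead of explode, so that the text between tags is kept in order. This is only a sketch: split_tags_and_text is a made-up name, and treating a < with no matching > as plain text is an assumption about how stray brackets should be handled.

```php
<?php
// Hypothetical helper: splits $html into an ordered list of pieces.
// Each piece is array('type' => 'text' or 'tag', 'value' => string);
// for tags, 'value' is what sits between the < and the >.
function split_tags_and_text($html)
{
    $pieces = array();
    $offset = 0;
    $length = strlen($html);

    while ($offset < $length) {
        $open = strpos($html, '<', $offset);
        if ($open === false) {
            // no more tags: everything that is left is text
            $pieces[] = array('type' => 'text', 'value' => substr($html, $offset));
            break;
        }
        if ($open > $offset) {
            // text sitting between the previous tag and this one
            $pieces[] = array('type' => 'text', 'value' => substr($html, $offset, $open - $offset));
        }
        $close = strpos($html, '>', $open + 1);
        if ($close === false) {
            // stray < with no > after it: assume the remainder is text
            $pieces[] = array('type' => 'text', 'value' => substr($html, $open));
            break;
        }
        $pieces[] = array('type' => 'tag', 'value' => substr($html, $open + 1, $close - $open - 1));
        // carry on searching just after the >
        $offset = $close + 1;
    }

    return $pieces;
}
```

Looping over the result and echoing each value through htmlentities would give the same kind of listing as the script above, with the text pieces included. Whitespace between tags comes back as text pieces too, so those may need filtering out.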