Getting Tidy to work in PHP 4.3.x
Just a quick note for those trying to get Tidy to work in PHP 4.3.x.
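1. Get the libtidy source from http://tidy.sourceforge.net/src/tidy_src.tgz and unpack it:
tar -zvxf tidy_src.tgz
libtidy has to be built and installed before the PHP extension can be compiled against it.
2. Tidy is currently available for PHP 4.3.x and PHP 5 as a PECL extension from http://pecl.php.net/package/tidy. You can download it directly from http://pecl.php.net/get/tidy-1.0.tgz. Run the following commands to unpack and install it (run the second one inside the unpacked directory; if there is no configure script there yet, run phpize first):
tar -zvxf tidy-xxx.tar
./configure && make && make install
3. Then add the line that loads the extension to your php.ini file (for a shared build on Unix this is normally extension=tidy.so) and restart Apache.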
You should now see a Tidy section in the output of phpinfo().
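If you would rather check from a script than scan the phpinfo() page, something like the following sketch should do. The helper name have_tidy is made up for this example; extension_loaded() and function_exists() are standard PHP functions.

```php
<?php
// Hypothetical helper: true when the Tidy extension is loaded
// and its functions are callable in this PHP build.
function have_tidy()
{
    return extension_loaded('tidy') && function_exists('tidy_parse_string');
}

echo have_tidy() ? "Tidy is loaded\n" : "Tidy is NOT loaded\n";
```

If this reports that Tidy is not loaded, the usual suspects are a php.ini that was not the one Apache reads, or a missing restart.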
Functions such as tidy_get_html, which return a TidyNode object that you can traverse, are not available in PHP 4.3.x, only in PHP 5.
Since tidy_node is only available in PHP >= 5.0.0, the only useful functions I can see are:
$html = tidy_get_output();
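As a rough sketch of how the string-based functions fit together: parse the markup, run clean-and-repair, then read the result back with tidy_get_output(). The function name clean_html is made up for this example. The PHP 4.3.x branch assumes the tidy 1.0 calling style, where the functions work on a single implicit document and take no document argument; the other branch uses the PHP 5+ style, where the parsed document is passed around explicitly. Treat the 4.3.x branch as an assumption to check against your installed version.

```php
<?php
// Hypothetical helper: returns a repaired copy of $html,
// or $html unchanged when the Tidy extension is not loaded.
function clean_html($html)
{
    if (!extension_loaded('tidy')) {
        return $html;
    }

    if (version_compare(PHP_VERSION, '5.0.0', '<')) {
        // Assumed tidy 1.0 style (PHP 4.3.x): one implicit document.
        tidy_parse_string($html);
        tidy_clean_repair();
        return tidy_get_output();
    }

    // PHP 5+ style: the parsed document is an explicit value.
    $doc = tidy_parse_string($html);
    tidy_clean_repair($doc);
    return tidy_get_output($doc);
}

echo clean_html('<p>unclosed paragraph');
```

With default settings Tidy returns a complete document (doctype, head and body included), not just the fragment you passed in.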
The reason for this post is really just a suggestion: it might be a good idea to clean up all the HTML with Tidy before running a parser over it, especially if you are building a DOM-like tree using a SAX parser.
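Here is a minimal sketch of that idea, written against the PHP 5+ form of the Tidy functions (on PHP 4.3.x the calls would need adapting). It asks Tidy for XHTML with numeric entities so that the XML parser does not trip over tag soup or named entities like &nbsp;, then feeds the result to PHP's SAX-style XML parser. The names on_start, on_end and tags_after_tidy are made up for this example, and collecting tag names stands in for whatever tree building you actually do.

```php
<?php
// Element names seen by the SAX start handler, in document order.
$GLOBALS['seen_tags'] = array();

function on_start($parser, $name, $attrs)
{
    $GLOBALS['seen_tags'][] = $name;
}

function on_end($parser, $name)
{
    // A real tree builder would pop its element stack here.
}

// Hypothetical helper: tidy $html into XHTML, then SAX-parse it.
// Returns the list of element names, or false if parsing failed.
function tags_after_tidy($html)
{
    $config = array('output-xhtml' => true, 'numeric-entities' => true);
    $doc = tidy_parse_string($html, $config, 'utf8');
    tidy_clean_repair($doc);
    $xhtml = tidy_get_output($doc);

    $GLOBALS['seen_tags'] = array();
    $parser = xml_parser_create();
    // Keep tag names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler($parser, 'on_start', 'on_end');
    $ok = xml_parse($parser, $xhtml, true);

    return $ok ? $GLOBALS['seen_tags'] : false;
}
```

Calling tags_after_tidy('<p>a<br>b') on input like this, which a strict XML parser would reject as it stands, should give you a list of element names instead of a parse error.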