I'd like to parse the DTDs of the various HTML versions. The idea is to parse each DTD into a graph (nodes and edges) so that I can use its contents programmatically.
As I understand it, a DTD can be written for XML or for SGML, and SGML is a superset of XML. HTML's DTDs are SGML. XHTML's DTDs may be XML rather than SGML, but I'm not sure. Either way it makes no difference for HTML itself: since its DTDs are SGML, an XML parser isn't going to cut it. That's a shame, because I'm fairly sure there are plenty of XML parsers written in PHP already available. But what about an SGML parser written in PHP? Does such a thing already exist, and is it available for free?
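
To make the "graph" idea a bit more concrete, here is a rough sketch of the kind of structure meant above: each element is a node, and its edges point to the elements its content model allows directly. It is written in Python rather than PHP, purely as an illustration, and it is not a real SGML parser. The names `dtd_to_graph`, `expand`, `names` and `SAMPLE` are made up for this example, and the sketch assumes a DTD that only uses double-quoted parameter entities, `<!ELEMENT ...>` declarations, `-- ... --` comments and `-(...)` exclusions. Attribute lists, marked sections and external entities are ignored.

```python
import re

# Regexes for the small subset of DTD syntax this sketch understands.
COMMENT_RE = re.compile(r"--.*?--", re.S)
ENTITY_RE = re.compile(r'<!ENTITY\s+%\s+([\w.-]+)\s+"([^"]*)"\s*>')
ELEMENT_RE = re.compile(
    r"<!ELEMENT\s+(\([^)]*\)|[\w.-]+)\s+(?:[-Oo]\s+[-Oo]\s+)?(.*?)>", re.S)
EXCLUSION_RE = re.compile(r"-\s*\(([^)]*)\)")
TOKEN_RE = re.compile(r"#?[A-Za-z][\w.-]*")
KEYWORDS = {"EMPTY", "ANY", "CDATA", "RCDATA"}


def names(text):
    """Return the element names in a fragment, skipping #PCDATA and keywords."""
    return {t for t in TOKEN_RE.findall(text)
            if not t.startswith("#") and t not in KEYWORDS}


def expand(text, entities, max_passes=20):
    """Replace %name; references repeatedly until nothing changes."""
    for _ in range(max_passes):
        expanded = re.sub(r"%([\w.-]+);",
                          lambda m: entities.get(m.group(1), ""), text)
        if expanded == text:
            break
        text = expanded
    return text


def dtd_to_graph(dtd):
    """Map each declared element to a sorted list of directly allowed children."""
    dtd = COMMENT_RE.sub("", dtd)
    entities = {}
    for name, value in ENTITY_RE.findall(dtd):
        entities.setdefault(name, value)  # the first declaration wins
    dtd = expand(dtd, entities)
    graph = {}
    for target, model in ELEMENT_RE.findall(dtd):
        excluded = set()
        for group in EXCLUSION_RE.findall(model):
            excluded |= names(group)
        children = names(EXCLUSION_RE.sub("", model)) - excluded
        for element in names(target):  # handles name groups like (EM|A)
            graph[element] = sorted(children)
    return graph


# A tiny HTML-like fragment, invented for this example.
SAMPLE = """
<!-- tiny HTML-like fragment -->
<!ENTITY % inline "#PCDATA | EM | A">
<!ELEMENT EM - - (%inline;)*>
<!ELEMENT A - - (%inline;)* -(A) -- no nested anchors -->
<!ELEMENT P - O (%inline;)*>
<!ELEMENT BR - O EMPTY>
<!ELEMENT BODY O O (P | BR)+>
"""

if __name__ == "__main__":
    for element, children in dtd_to_graph(SAMPLE).items():
        print(element, "->", children)
```

With the sample fragment, `A` ends up with `EM` as its only child, because the exclusion removes `A` itself, and `BR` has no children. Note that in SGML an exclusion applies to all descendants, not just direct children, which this sketch does not model. A regex pass like this also discards occurrence indicators and ordering, so a proper parser would still be needed for anything beyond "which element may contain which".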