  1. #1
    ********* Articles ArticleBot's Avatar
    Join Date
    Apr 2001
    Posts
    1
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    A Really, Really, Really Good Introduction to XML

    This is a discussion thread for the SitePoint article, "A Really, Really, Really Good Introduction to XML".

  2. #2
    SitePoint Wizard Young Twig's Avatar
    Join Date
    Dec 2003
    Location
    Albany, New York
    Posts
    1,355
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    That Dreamweaver image is a bit screwed up. ;)

  3. #3
    gingham dress, army boots... silver trophy redux's Avatar
    Join Date
    Apr 2002
    Location
    Salford / Manchester / UK
    Posts
    4,838
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    as good as the introduction may be...what's with the self-congratulatory title of the article?

  4. #4
    Ned Collyer
    SitePoint Community Guest
    XML should be used for communicating data between systems, and XSLT should be used for converting any of that XML into other XML.

    I've done a lot of work on CMSs (approx 5000-6000 XSLT templated pages and sections per client) and other projects based on XML and XSLT.

    XML files used as a datastore are slow to query and clunky, and they do not scale well in a complex system. IT IS NOT A DATABASE! (I've seen so many instances of an XML pseudo-filesystem database... it's WRONG and will lead to issues.)

    XML is designed to be easy to use (i.e. between systems), and easy for a human to understand.

    Databases are designed to store and retrieve data.

    XSLT is hard to debug (during development it is relatively easy, but when something goes wrong... BUP BOW)

    Also, when using XSLT for HTML output you become really limited in the functionality you can display. Simple dynamic functionality becomes a nightmare, simple upgrades become tedious.

    (for the record I've been training people in the use of XML and XSLT and XPATH ... in the same areas as described in this article for the past 2 years. I've also used it "successfully". Perhaps I'm just burnt out, but I really feel like there is overuse of these technologies in the wrong circumstances, and underuse where it would be appropriate.)

    XPath is probably the coolest tech part of XML / XSLT / DOM! Bit expensive, but very powerful.
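    As a rough illustration of the kind of query XPath makes easy, here is a minimal Python sketch using the standard library's ElementTree, which implements only a subset of XPath 1.0. The sample document, its element and attribute names, and the `live_titles` helper are all invented for this example; they do not come from the article.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: element and attribute names are invented.
CATALOG = """
<articles>
  <article id="1" status="live"><title>XML basics</title></article>
  <article id="2" status="draft"><title>XSLT templates</title></article>
  <article id="3" status="live"><title>XPath axes</title></article>
</articles>
"""

def live_titles(xml_text):
    """Return the titles of all articles whose status attribute is 'live'."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a subset of XPath 1.0, including attribute predicates.
    return [title.text for title in root.findall("./article[@status='live']/title")]

print(live_titles(CATALOG))  # ['XML basics', 'XPath axes']
```

    Note that the whole document is parsed into memory before the path is evaluated, which is one reason this style of query gets costly on large files.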

    So... what am I saying? XML is not the answer for a "newbies" database, nor complex templated systems.

    Be very cautious about the requirements of the project you're working on before diving head first into a never-ending, nightmarish maintenance headache.

    ps, Sitepoint, please make comments textbox bigger.

  5. #5
    Janne
    SitePoint Community Guest
    Great introduction, thanks.

    I don't know if it's on purpose or accidental, but all the DTD excerpts on page 2, for example, seem to be messing up the less-than and greater-than symbols and the exclamation marks on the ELEMENT and ATTLIST lines.

    Also in Example 4.2. chapter2xhtml.xsl the excerpt ends quite suddenly with the doctype-system attribute left empty.

  6. #6
    SitePoint Wizard REMIYA's Avatar
    Join Date
    May 2005
    Posts
    1,351
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    A great XML article. Now even beginners will have no more excuses ;)

    Just one thing to point out: "XML4J has graduated. XML4J is now available as Apache's Xerces-J Parser."

  7. #7
    SitePoint Enthusiast
    Join Date
    Jul 2004
    Location
    Nottingham
    Posts
    85
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Ned Collyer
    XML should be used for ... neverending nightmarish maintenance headache.
    I can certainly sympathise with a lot of what you said Ned, but can you point out a better alternative than using xslt? I had been given the impression that xml/xsl was the way forward, not the other way round.

    In my mind I had this development system worked out:

    sql > php > xml + dtd > xsl[t] > xhtml > css > glorious output etc

    but you seem to not agree - I won't argue cos I don't know any better - but perhaps you could enlighten me as to a better system? Are we all still headed towards separation of a) business logic from presentation logic and b) the resulting content from overall presentation [in terms of web application development]?

  8. #8
    Ramesh
    SitePoint Community Guest
    It is a good start for xml. But since you started with cms as main focus. You could have explained those in more detail. It also requires minimum deviation during explanation. But it is really a good attempt. Will be useful for xml beginners.

  9. #9
    SitePoint Member
    Join Date
    Aug 2005
    Posts
    2
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    NikLP: I guess it really depends on the languages you are using, and the specific functionality of your system... and how picky your tech guys are



    Instead of:
    sql > php > xml + dtd > xsl[t] > xhtml > css > glorious output etc

    Consider:
    sql > php > php/xhtml > css > glorious output

    and to communicate with other things:
    sql > php > php/xml > atom, rss, soap > glorious communication


    Why does presentation logic have to have its own language? To enforce separation by incompatibility? I don't think so.


    this could just as easily be
    sql > .net > .net web controls/xhtml > css > glorious output

    or

    sql > hibernate > struts > tiles > css > glorious output.


    Keep the language count down.
    Keep the performance up.
    Keep the extensibility up (HAH, I know it's called extensible markup language... but this isn't the right circumstance for it).

    Looking at the examples I've given, each could just as easily contain a DTD, XML and XSLT layer.. and you could argue that you could have that 3 language layer used by lots of different technologies and languages...

    and that's kind of my point. It's good for communication between systems, and pretty restrictive for internal workings.

    As for DTDs.. I agree with them. But I would use them for getting unknown unhandled data from a 3rd party, or validating something I'm sending to a 3rd party.

    PHP became popular because it was a language inside a web page. And then people tried to be too clever without separating the logic out of those pages, which resulted in convoluted messes of crap that were hard to understand.

    Why not harness the original reason behind PHP's popularity with the new power more recent releases have given?

    Say you needed to write some inline javascript for some reason, or do something crazy with sorting, or for some reason you just need that ampersand to exist, or you needed to call on a formatting script...

    It's so much easier without that technology.



    Dunno where I'm going with this.... I'm not a professional writer and I'm procrastinating instead of working with such a convoluted XML/XSLT beast, heh

    I hope I've at least shed some light on my reasonings...

  10. #10
    SitePoint Member
    Join Date
    Aug 2005
    Posts
    2
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    btw, ikarys = Ned

  11. #11
    SitePoint Enthusiast
    Join Date
    Jul 2004
    Location
    Nottingham
    Posts
    85
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Aaaaand Ned = ? Sorry...

    I guess I kind of missed a part of my point out. If I want to develop a platform that will enable me to communicate with equal ease to both 'normal' pc users and also PDA users and the like, wouldn't I need to use XSL to do that best?

    Obviously I could trim parts of the site down using CSS, but doesn't that destroy the point, as one would still be sending the same amount of data - across a more expensive transfer medium - and then just wasting a lot of that data?

    Plus, without the XSL layer there, the presentation *logic* (HTML) is (mostly) contained within the PHP code, so if you wanted to create a bunch of shops using the same core code, you would have to ... well you couldn't, unless they all had the same exact layout as per the HTML produced in the presentation logic in the PHP files - which I understood should really only be used to write the business logic.

    Would another way around this be to use an included PHP template system? This is one of the things I am yet to fully investigate in programming. Obviously, in that instance, one would be able to use a different set of templates for each individual customer.

    It begs the question though, why consider XSL at all if people are happy without it (in instances like this). Also, given that I am aware of more than one supplier (Karova.com is a well known one) that is using systems of the nature I described in my initial post - what do they know that we don't?!

    Tell me more.

  12. #12
    SitePoint Member
    Join Date
    Aug 2005
    Posts
    2
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    If I want to develop a platform that will enable me to communicate with equal ease to both 'normal' pc users and also PDA users and the like, wouldn't I need to use XSL to do that best?
    The output is the output. PDAs, PC users, whatever, should not be exposed to the templates, or 'behind the scenes'. (Given the current state of the web + standards, I would steer clear of client-side transforms.)


    Would another way around this be to use an included PHP template system? This is one of the things I am yet to fully investigate in programming. Obviously, in that instance, one would be able to use a different set of templates for each individual customer.
    That's what I was driving at. You don't have to use XSL to template. You could use anything.

    If you notice in my high level php outline, that php is mentioned twice.

    When people talk about programming and separation, it's separation of logic. Not necessarily separation of technologies or languages. If you want to see some good and interesting implementations of MVC, check out the templating side of Ruby on Rails.
    (I haven't checked Cake for PHP but I'm under the impression it's very similar)

    What benefit does XSL bring to the table?
    I can think of... XPath (which totally rocks), but it's really only useful if you don't have control over the data (i.e. can't easily filter your data using SQL or whatever other means)... and that's about it.

    Basically, I guess the fuel to my fire is: I've been working with XSL and XML for about 6 years, I've tutored and trained people for 2... and I hate it... it feels clumsy. The clumsiest bits are always around templating HTML.

    I may just be burnt out which I did mention in my first post.

  13. #13
    SitePoint Enthusiast
    Join Date
    Jul 2004
    Location
    Nottingham
    Posts
    85
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by ikarys
    The output is the output. PDAs, PC users, whatever, should not be exposed to the templates, or 'behind the scenes'. (Given the current state of the web + standards, I would steer clear of client-side transforms.)

    That's what I was driving at. You don't have to use XSL to template. You could use anything.
    Client side transforms?? No chance! I was referring to Sablotron et al.

    If you notice in my high level php outline, that php is mentioned twice.

    When people talk about programming and separation, it's separation of logic. Not necessarily separation of technologies or languages. If you want to see some good and interesting implementations of MVC, check out the templating side of Ruby on Rails.
    (I haven't checked Cake for PHP but I'm under the impression it's very similar)
    I like the idea of Ruby on Rails, but it seems like you need to have a pretty sturdy knowledge of linux installers (that vein of stuff anyway) just to get the damn thing working. I know there's a windows installer, but I already have WAMPserver running and I'd rather sidle it up next to that in honesty. Plus, it's another language to learn, and who the hell runs hosts with Ruby installed?? (Yes, I know they're out there, but not 10 a penny exactly). Cake I can't comment on.

    Sorry if you're burnt out dude, doesn't sound like fun.
    Guess I'll get back to my umming and arring about whether to stay php or go .net...

  14. #14
    SitePoint Enthusiast konky2000's Avatar
    Join Date
    Mar 2003
    Location
    Oakland
    Posts
    71
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    This article starts out good, but it still fails to seal the deal on XML.

    Why store admin passwords in XML files rather than a MySQL table?

    For a large organization with thousands of articles why should they generate thousands of XML files to store on their web server instead of dynamically generating each article from a database?

    I understand (and always have) the advantages of XML and its self documenting structure, but it has always seemed like a very messy solution that an XML based website requires one to generate so many files.

  15. #15
    SitePoint Member
    Join Date
    Aug 2005
    Posts
    2
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    :) XSLT is another language to learn. Heh.

    re: PHP or .NET really depends on the type of clients you have to service.

    konky2000: Yep. I'm currently maintaining a system for large corporates which is exactly that (despite kicking and screaming). When data debugging has to be handled thru notepad and filesystem (that may be written to by the app at any time) you really have to start questioning the solution.

  16. #16
    SitePoint Member
    Join Date
    Nov 2005
    Posts
    6
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Conflict Resolution for Template Rules

    >Note the priority attribute on this template. Since an introductory paragraph
    >would match both XPath expressions, para and para[@type='intro'], we need to give
    >some indication as to which of the two templates should be used.

    No, you don't. See "5.5 Conflict Resolution for Template Rules" in the spec. Specifically (I removed some rules that don't apply to this situation, for clarity):

    The default priority is computed as follows:

    ...

    (*) If the pattern has the form of a QName preceded by a ChildOrAttributeAxisSpecifier or has the form processing-instruction(Literal) preceded by a ChildOrAttributeAxisSpecifier, then the priority is 0.

    ...

    (*) Otherwise, if the pattern consists of just a NodeTest preceded by a ChildOrAttributeAxisSpecifier, then the priority is -0.5.

    (*) Otherwise, the priority is 0.5.

    Thus, the most common kind of pattern (a pattern that tests for a node with a particular type and a particular expanded-name) has priority 0. The next less specific kind of pattern (a pattern that tests for a node with a particular type and an expanded-name with a particular namespace URI) has priority -0.25. Patterns less specific than this (patterns that just tests for nodes with particular types) have priority -0.5. Patterns more specific than the most common kind of pattern have priority 0.5.
    I.e. your para[@type='intro'] pattern is more specific than the vanilla para pattern, so it *already* has a higher priority.

    You are teaching people to be ignorant of the processing model and the pattern matching conflict resolution. Kinda like teaching C++ people to always do "objPtr->Parent::method();" b/c they can. Sure it works, but is unnecessary and makes refactoring/reuse difficult. Let C++ do its inheritance and let XSL do its conflict resolution. Don't encourage bad habits, especially in an intro where the reader won't know better.
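    To make the quoted rules concrete, here is a simplified Python sketch of the default-priority computation from section 5.5 of the XSLT 1.0 spec. It is an illustration under stated assumptions, not a real pattern parser: the `default_priority` function and its regexes are invented for this example, names are matched as ASCII only, and patterns containing `|` are rejected because the spec treats each alternative as a separate rule.

```python
import re

# Simplified building blocks (assumptions: ASCII names, no whitespace inside patterns).
NAME = r"[A-Za-z_][\w.\-]*"
AXIS = r"(?:@|child::|attribute::)?"   # optional ChildOrAttributeAxisSpecifier
LITERAL = r"""(?:'[^']*'|"[^"]*")"""

QNAME = re.compile(rf"^{AXIS}(?:{NAME}:)?{NAME}$")
PI_LITERAL = re.compile(rf"^{AXIS}processing-instruction\({LITERAL}\)$")
NS_WILDCARD = re.compile(rf"^{AXIS}{NAME}:\*$")
NODE_TEST = re.compile(rf"^{AXIS}(?:\*|(?:node|text|comment|processing-instruction)\(\))$")

def default_priority(pattern):
    """Return the XSLT 1.0 default priority of a single (non-union) match pattern."""
    p = pattern.strip()
    if "|" in p:
        raise ValueError("union patterns are treated as separate rules; pass one alternative")
    if QNAME.match(p) or PI_LITERAL.match(p):
        return 0.0      # e.g. para, @type
    if NS_WILDCARD.match(p):
        return -0.25    # e.g. xhtml:*
    if NODE_TEST.match(p):
        return -0.5     # e.g. *, text(), node()
    return 0.5          # anything more specific, e.g. para[@type='intro']

for pat in ("para", "para[@type='intro']"):
    print(pat, default_priority(pat))
```

    With these rules, para gets 0 and para[@type='intro'] gets 0.5, so the more specific template wins without an explicit priority attribute.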

  17. #17
    Mary Johnson
    SitePoint Community Guest
    I thought early on that I was going to be told how to fill in an "img" tag in XSLT automatically from a tag in an XML file, but you never covered this. Your functional specs showed that you would be handling images and email.

  18. #18
    Marcel Schnippe
    SitePoint Community Guest
    In the first example the last "description" tag is not closed. Instead it's </p>.

    You could do an online validation, e.g. with schnippe.net/xml2sql or numerous other tools.
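    A mismatched closing tag like the one described above can also be caught locally. The following is a minimal Python sketch using the standard library's ElementTree; the `check_well_formed` helper and both sample snippets are invented for illustration and are not the article's actual example. It checks well-formedness only and does not validate against a DTD.

```python
import xml.etree.ElementTree as ET

def check_well_formed(xml_text):
    """Return None if xml_text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None

# Hypothetical snippets, not the article's real markup.
GOOD = "<item><description>A short summary.</description></item>"
BAD = "<item><description>A short summary.</p></item>"

print(check_well_formed(GOOD))  # None
print(check_well_formed(BAD))   # an error message naming the line and column
```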

  19. #19
    SitePoint Addict
    Join Date
    Nov 2005
    Location
    Moss, Norway.
    Posts
    283
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    This thread was started and continued until yesterday in 2005. Today it is August 10, 2007.

