  1. #1
    SitePoint Wizard gold trophysilver trophy
    Join Date
    Nov 2000
    Location
    Switzerland
    Posts
    2,479

    RFC: Dynamically typed web services?

    Looking at SOAP (RPC/Encoded) and to some extent XML-RPC, I wonder if a simpler XML format could be defined that benefits dynamically typed languages, rather than the situation we have today? Encoded SOAP, in particular, suits strongly typed languages (basically because it uses XML schema) while causing problems for dynamically typed languages (the ones that actually run the Internet) where the types aren't explicitly declared.

    Taking XML-RPC as the example, my experience with it (I use it a lot at work) has been that I never use the <dateTime.iso8601/> or <base64 /> values. If there are dates or base64 encoded binary data to be exchanged, it's always done with string values.

    More to the point, I wonder if the scalar typing (<int />, <boolean />, <double />, <string />) could be thrown out completely, using simply <value /> as the only scalar. The <struct /> and <data /> elements could remain because, for an XML parser, it helps to know when a more complex tag hierarchy has to be handled.

    For dynamically typed languages I really think this wouldn't break any code and would certainly make life a lot easier when building web services implementations. In general, developers write apps based on the API description rather than waiting to see what types come back when they make a method call.

    For SOAP, where the current situation with WSDL is really holding back building SOAP servers in dynamically typed languages, we could come up with an encoding which is ideally suited to dynamically typed languages, based on the current encoding: http://schemas.xmlsoap.org/soap/encoding/, keeping the Struct and Array data types but chucking out all the rest and replacing them with XML schema anyType perhaps.

    That could help a lot to improve the situation for loosely typed languages, making WSDL a lot easier to deal with on the server side.

  2. #2
    SitePoint Wizard silver trophy Jeremy W.'s Avatar
    Join Date
    Jun 2001
    Location
    Toronto, Canada
    Posts
    9,121
    That would defeat the purposes of web services and XML completely: to allow for the uniform distribution and dissemination of data.

    It has nothing to do with the language at play (loosely vs strongly typed) but with the reality that if you are using web services, it's far better to do the work when you create the SOAP/XML than afterwards.

    J
    SVP Marketing, SoCast SRM
    Personal blog: Strategerize
    Twitter: @jeremywright

  3. #3
    Database Jedi MattR's Avatar
    Join Date
    Jan 2001
    Location
    buried in the database shell (Washington, DC)
    Posts
    1,107
    Having non-typed would basically be a non-valid XML document.

  4. #4
    SitePoint Wizard gold trophysilver trophy
    Join Date
    Nov 2000
    Location
    Switzerland
    Posts
    2,479
    Quote Originally Posted by Jeremy W.
    That would defeat the purposes of web services and XML completely: to allow for the uniform distribution and dissemination of data.
    Certainly it would make life harder for statically typed languages but they could test what a value contains as they get it and make a "best guess" as to what the type is.
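
    To make the "best guess" idea concrete, here is a minimal Python sketch of how a receiver might coerce an untyped scalar. The function name guess_scalar and the order of the checks are assumptions made for illustration; they are not part of any XML-RPC or SOAP library.

```python
def guess_scalar(text):
    """Best-guess conversion of an untyped string value.

    Hypothetical helper: tries boolean, then int, then float,
    and falls back to returning the original string unchanged.
    """
    stripped = text.strip()
    if stripped.lower() in ("true", "false"):
        return stripped.lower() == "true"
    for convert in (int, float):
        try:
            return convert(stripped)
        except ValueError:
            pass  # not this type, try the next one
    return text


values = [guess_scalar(v) for v in ("18", "1.5", "true", "Red")]
```

    Note that the guess is lossy: a value like "007" comes back as the integer 7, so any field where the exact string matters would still need the API description to say so.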

    Quote Originally Posted by Jeremy W.
    It has nothing to do with the language at play (loosely vs strongly typed) but with the reality that if you are using web services, it's far better to do the work when you create the SOAP/XML than afterwards.
    Ah, but for dynamically typed languages it's a problem generating WSDL, which is a great technology but only really accessible if you're using a code generation tool. Also, for document/literal services (which is the default for .NET, I believe) there's no type encoding in the SOAP body itself. If building WSDL became (almost) as easy as building a "sitemap", web services might become a lot more popular, taking off especially for content syndication.

    Quote Originally Posted by MattR
    Having non-typed would basically be a non-valid XML document.
    That's a fascinating theoretical point of view - guess you're right. My response is it's not non-typed but any-typed (for scalar values).

    Thing is, my practical experience with XML-RPC, Perl and PHP is that the only thing you really care about is the data hierarchy (i.e. compound types like arrays and structs [arrays of objects / associative arrays]), not scalar types.

    It's much like a database, I guess. Although you have a database schema which does define scalar types (I'm not aware of a database that does compound types well unless it's XML based), when you perform a query, you don't actually care about the types in the database. For a statically typed language you define the types as you extract the data from the query result set, e.g.

    Code:
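    // Assumes conn is an open java.sql.Connection
    Statement st = conn.createStatement();
    ResultSet rs = st.executeQuery("SELECT email FROM users");

    while (rs.next()) {
       // The type is chosen here, when the value is extracted
       String email = rs.getString("email");
       System.out.println(email + "\n");
    }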

  5. #5
    SitePoint Wizard gold trophysilver trophy
    Join Date
    Nov 2000
    Location
    Switzerland
    Posts
    2,479
    Couple of links BTW.

    YAML: http://www.yaml.org/

    YAML(tm) (rhymes with "camel") is a straightforward machine parsable data serialization format designed for human readability and interaction with scripting languages such as Perl and Python. YAML is optimized for data serialization, configuration settings, log files, Internet messaging and filtering.
    Although it's not XML it does make it possible to take the "who cares about scalar types" point of view as the example demonstrates. Like XML-RPC, unidentified scalar types default to string.

    Also Dave Winer made an interesting remark here

    He also questioned the impact the large proprietary-biased vendors will have on the development of open Web services. Pointing at IBM and Microsoft, Winer said WSDL (Web Services Description Language) was invented in such a way that it will only work in Java and .Net environments. "It can't work in a dynamic environment; it's a static interface," he said.
    Now history has proved him wrong on the "can't work" because it can. On the client side, WSDL makes it incredibly easy for dynamically typed languages. On the server side, the PEAR::SOAP solution to generating WSDL is to come up with something along the lines of IDL (interface definition language) which developers still have to manually code for their classes e.g....

    PHP Code:
    $this->__typedef['{http://soapinterop.org/xsd}SOAPStruct'] =
        array(
            'varString' => 'string',
            'varInt'    => 'int',
            'varFloat'  => 'float'
        );

    ...but it is a big improvement over rolling your own WSDL.

  6. #6
    SitePoint Wizard silver trophy Jeremy W.'s Avatar
    Join Date
    Jun 2001
    Location
    Toronto, Canada
    Posts
    9,121
    Dave Winer? I'm sorry, you're quoting Dave Winer as a reliable tech source?

    He's the Jerry Springer of development... Well, not quite because he doesn't have his own show, but he prefers to stir things up than to get things done.

    The reality is that having an extra type of web service means more work, not less. You'll have one group of "just like now" web services and another of "ooh, we don't like that way, let's try this 'new standard'" web service.

    The reality is that companies have always and will always be involved in the creation of standards. The fact that MS and IBM were involved is nothing new and in no way compromises the standard.

    What does is people making new ones of their own because they don't know how the current ones work.

    To me that's immature, irresponsible and lazy. Don't like the way it's done, so find another?



    Sorry Harry, but you lost me mate. 3 months ago you were saying that Web Services were amazing, now they're not, now it's XUL, then it's .NET is great but PHP can do it better, then it's MS is interfering... Do you ever settle down and just research things first?

    J

  7. #7
    SitePoint Wizard gold trophysilver trophy
    Join Date
    Nov 2000
    Location
    Switzerland
    Posts
    2,479
    Quote Originally Posted by Jeremy W.
    He's the Jerry Springer of development... Well, not quite because he doesn't have his own show, but he prefers to stir things up than to get things done.
    There was a great blog post somewhere about having a "Winer number" (like a Bacon number) for the degrees of separation between you and Dave, based on knowing someone "abused" by him online. So be careful what you say.

    Quote Originally Posted by Jeremy W.
    He's the Jerry Springer of development...
    Currently he's stirring things up, or being stirred up with RSS 2.0.

    But I do think he's on the nail about one thing in that article

    "It has got to start simple or it has no chance," he said. "If you don't understand [a new technology] first off, and it makes your mind go numb, you're safe to ignore it, [because] it will never work," he said.
    By "never work" at the end of that quote I read "never go mainstream". Where you have to respect Dave though is that XML-RPC is a success, thanks to the blogger APIs, while looking at the full list of services on XMethods, if that's any indication, there's basically nothing happening despite SOAP having far exceeded the kind of "presence" XML-RPC has if we talk about vendor support. UDDI, I think it's safe to say, has already been a flop.

    Dave's not the only one saying web services in their current form aren't going to fly. Edd Dumbill makes a more detailed analysis here, basically summarizing to: "web services" as is are only going to succeed inside corporate networks as a tool for integration. Most people don't even get the concept, let alone grasp the detail of SOAP and WSDL.

    Quote Originally Posted by Jeremy W.
    What does is people making new ones of their own because they don't know how the current ones work.

    To me that's immature, irresponsible and lazy.
    For dynamically typed languages it's not a question of not knowing how the current standards work. The problem is getting from source code to WSDL in a language where types are not explicitly declared. That's a real problem because if you update your code, you want the WSDL to reflect that change automatically (as is possible for .NET and Java); otherwise you'll likely have a WSDL description which either contains human error or is out of date.

    I agree that YAML is probably not a great idea simply because it's wasted effort - this can already be done with XML. But by defining a SOAP encoding geared to favor loosely typed languages, web services may gain greater adoption.

    By ditching the explicit declaration of static types, WSDL would at least get easier to generate for dynamically typed languages, the problem narrowing down to how you spot arrays and structs in your code.
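
    As a rough illustration of how small that problem becomes, the following Python sketch walks a sample return value and reports only its shape. The describe function and the "anyType" / "struct" / "array" labels are hypothetical names chosen for this example; this is not how PEAR::SOAP or any real WSDL generator works.

```python
def describe(value):
    """Report the shape of a value, ignoring scalar types.

    Hypothetical helper: dicts become structs, lists and tuples become
    arrays (described by their first element), everything else is anyType.
    """
    if isinstance(value, dict):
        return {"struct": {name: describe(v) for name, v in value.items()}}
    if isinstance(value, (list, tuple)):
        element = describe(value[0]) if value else "anyType"
        return {"array": element}
    return "anyType"


sample = {"varString": "hello", "varInt": 42, "tags": ["a", "b"]}
shape = describe(sample)
```

    The integer and the string end up with the same description, so only the struct and array nesting would have to be kept in sync with the code.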

    Quote Originally Posted by Jeremy W.
    3 months ago you were saying that Web Services were amazing, now they're not
    I'm not saying they're not. What I'm saying is if SOAP/WSDL becomes as easy to use as a database, for languages like Perl, PHP and Python, they stand a much better chance of becoming mainstream technology approaching HTML.

    Quote Originally Posted by Jeremy W.
    Now it's XUL, then it's .NET is great but PHP can do it better, then it's MS is interfering... Do you ever settle down and just research things first?
    Exactly which point do you feel was unresearched? Or perhaps I should ask what is it I've said that got your back up? If you think I don't know what I'm talking about, please call me on it specifically rather than suggesting the possibility.

    Going back to the original point - got sidetracked - it was not a question of whether SOAP / WSDL is the hobby horse of any particular company.

    I'm really interested in hearing opinions on the impact of ditching static types in XML messaging formats for dynamically typed languages. If the XML-RPC spec was minimized to the following types:

    Code:
    <!-- Scalar -->
    <value>A scalar value</value>
    
    <!-- Array -->
    <array>
        <data>
            <value>Red</value>
            <value>Blue</value>
            <value>Green</value>
        </data>
    </array>
    
    <!-- Struct -->
    <struct>
         <member>
             <name>lowerBound</name>
             <value>18</value>
         </member>
         <member>
             <name>upperBound</name>
             <value>139</value>
         </member>
    </struct>
    Is there anything, for dynamically typed languages, that will break apps? Note that this already complies with XML-RPC, <value>12345</value> defaulting to a string.
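
    For what it's worth, decoding that minimized format takes very little code. The sketch below uses Python's standard xml.etree.ElementTree; the decode function is a made-up helper for this example, and it assumes compound types are wrapped in a <value> element as in regular XML-RPC.

```python
import xml.etree.ElementTree as ET


def decode(value):
    """Decode a <value> element; scalars always come back as strings."""
    struct = value.find("struct")
    if struct is not None:
        return {
            member.findtext("name"): decode(member.find("value"))
            for member in struct.findall("member")
        }
    array = value.find("array")
    if array is not None:
        return [decode(item) for item in array.findall("data/value")]
    # No nested struct or array: treat the text as an untyped scalar
    return (value.text or "").strip()


DOC = """
<value>
  <struct>
    <member><name>lowerBound</name><value>18</value></member>
    <member><name>upperBound</name><value>139</value></member>
    <member><name>colors</name>
      <value><array><data>
        <value>Red</value><value>Blue</value><value>Green</value>
      </data></array></value>
    </member>
  </struct>
</value>
"""

result = decode(ET.fromstring(DOC))
```

    Here result is a plain dictionary whose scalar members are all strings, so the caller decides whether "18" should be used as a number.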

  8. #8
    Sultan of Ping jofa's Avatar
    Join Date
    Mar 2002
    Location
    Svj
    Posts
    4,080
    Quote Originally Posted by Jeremy W.
    ... it's far better to do the work when you create the SOAP/XML than afterwards.
    Just wanted to say that I agree

    I really like the strongly typed, self-descriptive approach, especially after an excursion I made recently into the inferno of plain evil text files, trying to dissect the lines and extract data from position x to position y etc

  9. #9
    SitePoint Wizard silver trophy Jeremy W.'s Avatar
    Join Date
    Jun 2001
    Location
    Toronto, Canada
    Posts
    9,121
    Exactly. I don't really see the major advantages of something like what Harry's proposing over a comma-delimited system. Heck, comma-delimited has whole engines built to write and parse it, and so should be easier, is more standard, etc.

  10. #10
    Sultan of Ping jofa's Avatar
    Join Date
    Mar 2002
    Location
    Svj
    Posts
    4,080
    Who said comma delimited? The text files I was talking about in my previous post looked like this:
    Code:
    
    ...
    40                  39407150000000147700        0000000045                      
    40                  39407490000000113400        0000000044                      
    40                  39407560000000115400        0000000043   
    ...
    

  11. #11
    SitePoint Wizard silver trophy Jeremy W.'s Avatar
    Join Date
    Jun 2001
    Location
    Toronto, Canada
    Posts
    9,121
    Same thing to me:

    Code:
     
    id	  name	  num
    1	   james	 12
    2	   josh	   15
    ...
    I was talking more about that format than anything. Concept of flatfiles more than CSV really

    J

  12. #12
    SitePoint Wizard gold trophysilver trophy
    Join Date
    Nov 2000
    Location
    Switzerland
    Posts
    2,479
    ...hence "RFC" in the title - just wondering if anyone has ideas for how to make things easier for dynamically typed languages. Been thinking more about this and still don't see it.

    Quote Originally Posted by Jeremy W.
    don't really see the major advantages ... over a comma-delimited system.
    XML as is (without the typing XSD gives it) is a small step from CSV - at least you won't break if your data has a comma in it, and it's easier to have flexible data hierarchies, while with CSV you have to stick to rigid columns.
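
    A small Python sketch of the comma problem, using made-up sample data. The naive join/split below stands in for hand-rolled CSV handling; a proper CSV library would quote the field, so this only illustrates the unquoted case.

```python
import xml.etree.ElementTree as ET

record = ["Smith, John", "Zurich"]  # hypothetical row with an embedded comma

# Naive comma-delimited round trip: the embedded comma adds a field
naive_csv = ",".join(record)
csv_fields = naive_csv.split(",")

# Untyped XML round trip: element boundaries keep the fields intact
row = ET.Element("row")
for field in record:
    ET.SubElement(row, "value").text = field
xml_text = ET.tostring(row, encoding="unicode")
xml_fields = [v.text for v in ET.fromstring(xml_text).findall("value")]
```

    The naive split yields three fields from a two-field record, while the XML version returns the original two values without any type information being involved.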

    Jofa;

    Who said comma delimited? The text files I was talking about in my previous post looked like this:
    That sort of format (fixed column widths) is one which Python handles beautifully. Recently I had to do something similar, formatting the output of Windows command line tools. Python is able to treat strings as sequences (like indexed arrays) and grab pieces with slices like

    Code:
    chunk = string[11:15]
    So my code goes something like;

    Code:
    class ColumnDataParser:
        def __init__(self):
            pass

        def parse(self, data):
            # Reject empty input (split() never returns an empty list)
            if not data.strip():
                raise ValueError("Empty data")
            # Split data by newlines
            data = data.split("\n")
            # Pop off the column headings
            data.pop(0)
            lines = []
            for line in data:
                lines.append(self._parseLine(line))
            return lines

        def _parseLine(self, line):
            # Parse line by column positions
            parsedLine = {
                    'Col1': line[0:21].strip(),
                    'Col2': line[22:40].strip()
                }
            return parsedLine
    Now way off topic

  13. #13
    Database Jedi MattR's Avatar
    Join Date
    Jan 2001
    Location
    buried in the database shell (Washington, DC)
    Posts
    1,107
    Quote Originally Posted by HarryF
    That's a fascinating theoretical point of view - guess you're right. My response is it's not non-typed but any-typed (for scalar values).
    I think I misspoke when I said 'non-typed' since the only way something is non-typed is if it is null; the idea is the same, though. An any-typed XML doc (i.e. one where you don't care what the values are) is an XML doc which is not valid (i.e. no schema can be applied to it).

    Perhaps you can create a schema which has a datatype of 'any', but then all it does is force the data to conform to a particular hierarchy... and I can't think of many situations in which you'd want to force the data to a hierarchy but NOT the datatype.
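
    To show what such a hierarchy-only check would amount to, here is a minimal Python sketch. The ANY marker, the conforms function and the thread/posts shape are all invented for this example and do not correspond to XML Schema syntax or to any validator.

```python
ANY = object()  # hypothetical marker: any scalar is accepted here


def conforms(value, shape):
    """Check that value matches the hierarchy in shape, ignoring scalar types."""
    if shape is ANY:
        return not isinstance(value, (dict, list))
    if isinstance(shape, dict):
        return (
            isinstance(value, dict)
            and value.keys() == shape.keys()
            and all(conforms(value[key], shape[key]) for key in shape)
        )
    if isinstance(shape, list):
        return isinstance(value, list) and all(
            conforms(item, shape[0]) for item in value
        )
    return False


thread_shape = {"thread_id": ANY, "posts": [{"author": ANY}]}

ok = conforms({"thread_id": "42", "posts": [{"author": "HarryF"}]}, thread_shape)
bad = conforms({"thread_id": ["42"], "posts": []}, thread_shape)
```

    The check accepts "42" and 42 alike for thread_id, which is exactly the trade-off being questioned: the structure is enforced but a non-numeric id would pass.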

    That is to say, if you were creating an XML document of threads, you'd know that when creating the file you'd never stuff anything *but* a numeric for thread id. As a matter of fact you'd never stuff anything *but* the thread_id in there, so it would never arise that you'd stick anything else in there.

  14. #14
    SitePoint Wizard silver trophy Karl's Avatar
    Join Date
    Jul 1999
    Location
    Derbyshire, UK
    Posts
    4,411
    Quote Originally Posted by HarryF
    Certainly it would make life harder for statically typed languages but they could test what a value contains as they get it and make a "best guess" as to what the type is.
    The whole idea of XML is to make it so that you don't have to best guess what type data is etc. XML is the data description and the data wrapped up together; it should describe what the data is so guessing doesn't have to be done. If you start bringing guesswork in then we are back where we started with CSV, tab-delimited etc. etc.

    Just my 0.02 worth.
    Karl Austin :: Profile :: KDA Web Services Ltd.
    Business Web Hosting :: Managed Dedicated Hosting
    Call 0800 542 9764 today and ask how we can help your business grow.

  15. #15
    SitePoint Wizard silver trophy Jeremy W.'s Avatar
    Join Date
    Jun 2001
    Location
    Toronto, Canada
    Posts
    9,121
    My point exactly

    J

