How strict is your dynamic language?

Consider the “big four” dynamic, procedural languages: Perl, Python, PHP and Ruby. To an extent they’re much of a muchness, offering only small variations on the same theme (ignoring PHP’s lack of support for functional-style programming). But sometimes little things make a big difference, perhaps most of all when your code is given input it wasn’t designed to handle. I knocked up a simple example to compare them in this area…

You’ve got a function which takes a hash value (an associative array) and does something with its contents – fairly typical logic for a database-driven application, where rows are common currency. For the sake of a simple example, let’s say your input is a list of names, each name broken into a hash with the keys “first” and “given”. The question is how your function will cope when the hash doesn’t have quite the structure you’re expecting (like the first name is missing), given a fairly “default” use of the language – no non-standard functionality to make the language stricter…

Perl

Tackling Perl first, here’s a working example…


#!/usr/bin/perl -w
use strict;

my $names = [
            {'first'=>'Bob','given'=>'Smith'},
            {'given'=>'Lukas'},
            {'first'=>'Mary','given'=>'Doe'},
        ];

sub printName {
    my $name = shift;
    print "Name is ".$name->{'first'}." ".$name->{'given'}."\n"
}

foreach my $name ( @{$names} ) {
    printName($name);
}

Notice the list of names – the second “row” is missing the “first” key. Running this, the output is:

Name is Bob Smith
Use of uninitialized value in concatenation (.) or string at names.pl line 12.
Name is  Lukas
Name is Mary Doe

Note the second line in the output – Perl is warning that an undefined value (the missing “first” entry) was used in a string concatenation. But, after complaining, execution continues.

PHP

The same again in PHP…


#!/usr/bin/php
<?php
$names = array(
    array('first'=>'Bob','given'=>'Smith'),
    array('given'=>'Lukas'),
    array('first'=>'Mary','given'=>'Doe'),
);

function printName($name) {
    print "Name is ".$name['first']." ".$name['given']."\n";
}

foreach ($names as $name ) {
    printName($name);
}

The output is…

Name is Bob Smith

Notice: Undefined index:  first in /home/harryf/php/names.php on line 9
Name is  Lukas
Name is Mary Doe

PHP does pretty much the same thing as Perl – complains but continues execution. Of course the error is raised as a NOTICE – many default PHP installations have this error level disabled, meaning you’d get no error message. One other interesting point – PHP is complaining about the array index being missing, rather than the value being, in some way, unusable.

Ruby

Here we go again, this time in Ruby…


#!/usr/bin/ruby
names = [
            {'first'=>'Bob','given'=>'Smith'},
            {'given'=>'Lukas'},
            {'first'=>'Mary','given'=>'Doe'},
        ];

def printName(name)
    puts("Name is "+name['first']+" "+name['given']+"\n")
end

names.each {
    |name| printName(name)
}

And the output…

Name is Bob Smith
names.rb:10:in `+': can't convert nil into String (TypeError)
    from names.rb:10:in `printName'
    from names.rb:14
    from names.rb:13

Ruby, by contrast, halts execution (raises an exception that wasn’t handled) on encountering the missing hash key. Interesting is the cause: Ruby is saying that de-referencing something that doesn’t exist creates a nil value, which you can’t just join to a string, hence the exception – a TypeError.

Python

Finally Python…


#!/usr/bin/python
names = [
            {'first':'Bob','given':'Smith'},
            {'given':'Lukas'},
            {'first':'Mary','given':'Doe'},
        ];

def printName(name):
    print("Name is "+name['first']+" "+name['given']+"\n")

for name in names:
    printName(name)


And how does Python respond?

Name is Bob Smith

Traceback (most recent call last):
  File "names.py", line 12, in ?
    printName(name)
  File "names.py", line 9, in printName
    print("Name is "+name['first']+" "+name['given']+"\n")
KeyError: 'first'

Like Ruby, Python halts execution on encountering the problem hash (dict), raising an exception. Unlike Ruby, but more like PHP, Python is complaining about the missing hash key – it raised a KeyError exception.

So what?

OK – this is a trivial example but I also think it’s illustrative of a common recurring problem and the fundamental philosophical differences between the languages.

While prototyping, you were likely using sample data you’d created yourself and it was probably “perfect” input. But when it comes to putting your code into production, and having it survive a few jumps in version number, real data starts flowing around and functions / classes start getting applied in different contexts (perhaps not by you), resulting in this kind of “unexpected input” problem.

And, yes, you should always write tests but when it comes to coarse grained APIs handling complex data structures, there’s always something you’re missing a test for. In fact I’d argue this general category of problem is one of the biggest sources of bugs in “dynamic code” – the languages encourage rapid prototyping, which means those extra 48 lines of code to make it really robust happen much later, if at all. So a language’s default behavior is significant to my mind.

So one dividing line here is whether some kind of fatal error should be raised. Perl and PHP continue execution by default while Ruby and Python halt, unless you explicitly handle the problem. Which is better?
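
To make “explicitly handle the problem” concrete, here’s a rough Python sketch of choosing to carry on, Perl / PHP style, by catching the KeyError yourself. The helper names format_name and format_all are made up for illustration and aren’t part of the examples above.

```python
names = [
    {'first': 'Bob', 'given': 'Smith'},
    {'given': 'Lukas'},
    {'first': 'Mary', 'given': 'Doe'},
]

def format_name(name):
    # Plain indexing: raises KeyError if either key is missing
    return "Name is " + name['first'] + " " + name['given']

def format_all(rows):
    """Format every row, collecting (row, missing_key) for rows that fail."""
    lines, skipped = [], []
    for row in rows:
        try:
            lines.append(format_name(row))
        except KeyError as err:
            # err.args[0] is the key that was not found
            skipped.append((row, err.args[0]))
    return lines, skipped

lines, skipped = format_all(names)
for line in lines:
    print(line)
for row, key in skipped:
    print("Skipped row, missing key:", key)
```

The point is that continuing is now a decision written down in the code, rather than something the language does for you.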

Think there are pros and cons to both. For web apps, at least where non-critical data is involved, having your code “keep on running”, albeit in “half broken” state is often a good thing – it gives you time to address the problem later without users seeing pages as being “down”. At the same time, you don’t know what the consequences of the problem are. In this example they’re trivial and Perl / PHP’s behavior results in something that’s still usable, but I’d feel a lot less confident if it was an operation where data / state is being changed – better everything works perfectly or nothing at all, rather than ending up with some kind of inconsistent state – transactions and all that.
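
One way to get that “everything or nothing” behavior without a real database transaction is to check every row before touching any state. A minimal Python sketch, assuming rows are plain dicts and the “store” is just a list; REQUIRED_KEYS, missing_keys and save_all are hypothetical names:

```python
REQUIRED_KEYS = ('first', 'given')  # assumed schema for a "name" row

def missing_keys(rows):
    """Return (row_index, key) pairs for every required key that is absent."""
    return [(i, key)
            for i, row in enumerate(rows)
            for key in REQUIRED_KEYS
            if key not in row]

def save_all(rows, store):
    """Append all rows to store, or none of them if any row is incomplete."""
    problems = missing_keys(rows)
    if problems:
        raise ValueError("incomplete rows: %r" % (problems,))
    store.extend(rows)

names = [
    {'first': 'Bob', 'given': 'Smith'},
    {'given': 'Lukas'},
    {'first': 'Mary', 'given': 'Doe'},
]

store = []
try:
    save_all(names, store)
except ValueError as err:
    # The bad second row means no rows at all were stored
    print("Nothing saved:", err)
```

This only validates structure up front; it is not a substitute for real transactions where a database is involved.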

The other dividing line here is what the languages are actually complaining about. Perl and Ruby are concerned that the value is in some way bad. Python and PHP are much more explicit, complaining precisely about the missing hash key.

From the point of view of locating the problem in the code later, based purely on an error message in a log file, think the Python / PHP way is better – it tells you exactly what went wrong. Also, specific to Ruby, that it raised an exception seems to be partly good luck – I tried to embed the value in a string, which raised the error immediately. But what if I’d done something acceptable with the nil value – perhaps just assigned it to another value? Probably the exception would show up somewhere but not immediately, at the point I’d de-referenced it – debugging joy I’d imagine (flame me if I’m wrong).
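
Python can reproduce that delayed failure too if you opt in to it. In this sketch dict.get() stands in for Ruby’s Hash#[] (both hand back a “nothing” value for a missing key), and the render function is invented for illustration:

```python
row = {'given': 'Lukas'}

# dict.get() returns None for a missing key instead of raising
first = row.get('first')
parts = ["Name is", first, row.get('given')]  # no complaint yet

def render(parts):
    # str.join() only accepts strings, so the None blows up here,
    # some distance away from where the key was actually missing
    return " ".join(parts)

try:
    render(parts)
except TypeError as err:
    deferred_error = str(err)
    print("Failed later:", deferred_error)

# Plain indexing fails immediately, at the point of the missing key
try:
    row['first']
except KeyError as err:
    immediate_error = repr(err)
    print("Failed immediately:", immediate_error)
```

The TypeError message talks about a bad value, while the KeyError names the missing key, which mirrors the Ruby versus Python split described above.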

Anyway, just for fun (read: trolling), here’s a subjective strictness ranking…

  1. Python (strictest)
  2. Ruby
  3. Perl / PHP

That seems congruent with Python’s philosophy. I put Perl and PHP vying for the last place because PHP allows you to switch off those E_NOTICE errors – otherwise would have placed PHP as stricter than Perl, for having spotted the missing hash key.

If anyone’s willing, would be interesting to compare the example with IO, Lua and Erlang, where possible (side note: I did get the point – just pressing the PHP buzzer for fun – fascinated to see what Damien comes up with in CouchDB, especially the part about “Distributed, featuring robust, incremental replication with bi-directional conflict detection and resolution.”).

  • AJL

    If you don’t want Python to be strict (in this context) you could have it return a default value for a key if it isn’t found by using the dict’s ‘get’ method.

    print("Name is "+name.get('first', '')+" "+name.get('given', '')+"\n")

  • http://www.phppatterns.com HarryF

    If you don’t want Python to be strict (in this context) you could have it return a default value for a key if it isn’t found by using the dict’s ‘get’ method.

    Thanks for pointing out. What’s good about that is it’s explicit – if you do that, you have to give some thought to the impact. I like that about Python – default to “safe behaviour” while providing the tools to override when needed.

  • Simon Proctor

    Of course if you removed the -w from the call to perl you wouldn’t get your warning. Default Perl without the warnings and strict pragma is so forgiving it’s scary. (Though it can be useful for some one-off command-line stuff).

  • DerelictMan

    The code you posted is pretty common in the PHP and Perl worlds, but I’m guessing that in Ruby and Python you’d probably see people creating simple “Name” classes with two attributes, “first” and “given”. Then one could force that they are both provided in the constructor (raising an exception there if they are not valid), or alternatively overriding the property getter to return a sane non-nil value if the property isn’t defined. That’d look a lot cleaner too, as it would be something like:

    puts("Name is "+name.first+" "+name.given+"\n")

    (I’m guessing here, as I haven’t used Ruby or Python too much yet…)

  • http://www.phppatterns.com HarryF

    Of course if you removed the -w from the call to perl you wouldn’t get your warning. Default Perl without the warnings and strict pragma is so forgiving it’s scary. (Though it can be useful for some one-off command-line stuff).

    I should have said something about that I guess – to my mind the benefits of using -w and “use strict” have been hammered out so often that they should be default for anyone writing even half-serious Perl. But I guess it is a violation of this test, examining the default behaviour.

    in Ruby and Python you’d probably see people creating simple “Name” classes with two attributes, “first” and “given”.

    Maybe – maybe not. Think there are different schools on that – some for and others resenting the extra lines of code for the class definition. The view of the functional programmer might also vary on this point. Also, in PHP, think it’s pretty common these days to use a class for this kind of purpose.

  • Joshua E Cook

    Ruby also allows you to specify a default value when retrieving a member from a Hash, using the Hash#fetch method. Furthermore, Ruby would not halt execution if you had used string interpolation or sprintf-style string formatting.

  • Anonymous

    Harry,

    If you wanted that Hash to generate a default value when it does not find a key (which it can), I’d do the following in your code:

    #!/usr/bin/ruby
    def get_hash(hash)
        Hash.new("").update(hash)
    end

    names = [
        get_hash('first'=>'Bob','given'=>'Smith'),
        get_hash('given'=>'Lukas'),
        get_hash('first'=>'Mary','given'=>'Doe'),
    ];

    def printName(name)
        puts("Name is "+name['first']+" "+name['given']+"\n")
    end

    names.each {
        |name| printName(name)
    }

    This will make the hash generate the default string

  • http://www.revonx.com/ Galo

    You forgot Haxe http://www.haxe.org/ ;)

    nice article harry..

  • JaredWhite

    Now this is the kind of language comparison I can get behind (getting sick of Rails rulz! and PHP teh sux0r! and all that).

    One comment: if you had an inkling in advance that one of the array keys might be missing, you could use the error suppression operator (@) to get rid of the notice (i.e., @$name['first']). In PHP templating, this is often very useful because you might want to echo out content only if it exists, otherwise you can just ignore it (and sprinkling if’s all over the place gets old fast).

  • Chris Adams

    I use a custom error handler (set_error_handler()) on PHP which mimics the Python behaviour by stopping for any error condition; this is also a good place to use things like debug_backtrace() to dump a stack trace which includes the values passed to functions or database-specific information (e.g. sql error, connection info) when appropriate.

  • malikyte

    Installing Xdebug (for PHP) on a server will invariably call some sort of debugging backtrace when a script halts unexpectedly. It also color-codes print_r and var_dump output. I like color. :)

  • doug

    I don’t think it’s really debatable which is the best behaviour. Failing properly when something goes wrong is such an advantage for real software that it’s even got a name (FailFast). It’s an important principle and one that should be used in all ‘real’ software. Debating which language is better for non-‘real’ software is not, imo, very interesting :)

  • DerelictMan

    Failing properly when something goes wrong is such an advantage for real software that it’s even got a name (FailFast). It’s an important principle and one that should be used in all ‘real’ software.

    Since you feel that way, I’m assuming that you must not consider any dynamic language appropriate for ‘real’ software, since languages that defer type checking until runtime when they could be checking types at compile time obviously haven’t put FailFast at the top of their priority list…

  • Rob Walker

    You can still consider dynamic languages appropriate in a FailFast context. It’s not about type checking. It’s not about undiscovered errors. It’s about what happens when the language knows an error occurred, whether or not it’s due to a type error, and whether or not it raised an error for a variable that was bound late or early. Either the program is allowed to continue or it’s halted after an error.

    If an error is consistent in its behavior and causes, then it’s easier to find and fix, and you can be more confident your code isn’t a mess. It’s just a simple bug.

    If an error is sporadic and there seems to be little rhyme or reason to its causes and effects, the bug is harder to fix. It’s also a reason to start wondering if there aren’t many other subtle problems with your code.

    Just my two cents.

  • Rob Walker

    P.S. I think the second two paragraphs I wrote might be unclear.

    It *is* about undiscovered errors in the sense that although continuing after an error is in some cases desirable, the rule and not the exception is that it’s best not to do so.

    But in any case it should be a deliberate decision to let the program continue, not the default behavior of the language. The last thing you want is to believe everything is fine meanwhile your data is quietly getting corrupted as the program continues. Halting the program is one obvious way to prevent that.

    I believe a language should trust the programmer to know when to catch an error and when to let it pass. But to do that, the programmer must at least be given the decision first.

    From The Zen of Python:

    Errors should never pass silently.
    Unless explicitly silenced.

  • DerelictMan

    You can still consider dynamic languages appropriate in a FailFast context.

    Well, first of all, I was playing devil’s advocate; I’m a fan of both dynamically-typed languages AND statically typed ones (different tools for different tasks). I was attempting to point out the fact that FailFast isn’t a black or white philosophy (as it seems that doug above is a fan of absolutes). There are degrees of failing fast, and while it’s a good general approach, obviously sometimes other considerations take priority.

    It’s not about type checking.

    I disagree. It’s not ONLY about type checking, but type errors *are* a common source of application bugs. Catching a parameter with an incorrect type at the beginning of a method invocation (say, with a type hint) is “fail fast” when compared with not checking the type and then 20-30 lines of code later attempting to call an undefined method on that parameter. If you are one to see “fail fast” as an absolute rule, then obviously you would prefer to know as soon as the method is called that the parameter type is wrong. My contention is that if you take the fail fast philosophy to its extreme (which is probably ill-advised), you must find static, compile-time type checking to be the best approach (since that is about as early in the process as you can get), and you will have to abandon dynamically typed languages altogether.

    Example (obviously contrived):

    someMethod();
    ?>

    This is perfectly legal PHP code, yet it will fail at runtime about half the time. With static typing, the language would “know” that an error exists, and would refuse to compile it. However, dynamically typed languages would accept the equivalent of the above code without any complaints. Hence, they are not “failing fast”.

    However, clearly dynamically typed languages have some (or many, depending on whom you ask) advantages over statically typed ones in certain contexts. My only point is that sometimes other concerns take priority over failing as fast as possible.

  • DerelictMan

    Oops, I guess I didn’t properly escape my tags in the code snippet above (there is apparently no way to preview your comments here); let me try again:

    class Foo {}
    $f = new Foo();
    if (mt_rand(1, 100) == 1) $f->someMethod();

    But in any case it should be a deliberate decision to let the program continue, not the default behavior of the language. The last thing you want is to believe everything is fine meanwhile your data is quietly getting corrupted as the program continues.

    Well put. This is one of the reasons I’ve never been a fan of MySQL (but that’s a whole other discussion…)

  • The New Guy

    I think the classification of weak/strong typing, when expanded, does a pretty good job of correctly placing the languages.

    PHP and Perl are weak/dynamic.
    Ruby and Python are strong/dynamic.

    And just for contrast:

    C++ is weak/static
    Java is strong/static

    In terms of error reporting, I think we have known for quite awhile that Ruby is pretty crappy at it.

  • Fenrir2

    Ruby returns a nil if a value in a hash doesn’t exist. This is useful if you want to check for a key:

    if name[:first]
        printName(name)
    else
        puts 'No first name'
    end

    If you want a stricter behavior use Hash#fetch instead of Hash#[].

    def printName(name)
        puts "Name is #{name.fetch :first} #{name.fetch :given}"
    end

    Now the script will raise an IndexError.

  • leeschen

    Harry F wrote: “So one dividing line here is whether some kind of fatal error should be raised. Perl and PHP continue execution by default while Ruby and Python halt, unless you explicitly handle the problem. Which is better?”

    This depends entirely on the nature of the error. A major failure that is likely to really screw the works should abort with a truly meaningful message. Many languages allow explicit handling of such events and these should be used where possible. Not all possible error conditions can be foreseen, but as many as can be should be given this treatment.

    Lesser problems can usually be permitted to continue, though perhaps with a warning message and a return to an input field. A custom error handler, such as that mentioned by Chris Adams can help with both types of problems, but I need to ask Chris if he uses it only during debug, or if it remains in the finished code.

    Rob Walker makes an excellent point when he says, “The last thing you want is to believe everything is fine meanwhile your data is quietly getting corrupted as the program continues.”

    There is a real problem with programs that continue with no error control. Subtle errors can easily be introduced into data and soon an entire database is contaminated and totally useless. My primary background is in database programming and, trust me, this is a type of problem you do not want to discover after a client has invested thousands of dollars in data collection/entry efforts. If the language has no intrinsic method to validate data, the programmer must explicitly code data checking with appropriate action before that data is used or stored.

    I prefer to accomplish this myself and specify the program response rather than depend only upon the language’s intrinsic type checking. I do have to admit, however, that I am rather a newcomer to web scripting, client or server, but this means I have no hard and fast favorites. Right now I am improving my JavaScript and just beginning with PHP, so for the most part, I will just sit back and learn from y’all.

    Lee Eschen

  • Ed Borasky

    Ah … but you *wouldn’t* write a web application in Perl, Python, PHP or Ruby (or Java, etc.) — you’d write it in a *framework*, like Rails! The question isn’t “How strict is your language?” The question is, “How good a web application programmer are you?” and “How well does your *framework* support designing secure, scalable, etc. web applications?”

  • stu

    Coming late to the party ;) Here is a Lua 5.1 example:

    names = {
        { first = "Bob", given = "Smith"},
        { given = "Lukas"},
        { first = "Mary", given ="Doe"},
    };

    function printName(name)
        print("Name is " .. name.first .. " " .. name.given)
    end

    for x,y in pairs(names) do
        printName(y)
    end

    and the runtime output

    C:\lua5.1>lua5.1.exe x.lua
    Name is Bob Smith
    lua5.1.exe: x.lua:9: attempt to concatenate field 'first' (a nil value)
    stack traceback:
    x.lua:9: in function 'printName'
    x.lua:13: in main chunk
    [C]: ?

    Lua unfortunately (imo) likes to coerce things into strings and numbers when it can without explicitly being told to do so.