Programming - By Harry Fuecks

How strict is your dynamic language?

Consider the “big four” dynamic, procedural languages: Perl, Python, PHP and Ruby. To an extent they’re much of a muchness, offering only small variations on the same theme (ignoring PHP’s lack of support for functional-style programming). But sometimes little things make a big difference, perhaps most of all when your code is given input it wasn’t designed to handle. I knocked up a simple example to compare them in this area…

You’ve got a function which takes a hash (an associative array) and does something with its contents – fairly typical logic for a database-driven application, where rows are common currency. For the sake of a simple example, let’s say your input is a list of names, each name broken into a hash with the keys “first” and “given”. The question is how your function will cope when the hash doesn’t have quite the structure you’re expecting (say the “first” key is missing), given a fairly “default” use of the language – no non-standard functionality to make the language stricter…

Perl

Tackling Perl first, here’s a working example…


#!/usr/bin/perl -w
use strict;

my $names = [
            {'first'=>'Bob','given'=>'Smith'},
            {'given'=>'Lukas'},
            {'first'=>'Mary','given'=>'Doe'},
        ];

sub printName {
    my $name = shift;
    print "Name is ".$name->{'first'}." ".$name->{'given'}."\n"
}

foreach my $name ( @{$names} ) {
    printName($name);
}

Notice the list of names – the second “row” is missing the “first” key. Running this, the output is:

Name is Bob Smith
Use of uninitialized value in concatenation (.) or string at names.pl line 12.
Name is  Lukas
Name is Mary Doe

Note the second line in the output – Perl is warning (because of the -w switch) that an undefined value was used in a string concatenation: looking up the missing key returned undef. But, after complaining, execution continues.

PHP

The same again in PHP…


#!/usr/bin/php
<?php
$names = array(
    array('first'=>'Bob','given'=>'Smith'),
    array('given'=>'Lukas'),
    array('first'=>'Mary','given'=>'Doe'),
);

function printName($name) {
    print "Name is ".$name['first']." ".$name['given']."\n";
}

foreach ($names as $name ) {
    printName($name);
}

The output is…

Name is Bob Smith

Notice: Undefined index:  first in /home/harryf/php/names.php on line 9
Name is  Lukas
Name is Mary Doe

PHP does pretty much the same thing as Perl – it complains but continues execution. Of course the error is raised as a NOTICE – many default PHP installations have this error level disabled, meaning you’d get no error message at all. One other interesting point – PHP is complaining about the array index being missing, rather than the value being, in some way, unusable.

Ruby

Here we go again, this time in Ruby…


#!/usr/bin/ruby
names = [
            {'first'=>'Bob','given'=>'Smith'},
            {'given'=>'Lukas'},
            {'first'=>'Mary','given'=>'Doe'},
        ];

def printName(name)
    puts("Name is "+name['first']+" "+name['given']+"\n")
end

names.each {
    |name| printName(name)
}

And the output…

Name is Bob Smith
names.rb:10:in `+': can't convert nil into String (TypeError)
    from names.rb:10:in `printName'
    from names.rb:14
    from names.rb:13

Ruby, by contrast, halts execution (raising an exception that wasn’t handled) on encountering the missing hash key. The cause is interesting: Ruby is saying that looking up a key that doesn’t exist yields a nil value, which you can’t just join to a string, hence the exception – a TypeError.

Python

Finally Python…


#!/usr/bin/python
names = [
            {'first':'Bob','given':'Smith'},
            {'given':'Lukas'},
            {'first':'Mary','given':'Doe'},
        ];

def printName(name):
    print("Name is "+name['first']+" "+name['given']+"\n")

for name in names:
    printName(name)


And how does Python respond?

Name is Bob Smith

Traceback (most recent call last):
  File "names.py", line 12, in ?
    printName(name)
  File "names.py", line 9, in printName
    print("Name is "+name['first']+" "+name['given']+"\n")
KeyError: 'first'

Like Ruby, Python halts execution on encountering the problem hash (dict), raising an exception. Unlike Ruby, but more like PHP, Python is complaining about the missing hash key – it raised a KeyError exception.
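For comparison, getting Python to “complain but continue” the way Perl and PHP do takes explicit handling. What follows is only a rough sketch, assuming Python 3; the function name format_name_lenient and the wording of the warning are my own inventions, not anything built into the language.

```python
import sys

names = [
    {'first': 'Bob', 'given': 'Smith'},
    {'given': 'Lukas'},
    {'first': 'Mary', 'given': 'Doe'},
]

def format_name_lenient(name):
    """Build the display string, substituting '' for any missing key.

    Hypothetical helper: it mimics the Perl / PHP behaviour of warning
    and carrying on, rather than letting the KeyError propagate.
    """
    parts = []
    for key in ('first', 'given'):
        try:
            parts.append(name[key])
        except KeyError as err:
            # Complain on stderr, then continue with an empty value.
            print("Warning: missing key %s" % err, file=sys.stderr)
            parts.append('')
    return "Name is " + " ".join(parts)

lines = [format_name_lenient(name) for name in names]
for line in lines:
    print(line)
```

The second row comes out as “Name is  Lukas”, with the same double space Perl and PHP produced, and the loop reaches Mary Doe instead of stopping at the bad row.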

So what?

OK – this is a trivial example but I also think it’s illustrative of a common recurring problem and the fundamental philosophical differences between the languages.

While prototyping, you were likely using sample data you’d created yourself and it was probably “perfect” input. But when it comes to putting your code into production, and having it survive a few jumps in version number, real data starts flowing around and functions / classes start getting applied in different contexts (perhaps not by you), resulting in this kind of “unexpected input” problem.

And, yes, you should always write tests, but when it comes to coarse-grained APIs handling complex data structures, there’s always something you’re missing a test for. In fact I’d argue this general category of problem is one of the biggest sources of bugs in “dynamic code” – the languages encourage rapid prototyping, which means those extra 48 lines of code to make it really robust happen much later, if at all. So a language’s default behavior is significant to my mind.

So one dividing line here is whether some kind of fatal error should be raised. Perl and PHP continue execution by default while Ruby and Python halt, unless you explicitly handle the problem. Which is better?

I think there are pros and cons to both. For web apps, at least where non-critical data is involved, having your code “keep on running”, albeit in a “half broken” state, is often a good thing – it gives you time to address the problem later without users seeing pages as being “down”. At the same time, you don’t know what the consequences of the problem are. In this example they’re trivial and Perl / PHP’s behavior results in something that’s still usable, but I’d feel a lot less confident if it was an operation where data / state is being changed – better that everything works perfectly or nothing happens at all, rather than ending up with some kind of inconsistent state – transactions and all that.
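To make the “everything or nothing” idea concrete, here is a minimal Python sketch that checks every row before touching any state. The names REQUIRED_KEYS, find_bad_rows and save_all are hypothetical, and a plain list stands in for whatever store a real application would use (where a database transaction would normally do this job).

```python
names = [
    {'first': 'Bob', 'given': 'Smith'},
    {'given': 'Lukas'},
    {'first': 'Mary', 'given': 'Doe'},
]

# Assumption for this sketch: these are the keys every row must carry.
REQUIRED_KEYS = ('first', 'given')

def find_bad_rows(rows):
    """Return (index, missing_keys) for every row lacking a required key."""
    problems = []
    for index, row in enumerate(rows):
        missing = [key for key in REQUIRED_KEYS if key not in row]
        if missing:
            problems.append((index, missing))
    return problems

def save_all(rows, store):
    """Append every row to store, or none of them if any row is malformed."""
    problems = find_bad_rows(rows)
    if problems:
        # Refuse before any state has changed.
        raise ValueError("refusing to save, bad rows: %r" % (problems,))
    store.extend(rows)

store = []
try:
    save_all(names, store)
except ValueError as err:
    print(err)

print(len(store))  # nothing was saved, not even the two good rows
```

Whether rejecting the good rows along with the bad one is the right trade-off depends on the data; the point is only that the decision gets made before any state changes.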

The other dividing line here is what the languages are actually complaining about. Perl and Ruby are concerned that the value is in some way bad. Python and PHP are much more explicit, complaining precisely about the missing hash key.

From the point of view of locating the problem in the code later, based purely on an error message in a log file, I think the Python / PHP way is better – it tells you exactly what went wrong. Also, specific to Ruby, that it raised an exception seems to be partly good luck – I tried to embed the value in a string, which raised the error immediately. But what if I’d done something acceptable with the nil value – perhaps just assigned it to another variable? Probably the exception would show up somewhere, but not immediately at the point where I’d looked up the missing key – debugging joy I’d imagine (flame me if I’m wrong).
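The delayed-failure worry can be reproduced in Python too, if you opt out of the KeyError by using dict.get, which returns None for a missing key much as the Ruby lookup returned nil. This is just an illustrative sketch; load_first_name and greeting are made-up names.

```python
def load_first_name(row):
    # dict.get returns None for a missing key instead of raising KeyError.
    return row.get('first')

def greeting(first_name):
    # Fails only if first_name turns out not to be a string.
    return "Hello, " + first_name

row = {'given': 'Lukas'}

# No complaint here, although this is where the key is actually missing.
first_name = load_first_name(row)

failure = None
try:
    message = greeting(first_name)
except TypeError as err:
    # The error surfaces later, in a different function from the lookup.
    failure = "TypeError raised in greeting(): %s" % err

print(failure)
```

The traceback would point at greeting(), not at the lookup that produced the None, which is the kind of detective work a KeyError raised at the lookup avoids.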

Anyway, just for fun (read: trolling), here’s a subjective strictness ranking…

  1. Python (strictest)
  2. Ruby
  3. Perl / PHP

That seems congruent with Python’s philosophy. I put Perl and PHP vying for last place because PHP allows you to switch off those E_NOTICE errors – otherwise I would have placed PHP as stricter than Perl, for having spotted the missing hash key.

If anyone’s willing, it would be interesting to compare the example with Io, Lua and Erlang, where possible (side note: I did get the point – just pressing the PHP buzzer for fun – fascinated to see what Damien comes up with in CouchDB, especially the part about “Distributed, featuring robust, incremental replication with bi-directional conflict detection and resolution.”).
