Borrowing Techniques from Strongly Typed Languages in JS
In this article we’ll discuss how you can use techniques from strongly typed languages in your JavaScript code. The techniques introduced will both reduce bugs in your code, and allow you to reduce the total amount of code you need to write. Although this article uses JavaScript as the example, you can also apply these techniques to most other languages with weak typing.
The JavaScript Type System
Let’s first do a quick recap on how the JavaScript data type system works. JavaScript splits its values into two categories:
- Primitive types, such as String, Number and Boolean. When you assign a primitive type to a variable, you always create a new value which is a copy of the value you’re assigning.
- Reference types, such as Object and Array. Assigning a reference type copies the reference, not the underlying value, so both variables point to the same object.
To clarify this, let’s look at the following code example:
var a = [];
var b = a;
a.push('Hello');
The variable b will change when we change a, because they are both references to the same array. This is how all reference types work.
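The difference between the two categories can be sketched side by side. This is only an illustration: the variable names are arbitrary, and the slice call is just one common way to copy an array.

```javascript
// Primitives: y receives its own copy of the value held by x.
var x = 1;
var y = x;
x = 2;
// y is still 1 here.

// Reference types: a and b point at the same array.
var a = [];
var b = a;
a.push('Hello');
// b.length is now 1, because the push is visible through both names.

// An explicit copy gives an independent array.
var c = a.slice();
a.push('again');
// c.length is still 1, while a.length (and b.length) is 2.
```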
JavaScript does not enforce types in any way, meaning that any variable can hold any data type at any point in time. The rest of this article will discuss the downsides of this, and how you can apply simple techniques from languages which do enforce types to write better JavaScript.
Introducing the Rule of Consistent Types
The rule of consistent types is simple in theory: every variable, parameter and return value should only ever have one type. Strongly typed languages enforce this at the compiler level; they will not let you mix and match types arbitrarily.
Weak typing gives us a great amount of freedom. A common example of this is concatenating numbers into strings. You don’t need to do any tedious type casting like you would have to do, for example, in a language like C.
Don’t worry, I won’t tell you to throw away all the convenience. The rule of consistent types only requires you to pay some attention to how your variables and functions behave, and as a result, your code will improve.
Types in Variables
First, let’s look at how the rule applies to variables. It’s very straightforward: your variables should always have one type only.
var text = 'Hello types';
// This is wrong! Don't do it!
text = 1;
The above example shows the problem. This rule requires us to pretend that the last line of code in this example will throw an error, because when we first defined the variable text, we gave it a value of type string and now we’re assigning a number to it. The rule of consistent types means we’re not allowed to change a variable’s type like that.
It’s easier to reason about your code when your variables are consistent. It helps especially in longer functions, where it’s easy to lose sight of where the variables come from. I’ve accidentally caused bugs many times when working in codebases that didn’t respect this rule, because I saw a variable being declared, and then assumed it would keep the same type – because let’s face it, that makes sense, doesn’t it? Usually there is no reason to assign a value of a different type to the same variable.
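When a value really does need to change form, one way to stay within the rule is to give the converted value its own variable. The names ageText and age below are made up for this sketch:

```javascript
// The raw value stays a string for its whole lifetime...
var ageText = '42';
// ...and the converted value gets its own variable, which is always a number.
var age = parseInt(ageText, 10);
```

Each name now tells you which type to expect, no matter how far down the function you are reading.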
Types in Function Parameters
The same rule applies here. The parameters for functions should also be consistent. An example of doing it wrong:
function sum(a, b) {
    if (typeof a === 'string') {
        a = 1;
    }
    return a + b;
}
What’s wrong with this? It’s generally considered bad practice to branch logic based on a type check. There are exceptions to this, but usually it would be a better option to use polymorphism.
You should aim to make sure your function parameters also have only one type. It reduces the possibility of problems if you forget to account for the different types, and leads to simpler code because you don’t have to write code to handle all the different cases with types. A better way to write the sum function would be as follows:
function sum(a, b) {
    return a + b;
}
Then, you handle the type check in the calling code instead of in the function. As you can see from the above, the function is now much simpler. Even if we have to move the type check somewhere else, the earlier we can do it in our code, the better off we’ll be.
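As a rough sketch of what the calling code might look like, assume the two inputs arrive as strings (rawA and rawB are made-up names for this example). The caller converts and validates them once, so sum only ever sees numbers:

```javascript
function sum(a, b) {
    return a + b;
}

// Hypothetical raw input, for example read from a form field.
var rawA = '2';
var rawB = '3';

// Convert and validate once, in the calling code.
var a = Number(rawA);
var b = Number(rawB);
if (Number.isNaN(a) || Number.isNaN(b)) {
    throw new TypeError('Both inputs must be numeric');
}

var total = sum(a, b);
// Without the conversion, sum(rawA, rawB) would concatenate the strings.
```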
We will discuss the use of type checking and typeof later in the article, including how type checks can easily cascade if used poorly.
Types in Function Return Values
This ties in with the two others: Your functions should always return values of the same type.
We can take an example from AngularJS here. AngularJS provides a function to lowercase text, called angular.lowercase. There’s also a standard function for it, String.prototype.toLowerCase. We can compare their behavior to understand this part of the rule better:
var a = angular.lowercase('Hello Types');
var b = angular.lowercase(null);
The variable a will contain what you would expect: 'hello types'. However, what will b contain? Will it be an empty string? Will the function throw an exception? Or maybe it’s just going to be null? In this case, the value of b is null. Notice how it was immediately difficult to guess what the result was going to be – we had three possible outcomes right off the bat. In the case of the Angular function, for non-string values, it will always return the input.
Now, let’s see how the built-in one behaves:
var a = String.prototype.toLowerCase.call('Hello Types');
var b = String.prototype.toLowerCase.call(null);
The result of the first call is the same, but the second call throws an exception. The built-in function follows the rule of consistent types, and it does not allow incorrect parameter types. The returned value is also always a string. So we can say the built-in function is better, but you might be wondering how exactly?
Let’s consider a typical use-case for a function like this. We’re using it at some point in our code to convert strings into lowercase. As is often the case in JavaScript code, we’re not 100% sure that our input is always going to be a string. It doesn’t matter, because we’re good programmers and we’re assuming our code doesn’t have any bugs.
What will happen if we’re using the function from AngularJS which doesn’t respect these rules? A non-string value goes through it without any problems. It might go through a couple more functions, maybe we’ll even send it through an XMLHttpRequest call. Now the wrong value is in our server and it ends up in the database. You can see where I’m going with this, right?
If we had used the built-in function, which respects the rules, we would immediately spot the bug right then and there.
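If you need your own helper with the same fail-fast behavior, a minimal sketch could look like the following. The name toLowerCaseStrict is hypothetical and not part of AngularJS or the standard library:

```javascript
// Hypothetical helper: lowercases strings and rejects everything else.
function toLowerCaseStrict(value) {
    if (typeof value !== 'string') {
        throw new TypeError('Expected a string, got ' + typeof value);
    }
    return value.toLowerCase();
}

var lowered = toLowerCaseStrict('Hello Types');

// A bad value is stopped here instead of travelling on to the server.
var caught = null;
try {
    toLowerCaseStrict(null);
} catch (ex) {
    caught = ex;
}
```

The function either returns a string or throws, so nothing downstream has to wonder what it received.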
Whenever you write a function, make sure the types it returns are consistent. A bad example is shown below:
function foo(a) {
    if (a === 'foo') {
        return 'bar';
    }
    return false;
}
Again, same as with variables and parameters, if we have a function like this, we can’t make assumptions about its behavior. We will need to use an if to check the type of the returned value. We might forget about it at some point, and then we have another bug on our hands. We can rewrite it in many ways; here’s one way which fixes the issue:
function foo(a) {
    if (a === 'foo') {
        return 'bar';
    }
    return '';
}
This time we’ve made sure all the paths return a string. It’s much easier to reason about the function’s result now.
null and undefined are Special
So far we’ve really just talked about the primitive types. When it comes to objects and arrays, you should follow the same rules, but there are two special cases to keep in mind.
When dealing with reference types, you sometimes need to indicate that there is no value. A good example of this is document.getElementById. If it doesn’t find a matching element, it will return null.
This is why we will consider null to share the type with any object or array, but only those. You should avoid returning null from a function which may otherwise return a primitive value like Number.
undefined can also be considered a “no value” for references. For most purposes, it can be treated as equal to null, but null is preferred because of its semantics in other object-oriented languages.
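To illustrate, here is a small sketch of a lookup that returns either an object or null, in the same spirit as document.getElementById. The users array and the findUserById function are invented for this example:

```javascript
// Hypothetical in-memory data for the example.
var users = [
    { id: 1, name: 'Ada' },
    { id: 2, name: 'Grace' }
];

// Always returns a user object, or null when there is no match.
function findUserById(id) {
    for (var i = 0; i < users.length; i++) {
        if (users[i].id === id) {
            return users[i];
        }
    }
    return null;
}

var found = findUserById(1);
var missing = findUserById(99);
```

The caller only ever has to handle two cases: a user object, or null.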
Arrays and null
When working with arrays, you should also consider that an empty array is often a better choice than null. Although arrays are reference types and you can use null with them, it usually makes more sense to return an empty array. Let’s look at the following example:
var list = getListOfItems();
for(var i = 0; i < list.length; i++) {
    // do something
}
This is probably one of the most common styles of usage for arrays. You get an array from a function, and then you iterate over it to do something else. What would happen in the above code if getListOfItems returned null when there are no items? It would throw an error, because null does not have a length property (or any other property, for that matter). When you consider the typical usage of arrays like this, or even list.forEach or list.map, you can see how it’s generally a good idea to return an empty array when there are no values.
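One possible shape for such a function is sketched below. The itemsByCategory object and the category parameter are assumptions made for the example; the getListOfItems in the snippet above takes no arguments.

```javascript
// Hypothetical data source; a missing category simply has no entry.
var itemsByCategory = { fruit: ['apple', 'pear'] };

function getListOfItems(category) {
    var items = itemsByCategory[category];
    // Return a copy of the list, or an empty array when nothing is stored.
    return Array.isArray(items) ? items.slice() : [];
}

// The loop needs no null check: it just runs zero times.
var names = [];
var list = getListOfItems('vegetables');
for (var i = 0; i < list.length; i++) {
    names.push(list[i]);
}
```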
Type Checking and Type Conversion
Let’s look at type checking and type conversion in more detail. When should you do type checks? When should you do type conversion?
Type Conversion
The first goal with type conversion should be to make sure your values are of the correct type. Numeric values should be Numbers and not Strings and so on. The second goal should be that you only need to convert a value once.
The best place to do type conversion is at the source. For example, if you’re fetching data from the server, you should do any necessary type conversion in the function which handles the received data.
Parsing data from the DOM is a very common example of where things start to go wrong. Let’s say you have a textbox which contains a number, and you want to read it. Or, it could just be an attribute in some HTML element, it doesn’t even have to be user input.
//This is always going to be a string
var num = numberInput.value;
//This is also always a string
var num2 = myElement.getAttribute('numericAttribute');
Since values that you get from the DOM are often strings, it’s important to do type conversion when reading them. In a way, you can think of it as the “edge” of your module. The data is entering your JavaScript module through this function which is reading it, therefore it has to convert the data into the correct format.
By doing type conversion at the edges of our module, we ensure that the internals don’t have to deal with it. This reduces the likelihood of bugs being caused by implicit type coercion by a large margin. It also allows us to write less code because we don’t let bad values get into the module from the edges.
//We can parse ints and floats like so
var num = parseInt(numberInput.value, 10);
var num2 = parseFloat(myElement.getAttribute('numericAttribute'));
//But if you need to convert a string to a boolean, you need to do a string comparison
var bool = booleanString === 'true';
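One thing to keep in mind is that parseInt and parseFloat return NaN when the string doesn’t start with a number, and NaN is still of type number. A small wrapper at the edge can reject such input. The readInteger name below is hypothetical:

```javascript
// Hypothetical edge function: turns a raw string into an integer or throws.
function readInteger(raw) {
    var value = parseInt(raw, 10);
    if (Number.isNaN(value)) {
        throw new TypeError('Expected a numeric string, got "' + raw + '"');
    }
    return value;
}

var quantity = readInteger('42');

// Note that parseInt stops at the first non-digit, so trailing text
// such as a unit is silently ignored rather than rejected.
var width = readInteger('12px');
```

If trailing characters should count as an error in your case, a stricter check than parseInt alone would be needed.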
typeof and Type Checks
You should only use typeof for validation, not branching logic based on type. There are exceptions to this, but it’s a good rule of thumb to follow.
Let’s look at two examples for this:
function good(a) {
    if (typeof a !== 'number') {
        throw new TypeError('a must be a number');
    }
    // do something
}
This is an example of using typeof for validation. We’re ensuring that the parameter given to the function is of the correct type. However, the following example shows what it means to branch logic by type.
function bad(a) {
    if (typeof a === 'number') {
        // do something
    } else if (typeof a === 'string') {
        // do something
    } else if (typeof a === 'boolean') {
        // do something
    }
}
Don’t do this. Although it can sometimes be necessary, it’s usually a sign of poor design. If you find yourself doing this kind of logic a lot, you probably should have converted the value earlier in the code into the correct type.
If you end up with a lot of typeofs in your code, it can be a sign that you need to convert the value you’re checking. It’s typical for type checks to spread through the code, and that is often a telltale sign of poor design with regard to types.
As mentioned earlier, you should try to do type conversions at the edges of your module, as it allows you to avoid the typeof cascade. If you do your conversion early on, none of the functions that are called after it have to do type checks or type conversions.
This also applies to objects: If you find yourself doing a lot of checks using instanceof or checking if a property on an object exists, it’s a sign that perhaps you should structure the data differently.
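One way to structure the data differently is to give every object the same method, so the calling code never has to ask what kind of object it is holding. The notification types below are invented purely for illustration:

```javascript
// Each notification kind exposes the same describe() method.
function EmailNotification(address) {
    this.address = address;
}
EmailNotification.prototype.describe = function () {
    return 'Email to ' + this.address;
};

function SmsNotification(phone) {
    this.phone = phone;
}
SmsNotification.prototype.describe = function () {
    return 'SMS to ' + this.phone;
};

// No instanceof and no property checks: every item is used the same way.
function describeAll(notifications) {
    return notifications.map(function (notification) {
        return notification.describe();
    });
}

var descriptions = describeAll([
    new EmailNotification('ada@example.com'),
    new SmsNotification('555-0100')
]);
```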
The same rule applies to instanceof as typeof: You should try to avoid it, as it can be a sign of poor design. There is one case where it’s unavoidable though:
try {
    // some code that throws exceptions
} catch (ex) {
    if (ex instanceof TypeError) {
        // handle a TypeError
    } else if (ex instanceof OtherError) {
        // handle the other error type (OtherError is a placeholder)
    }
}
If your code requires specific handling for exception types, instanceof is often a decent choice, since JavaScript’s catch doesn’t let you differentiate by type like it does in some other languages. In most other cases, you should try to avoid instanceof.
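A runnable version of this pattern might look like the sketch below. ValidationError and describeFailure are hypothetical names, and rethrowing unknown errors is one reasonable policy rather than the only one:

```javascript
// Hypothetical error type for illustration.
class ValidationError extends Error {
    constructor(message) {
        super(message);
        this.name = 'ValidationError';
    }
}

// Runs fn and reports what went wrong, based on the type of the exception.
function describeFailure(fn) {
    try {
        fn();
        return 'ok';
    } catch (ex) {
        if (ex instanceof ValidationError) {
            return 'invalid input: ' + ex.message;
        } else if (ex instanceof TypeError) {
            return 'wrong type: ' + ex.message;
        }
        // Anything we don't know how to handle is passed on to the caller.
        throw ex;
    }
}

var result = describeFailure(function () {
    throw new ValidationError('name is empty');
});
```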
Conclusion
As we’ve discovered, we get great freedom with JavaScript’s weak typing, but we also must take care to think before we act. Otherwise, we’ll end up in a massive mess of types where nothing makes sense.
By making sure our code follows the rule of consistent types, we save ourselves a lot of trouble. It’s much easier to reason about our code when we know the types. We don’t have to build a lot of type checks into our code just to guard against errors.
This might seem difficult if you haven’t used languages with strong typing, but it pays back greatly when you need to debug or maintain the code.
For further reading on the topic, I would recommend taking a look at TypeScript. It’s a language similar to JavaScript, but it adds stronger typing semantics to the language. It also has a compiler which will spit out errors when you try to do something silly, like mix and match types.