Java - - By Marcello La Rocca

How Optional Breaks the Monad Laws and Why It Matters

Java 8 brought us lambdas and streams, both long-awaited features. With them came Optional, meant to avoid NullPointerExceptions at the end of stream pipelines that might not return an element. In other languages, may-or-may-not-contain-a-value types like Optional are well-behaved monads, but in Java Optional isn’t. And this matters to everyday developers like us!

Introducing java.util.Optional

Optional<T> is described as “a container object which may or may not contain a non-null value”. That summarizes pretty well what you should expect from it.

It has some useful methods like:

  • of(x), which creates an Optional container wrapping a non-null value x (and throws a NullPointerException if x is null).
  • isPresent(), which returns true if and only if the container object contains a non-null value.

Plus some slightly less useful (or slightly dangerous, if you will) ones like get(), which returns the contained value if it’s present and throws a NoSuchElementException when called on an empty Optional.

There are other methods that behave differently depending on the presence of a value:

  • orElse(v) returns the contained value if there is one, or v by default if the container is empty.
  • ifPresent executes a block of code if and only if there is a value.

Curiously enough, the class description makes no mention of methods like map, flatMap, or even filter. All of them can be used to further process the value wrapped by the Optional (unless it is empty, in which case the functions aren’t called and the Optional stays empty). Their omission might have to do with the fact that the library creators did not intend Optional to be a monad.
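As a quick illustration of the methods mentioned so far, here is a minimal sketch. The class name OptionalBasics, the helper describe, and the sample values are made up for this example and do not come from the JDK or from the article’s repository.

```java
import java.util.NoSuchElementException;
import java.util.Optional;

class OptionalBasics {
    // Hypothetical helper: processes the wrapped value only if there is one.
    static String describe(Optional<String> opt) {
        return opt.filter(s -> !s.isEmpty())   // keep the value only if it is a non-empty string
                  .map(String::toUpperCase)    // transform the value, if still present
                  .orElse("nothing");          // fall back when the container is empty
    }

    public static void main(String[] args) {
        Optional<String> full = Optional.of("java");
        Optional<String> empty = Optional.empty();

        System.out.println(full.isPresent());   // true
        System.out.println(empty.isPresent());  // false
        System.out.println(describe(full));     // JAVA
        System.out.println(describe(empty));    // nothing

        full.ifPresent(v -> System.out.println("got " + v));  // got java

        try {
            empty.get();
        } catch (NoSuchElementException e) {
            System.out.println("get() on an empty Optional throws");
        }
    }
}
```

Note that on the empty Optional neither the filter predicate nor the mapping function is ever invoked.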

A Step Back: Monads

Yikes! I can picture some of you sneering when you read that name: monad.

For those who didn’t have the pleasure yet, I’ll try to summarize an introduction to this elusive concept. Be advised and take the following lines with a grain of salt! To go with Douglas Crockford’s definition, monads are “something that once developers really manage to understand, instantly lose the ability to explain to anybody else”.

We can define a monad as:

  • A parameterized type M<T>: in Java terms, public class M<T>.

  • A unit function, which is a factory function to make a monad out of an element: public <T> M<T> unit(T element).

  • A bind operation, a method that takes a monad as well as a function mapping an element to a monad, and returns the result of applying that function to the value wrapped in the monad:

    public static <T, U> M<U> bind(M<T> monad, Function<T, M<U>> f) {
        return f.apply(monad.wrappedValue());
    }
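To make the three ingredients concrete, here is a minimal sketch of about the simplest type that has them: a box that always holds exactly one value. Box is a hypothetical class invented for this illustration, and bind is written as an instance method rather than the static form shown above.

```java
import java.util.function.Function;

// Hypothetical minimal monad: a box that always holds exactly one value.
final class Box<T> {
    private final T value;

    private Box(T value) {
        this.value = value;
    }

    // unit: a factory that wraps a plain element.
    static <T> Box<T> unit(T element) {
        return new Box<>(element);
    }

    // bind: applies a monad-returning function to the wrapped value.
    <U> Box<U> bind(Function<T, Box<U>> f) {
        return f.apply(value);
    }

    T wrappedValue() {
        return value;
    }

    public static void main(String[] args) {
        Box<String> result = Box.unit(20)
                .bind(x -> Box.unit(x + 1))
                .bind(x -> Box.unit("value: " + x));
        System.out.println(result.wrappedValue());  // value: 21
    }
}
```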
    

Is that all there is to know about monads? Not really, but that is enough for now. Feel free to check the suggested readings at the end of the article if you’d like to read more on the subject.

Is Optional a Monad?

Yes and no. Almost. Definitely maybe.

Optional per se qualifies as a monad, despite some resistance in the Java 8 library team. Let’s see how it fits the three properties above:

  • M<T> is Optional<T>.
  • The unit function is Optional.ofNullable.
  • The bind operation is Optional.flatMap.

So it would seem that Optional is indeed a monad, right? Not so fast.

Monad Laws

Any class, to truly be a monad, is required to obey 3 laws:

  1. Left identity: applying the unit function to a value and then binding the resulting monad to a function f is the same as calling f on that value. Let f be a function returning a monad; then bind(unit(value), f) === f(value).
  2. Right identity: binding the unit function to a monad doesn’t change the monad. Let m be a monadic value (an instance of M<T>); then bind(m, unit) === m.
  3. Associativity: if we have a chain of monadic function applications, it doesn’t matter how they are nested: bind(bind(m, f), g) === bind(m, x -> bind(f(x), g)).

Left and right identity together guarantee that wrapping a value in a monad is neutral: neither the value nor the monad is altered. The last law guarantees that monadic composition is associative. Together, the laws make code more resilient, preventing counter-intuitive program behaviour that depends on when you create a monad and on how and in which order you compose the functions you bind to it.

Optional and Monad Laws

Now, as you can imagine, the question is: Does Optional<T> have these properties?

Let’s find out by checking property 1, Left Identity:

Function<Integer, Optional<Integer>> f = x -> {
    if (x == null) {
        x = -1;
    } else if (x == 2) {
        x = null;
    } else {
        x = x + 1;
    }
    return Optional.ofNullable(x);
};
// true, Optional[2] === Optional[2]
Optional.of(1).flatMap(f).equals(f.apply(1));
// true, Optional.empty === Optional.empty
Optional.of(2).flatMap(f).equals(f.apply(2));

This works both for empty and non-empty results. What about feeding both sides with null?

// false
Optional.ofNullable((Integer) null).flatMap(f).equals(f.apply(null));

This is somewhat unexpected. Let’s see what happens:

// prints "Optional.empty"
System.out.println(Optional.ofNullable((Integer) null).flatMap(f));     
// prints "Optional[-1]"
System.out.println(f.apply(null));

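The two sides differ because Optional.ofNullable(null) is an empty Optional, on which flatMap never calls f, whereas f itself turns null into -1. So, all in all, is Optional a monad or not? Strictly speaking it’s not a well-behaved monad, since it doesn’t abide by the monad laws. However, since it does satisfy the definition of a monad, it could be considered one, although one with some buggy methods.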
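For comparison, the other two laws can be spot-checked for flatMap in the same way. The sketch below is only a check on a handful of sample values, not a proof; the class FlatMapLaws and the functions half and show are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

class FlatMapLaws {
    // Right identity: binding the unit function should leave the Optional unchanged.
    static <T> boolean rightIdentity(Optional<T> m) {
        return m.flatMap(Optional::ofNullable).equals(m);
    }

    // Associativity: chaining two binds should equal binding the nested function.
    static <T, U, V> boolean associativity(Optional<T> m,
                                           Function<T, Optional<U>> f,
                                           Function<U, Optional<V>> g) {
        Optional<V> chained = m.flatMap(f).flatMap(g);
        Optional<V> nested = m.flatMap(x -> f.apply(x).flatMap(g));
        return chained.equals(nested);
    }

    // Runs the associativity check on a few sample inputs (hypothetical functions).
    static boolean allSamplesHold() {
        Function<Integer, Optional<Integer>> half =
                x -> x % 2 == 0 ? Optional.of(x / 2) : Optional.<Integer>empty();
        Function<Integer, Optional<String>> show = x -> Optional.of("n=" + x);

        List<Optional<Integer>> samples =
                Arrays.asList(Optional.of(4), Optional.of(3), Optional.<Integer>empty());
        for (Optional<Integer> m : samples) {
            if (!associativity(m, half, show)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(rightIdentity(Optional.of(4)));   // true
        System.out.println(rightIdentity(Optional.empty())); // true
        System.out.println(allSamplesHold());                // true
    }
}
```

On these samples both checks pass, which is consistent with left identity being the law that flatMap breaks.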

Optional::map and Associativity Law

If you think we ran out of luck with flatMap, wait until you see what happens with map.

When we use Optional.map, a null result is also turned into Optional.empty. Suppose we map the result of the first mapping with a second function. That second function won’t be called at all when the first one returns null. If, instead, we map the initial Optional with the composition of the two functions, the result can be quite different. This example should clarify:

Function<Integer, Integer> f = x -> (x % 2 == 0) ? null : x;
Function<Integer, String> g = y -> y == null ? "no value" : y.toString();

Optional<Integer> opt = Optional.of(2);  // A value that f maps to null - this breaks .map

opt.map(f).map(g);       // Optional.empty
opt.map(f.andThen(g));   // Optional[no value]

By composing the functions f and g (using the handy Function::andThen) we get a different result than we got when applying them one by one. An even more obvious example is when the first function returns null and the second throws a NullPointerException if the argument is null. Then, the repeated map works fine because the second method is never called but the composition throws the exception.
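The exception scenario just described can be sketched as follows. The class name and the two functions are hypothetical: f returns null for the empty string, and g is deliberately not null-safe.

```java
import java.util.Optional;
import java.util.function.Function;

class MapCompositionNpe {
    // Hypothetical functions: f may return null, g is not null-safe.
    static final Function<String, String> f = s -> s.isEmpty() ? null : s;
    static final Function<String, Integer> g = s -> s.length();  // throws NullPointerException on null

    // Maps the two functions one by one: g is skipped when f returns null.
    static Optional<Integer> stepByStep(Optional<String> opt) {
        return opt.map(f).map(g);
    }

    // Maps the composition: g receives whatever f returned, including null.
    static Optional<Integer> composed(Optional<String> opt) {
        return opt.map(f.andThen(g));
    }

    public static void main(String[] args) {
        Optional<String> opt = Optional.of("");

        System.out.println(stepByStep(opt));  // Optional.empty

        try {
            composed(opt);
        } catch (NullPointerException e) {
            System.out.println("the composition threw a NullPointerException");
        }
    }
}
```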

So, Optional::map breaks the associativity law. This is even worse than flatMap breaking the left identity law (we’ll get back to it in the next section).

orElse to the Rescue?

You might think it could get better if we use orElse. Except it doesn’t.

It is easy to create a chain with more than two functions, where getting null at different stages can lead to different results. Unfortunately we don’t have a way, at the end of the chain, to tell where null was first handled, and so no way to provide the right result when orElse is applied. More abstractly, and even worse, by using orElse we would be relying on the fact that every developer maintaining our code, or every client using our library, will stick to our choices and keep using orElse.
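A small sketch of why orElse doesn’t reconcile the two styles, reusing the f and g from the map example. The class name, the helper methods, and the "fallback" default are hypothetical choices for this illustration.

```java
import java.util.Optional;
import java.util.function.Function;

class OrElseDemo {
    // Same functions as in the map example: f maps even numbers to null, g handles null itself.
    static final Function<Integer, Integer> f = x -> (x % 2 == 0) ? null : x;
    static final Function<Integer, String> g = y -> y == null ? "no value" : y.toString();

    // One map per function, with a default at the end of the chain.
    static String stepByStep(int n) {
        return Optional.of(n).map(f).map(g).orElse("fallback");
    }

    // A single map over the composed function, with the same default.
    static String composed(int n) {
        return Optional.of(n).map(f.andThen(g)).orElse("fallback");
    }

    public static void main(String[] args) {
        System.out.println(stepByStep(2));  // fallback
        System.out.println(composed(2));    // no value
        System.out.println(stepByStep(3));  // 3
        System.out.println(composed(3));    // 3
    }
}
```

The default only kicks in for the step-by-step chain, so the caller still sees two different answers for the same input.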

What’s The Catch with Optional?

The problem is that, by design, non-empty Optionals can’t hold null. You might legitimately object that Optional was designed to get rid of null, after all, and in fact Optional.of(null) throws a NullPointerException. Of course null values are still common, so ofNullable was introduced to keep us from repeating the same if-null-then-empty-else-of check all over our code. However, and here is the root of all evil, Optional.ofNullable(null) is translated to Optional.empty.
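The difference between the two factory methods can be checked directly. This is a minimal sketch; the class and method names are made up for the example.

```java
import java.util.Optional;

class NullWrapping {
    // Returns true if Optional.of rejects null with a NullPointerException.
    static boolean ofRejectsNull() {
        try {
            Optional.of((String) null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Returns true if ofNullable silently turns null into the empty Optional.
    static boolean ofNullableTurnsNullIntoEmpty() {
        return Optional.ofNullable((String) null).equals(Optional.empty());
    }

    public static void main(String[] args) {
        System.out.println(ofRejectsNull());                 // true
        System.out.println(ofNullableTurnsNullIntoEmpty());  // true
    }
}
```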

The net result is that, as shown above, the following two situations can lead to different results:

  • Applying a function before wrapping a value into Optional;
  • Wrapping the value into an Optional first and then mapping it into the same function.

This is as bad as it sounds: it means that the order in which we wrap values and apply functions matters. When we use map, as we saw, it gets even worse, because we lose associativity as well, so even the way functions are composed matters.

In turn, these issues make adding bugs during refactoring not just possible, but even frighteningly easy.

A Real World Example

Now, this might look like an example built ad hoc to cause trouble. It’s not. Just replace f with Map::get (which returns null when the specified key is not contained in the map or if it is mapped to the value null) and g with any function that is supposed to handle and transform null.

Here is an example closer to real world applications. First, let’s define a few utility classes:

  • Account, modeling a bank account with an ID and a balance;
  • Balance, modeling an (amount, currency) pair;
  • Currency, an enum gathering a few constants for the most common currencies.

You can find the full code for this example on GitHub. To be clear, this is not, by any means, intended as a proper design for these classes: we are greatly simplifying, just to make the example cleaner and easier to present.

Now let’s say that a bank is represented as a collection of accounts, stored in a Map, linking account IDs to instances. Let’s also define a few utility functions to retrieve an account’s balance in USD starting from the account’s ID.

Map<Long, Account> bank = new HashMap<>();

Function<Long, Account> findAccount = id -> bank.get(id);
Function<Account, Balance> extractBalance = account -> account != null
        ? account.getBalance()
        : new Balance(0., Currency.DOLLAR);
Function<Balance, Double> toDollars = balance -> {
    if (balance == null) {
        return 0.;
    }
    switch (balance.getCurrency()) {
        case DOLLAR: return balance.getAmount();
        case POUND: return balance.getAmount() * 1.3;
        case EURO: return balance.getAmount() * 1.1;
        default: return 0.;
    }
};

We are now ready to see where individual mapping of our three functions works differently from their composition. Let’s consider a few different cases, where we start from an account’s id wrapped in an Optional, and we map it to the dollar amount for that account.

Optional<Long> accountId3 = Optional.of(3L);
Optional<Long> accountId4 = Optional.of(4L);
Optional<Long> accountId5 = Optional.of(5L);

bank.put(4L, null);
bank.put(5L, new Account(5L, null));

Both for an empty Optional and for a non-empty one wrapping the ID of a non-null account with proper balance, it doesn’t matter whether we map our functions one by one or whether we use their composition, the output is the same. We omit these cases here for brevity, but feel free to check out the repo.

Let’s instead try out the case where an account’s ID is not present in the map:

accountId3.map(findAccount)
        .map(extractBalance)
        .map(toDollars)
        .ifPresent(System.out::println);  // prints nothing: the result is Optional.empty
accountId3.map(findAccount
        .andThen(extractBalance)
        .andThen(toDollars))
        .ifPresent(System.out::println);  // prints 0.0

In this case, findAccount returns null, which is mapped to Optional.empty. This means that when we map our functions individually, extractBalance will never even be called, so the final result will be Optional.empty.

If, on the other hand, we compose findAccount and extractBalance, the latter is called with null as its argument. Since the function “knows” how to handle null values, it produces a non-null output that will be correctly processed by toDollars down the chain.

So here we have two different results depending only on the way in which the same functions, taken in the same order, are applied to our input. Wow!

The same thing happens if we store null in the map for an ID or if the account’s balance is null, since toDollars is similarly crafted to handle null. Check out the repo for further details.
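For readers who want to run all three cases without the repository, here is a self-contained sketch. The Account, Balance and Currency types below are hypothetical stand-ins written for this example (nested in one class for brevity) and may differ from the ones on GitHub; account 6 is an extra entry added to show a case where both styles agree.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

class BankExample {
    // Hypothetical stand-ins for the article's utility classes.
    enum Currency { DOLLAR, POUND, EURO }

    static final class Balance {
        private final double amount;
        private final Currency currency;
        Balance(double amount, Currency currency) { this.amount = amount; this.currency = currency; }
        double getAmount() { return amount; }
        Currency getCurrency() { return currency; }
    }

    static final class Account {
        private final long id;
        private final Balance balance;  // may be null, as for account 5 below
        Account(long id, Balance balance) { this.id = id; this.balance = balance; }
        long getId() { return id; }
        Balance getBalance() { return balance; }
    }

    static final Map<Long, Account> bank = new HashMap<>();
    static {
        // ID 3 is deliberately absent from the map.
        bank.put(4L, null);
        bank.put(5L, new Account(5L, null));
        bank.put(6L, new Account(6L, new Balance(10., Currency.DOLLAR)));
    }

    static final Function<Long, Account> findAccount = id -> bank.get(id);
    static final Function<Account, Balance> extractBalance = account -> account != null
            ? account.getBalance()
            : new Balance(0., Currency.DOLLAR);
    static final Function<Balance, Double> toDollars = balance -> {
        if (balance == null) {
            return 0.;
        }
        switch (balance.getCurrency()) {
            case DOLLAR: return balance.getAmount();
            case POUND: return balance.getAmount() * 1.3;
            case EURO: return balance.getAmount() * 1.1;
            default: return 0.;
        }
    };

    // Maps the three functions one by one.
    static Optional<Double> stepByStep(long id) {
        return Optional.of(id).map(findAccount).map(extractBalance).map(toDollars);
    }

    // Maps the composition of the three functions in a single step.
    static Optional<Double> composed(long id) {
        return Optional.of(id).map(findAccount.andThen(extractBalance).andThen(toDollars));
    }

    public static void main(String[] args) {
        for (long id : new long[] {3L, 4L, 5L, 6L}) {
            System.out.println(id + ": " + stepByStep(id) + " vs " + composed(id));
        }
        // 3: Optional.empty vs Optional[0.0]
        // 4: Optional.empty vs Optional[0.0]
        // 5: Optional.empty vs Optional[0.0]
        // 6: Optional[10.0] vs Optional[10.0]
    }
}
```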

Practical Implications

Besides theoretical disputes on the nature of Optional, there are plenty of practical consequences of the fact that Optional::map and Optional::flatMap break the monad laws. It prevents us from freely applying function composition, because applying two functions one after the other is supposed to give the same result as applying their composition directly.

It means that we can no longer refactor our code freely and be sure the result won’t change: dire consequences might pop up not just in your code base but, even worse, in your clients’ code. Before restructuring your code, you would need to know whether the functions used anywhere in everybody’s code handle null or not, otherwise you might introduce bugs.

Possible Fixes

We have two main alternatives to try to make this right:

  • Don’t use Optional.
  • Live with it.

Let’s look at each of them in detail.

Don’t Use Optional

Well, this looks like the ostrich algorithm mentioned by Tanenbaum to solve deadlock. Or ignoring JavaScript because of its flaws.

Some Java developers explicitly argue against Optional, in favour of sticking with null. You can certainly do so, if you don’t care about moving to a cleaner, less error-prone style of programming. TL;DR: Optional lets you handle workflows where some input might or might not be present in a monadic style, which means in a more modular and cleaner way.

Live with It

Optional breaks the monad laws under a couple of circumstances and there’s nothing to be done about it. But we can learn a few things from the path that led us to this conclusion:

  • When starting a chain with a possibly null value and a function that returns an Optional, be aware that applying the function to the value can lead to a different result from first creating the Optional and then flat-mapping the function. This, in turn, can lead to the function not being called, which is a problem if you depend on its side effects.
  • When composing map chains, be aware that, while the individual functions were never called with null, the merged function might produce null as an intermediate result and pass it into non-null-safe parts of the function, leading to exceptions.
  • When decomposing a single map into several, be aware that while the original function was executed as a whole, parts of it might now end up in functions that are never called if a previous part produced null. This is a problem if you depend on those parts’ side effects.
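The side-effect concern in the last bullet can be sketched as follows. The lookup and audit functions and the in-memory log are hypothetical; they stand in for any function whose side effects you rely on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

class SideEffectDemo {
    // Runs the same two functions either composed or split, and returns what they logged.
    static List<String> run(boolean composed) {
        List<String> log = new ArrayList<>();

        // Hypothetical functions: lookup always finds nothing, audit records what it sees.
        Function<String, String> lookup = key -> {
            log.add("lookup " + key);
            return null;
        };
        Function<String, String> audit = value -> {
            log.add("audit " + value);
            return value;
        };

        Optional<String> start = Optional.of("k");
        if (composed) {
            start.map(lookup.andThen(audit));  // audit runs, even on null
        } else {
            start.map(lookup).map(audit);      // audit is skipped after the null
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(true));   // [lookup k, audit null]
        System.out.println(run(false));  // [lookup k]
    }
}
```

After splitting the single map in two, the audit step silently stops running whenever the lookup returns null.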

As a rule of thumb, prefer flatMap over map, as the former abides by the associativity law, while map doesn’t, and breaking associativity is far more error prone than breaking left identity. All in all it is best not to view Optional as a monad that promises easy composability but as a means to avoid having null pop up as the (flat-)mapped functions’ arguments.

I have worked out a few possible approaches to make code more robust; feel free to check them out on GitHub.

Conclusions and Further Reading

We’ve seen that certain refactorings across flatMap and map chains can cause the code to change its behavior. The root cause is that Optional was designed to avoid stumbling into NPEs and in order to achieve that it transforms null to empty. This, in turn, means that such chains are not fully executed but end once an intermediary operation produces empty.

It is important to keep this in mind when relying on the side effects that the potentially-not-executed functions cause.

If you’d like to read more about some of the topics in this article:
