Thinking Outside the DOM: Composed Validators and Data Collection

In part 1 of this mini-series, we discussed a problem common to many JavaScript code bases: tightly coupled code. Then, I introduced you to the benefits of separating orthogonal concerns. As a proof of concept, we started the development of a form validation system that is not restricted to forms, and can even work outside the DOM altogether.

In this second and last part, we’ll discuss composed validators, how to collect data from a form, and how to report errors. Finally, I’ll provide a link to the GitHub repository containing all the code developed in this mini-series.

Composed Validators

In the previous article we developed a system for validating individual fields. Validating fields one-by-one with one rule at a time is fine and dandy, but there are many cases that require some more thought. You can validate an email address with one insanely long regular expression, but doing so will only enable you to tell your users whether the email is acceptable or not. A better approach is to validate several parts of the email address separately and provide a targeted email validation error.

This is possible with the current design:

var rules = [
  pattern('email', /@/, 'Your email is missing an @'),
  pattern('email', /^\S+@/, 'Please enter the username in your email address'),
  // ...
];

While this will work, it may produce multiple error messages for the email address. It also requires us to manually repeat each step for each field that has email semantics. Even though we haven’t discussed error message rendering yet, it would be nice to have an abstraction that groups multiple validators and only reports the first violated rule. As it turns out, these are the short-circuit semantics of the && operator. Enter the and validator. It takes multiple validators as its arguments and applies them in order until it finds a failing one, whose error it returns:

function and() {
  var rules = arguments;

  return function (data) {
    var result, l = rules.length;

    for (var i = 0; i < l; ++i) {
      result = rules[i](data);
      if (result) {
        return result;
      }
    }
  };
}

Now we can express our email validator so that only one error message will bubble up at a time:

var rules = [and(
  pattern('email', /@/, 'Your email is missing an @'),
  pattern('email', /^\S+@/, 'Please enter the username in your email address'),
  // ...
)];

This can then be codified as a separate validator:

function email(id, messages) {
  return and(
    pattern(id, /@/, messages.missingAt),
    pattern(id, /^\S+@/, messages.missingUser)
    // ...
  );
}

While we’re on the topic of email addresses, one error people keep making where I live is to type Hotmail and Gmail addresses with our national top level domain (e.g. “…@hotmail.no”). It would be very helpful to be able to alert the user when this happens. To phrase this differently: sometimes we want to perform certain checks only when certain criteria are met. To solve this, we will introduce the when function:

function when(pred, rule) {
  return function (data) {
    if (pred(data)) {
      return rule(data);
    }
  };
}

As you can see, when produces a validator, just like required does. You call it with a predicate (a function that will receive the data to be validated) and a validator. If the predicate function returns a truthy value, we evaluate the validator. Otherwise, when is deemed successful.
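To illustrate the conditional behaviour, here is a minimal sketch that uses a hand-written predicate. The required function below is a simplified stand-in for the validator from part 1, and both the {id, msg} result shape and the "business account" rule are assumptions made for this example:

```javascript
// Simplified stand-in for the required validator from part 1.
// Assumption: a failing validator returns {id, msg}; a passing one returns undefined.
function required(id, msg) {
  return function (data) {
    if (typeof data[id] !== 'string' || data[id].trim() === '') {
      return {id: id, msg: msg};
    }
  };
}

function when(pred, rule) {
  return function (data) {
    if (pred(data)) {
      return rule(data);
    }
  };
}

// Hypothetical rule: a company name is only required for business accounts.
var companyRule = when(
  function (data) { return data.accountType === 'business'; },
  required('company', 'Please enter your company name')
);

var businessResult = companyRule({accountType: 'business', company: ''});
var personalResult = companyRule({accountType: 'personal', company: ''});
// businessResult is the error from required(); personalResult is undefined,
// because the predicate failed and the rule was never evaluated.
```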

The predicate we need to solve our Hotmail conundrum is one that checks that the value matches a pattern:

function matches(id, re) {
  return function (data) {
    return re.test(data[id]);
  };
}

This is pretty close to our pattern validator, except that it isn’t a validator: it returns a plain boolean instead of an error. It’s also worth noting how small most of these functions are, and how they really shine when composed together rather than used on their own. With this final piece of the puzzle, we can create an email validator that will be really useful to the end user:

function email(id, messages) {
  return and(
    pattern(id, /@/, messages.missingAt),
    pattern(id, /^\S+@/, messages.missingUser),
    pattern(id, /@\S+$/, messages.missingDomain),
    pattern(id, /@\S+\.\S+$/, messages.missingTLD),
    when(matches(id, /@hotmail\.[^\.]+$/),
      pattern(id, /@hotmail\.com$/, messages.almostHotmail)
    ),
    when(matches(id, /@gmail\.[^\.]+$/),
      pattern(id, /@gmail\.com$/, messages.almostGmail)
    )
  );
}

It can be used like so:

email('email', {
  missingAt: 'Missing @',
  missingUser: 'You need something in front of the @',
  missingDomain: 'You need something after the @',
  missingTLD: 'Did you forget .com or something similar?',
  almostHotmail: 'Did you mean hotmail<strong>.com</strong>?',
  almostGmail: 'Did you mean gmail<strong>.com</strong>?'
});
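To make the behaviour concrete, the following sketch runs the validator against a few sample addresses. It repeats the helpers from this article so that it can run on its own. The pattern function is a simplified stand-in for the validator from part 1, and the {id, msg} result shape and the describeAddress helper are assumptions made for illustration:

```javascript
// Simplified stand-in for the pattern validator from part 1.
// Assumption: a failing validator returns {id, msg}; a passing one returns undefined.
function pattern(id, re, msg) {
  return function (data) {
    if (!re.test(data[id])) {
      return {id: id, msg: msg};
    }
  };
}

function and() {
  var rules = arguments;
  return function (data) {
    for (var i = 0, l = rules.length; i < l; ++i) {
      var result = rules[i](data);
      if (result) { return result; }
    }
  };
}

function when(pred, rule) {
  return function (data) {
    if (pred(data)) { return rule(data); }
  };
}

function matches(id, re) {
  return function (data) { return re.test(data[id]); };
}

function email(id, messages) {
  return and(
    pattern(id, /@/, messages.missingAt),
    pattern(id, /^\S+@/, messages.missingUser),
    pattern(id, /@\S+$/, messages.missingDomain),
    pattern(id, /@\S+\.\S+$/, messages.missingTLD),
    when(matches(id, /@hotmail\.[^\.]+$/),
      pattern(id, /@hotmail\.com$/, messages.almostHotmail)
    ),
    when(matches(id, /@gmail\.[^\.]+$/),
      pattern(id, /@gmail\.com$/, messages.almostGmail)
    )
  );
}

var validateEmail = email('email', {
  missingAt: 'Missing @',
  missingUser: 'You need something in front of the @',
  missingDomain: 'You need something after the @',
  missingTLD: 'Did you forget .com or something similar?',
  almostHotmail: 'Did you mean hotmail<strong>.com</strong>?',
  almostGmail: 'Did you mean gmail<strong>.com</strong>?'
});

// Hypothetical helper: returns the first error message, or 'ok'.
function describeAddress(address) {
  var error = validateEmail({email: address});
  return error ? error.msg : 'ok';
}

describeAddress('christian');              // 'Missing @'
describeAddress('@cjohansen.no');          // 'You need something in front of the @'
describeAddress('christian@');             // 'You need something after the @'
describeAddress('christian@cjohansen');    // 'Did you forget .com or something similar?'
describeAddress('someone@hotmail.no');     // 'Did you mean hotmail<strong>.com</strong>?'
describeAddress('christian@cjohansen.no'); // 'ok'
```

Each address triggers at most one message, because and stops at the first failing rule.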

In case you want to play with this function, I’ve created a CodePen just for you.

Extracting data

Now that we can validate data, we will also need to get data from a form in order to solve our initial problem of form validation. Basically, we need to turn this:

<form action="/doit" novalidate>
  <label for="email">
    Email
    <input type="email" name="email" id="email" value="christian@cjohansen.no">
  </label>
  <label for="password">
    Password
    <input type="password" name="password" id="password">
  </label>
  <label class="faded hide-lt-pad">
    <input type="checkbox" name="remember" value="1" checked>
    Remember me
  </label>
  <button type="submit">Login</button>
</form>

Into this:

{
  email: 'christian@cjohansen.no',
  password: '',
  remember: '1'
}

Implementing this in steps with tests is fairly straightforward, but it will require DOM elements. The following is an example of what these tests look like:

describe('extractData', function () {
  it('fetches data out of a form', function () {
    var form = document.createElement('form');
    var input = document.createElement('input');
    input.type = 'text';
    input.name = 'phoneNumber';
    input.value = '+47 998 87 766';
    form.appendChild(input);

    assert.deepEqual(extractData(form), {'phoneNumber': '+47 998 87 766'});
  });
});

This isn’t all that bad, and with another small abstraction we can tighten it up a little:

it('fetches data out of a form', function () {
  var form = document.createElement('form');
  addElement(
    form,
    'input',
    {type: 'text', name: 'phoneNumber', value: '+47 998 87 766'}
  );

  assert.deepEqual(extractData(form), {'phoneNumber': '+47 998 87 766'});
});
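The addElement() helper is not listed in this article. A minimal sketch of what it might look like, assuming it creates the element, copies the given properties onto it, and appends it to the parent:

```javascript
// Hypothetical test helper (not taken from the article's repository):
// creates a <tagName> element, assigns each entry of `props` to it as a
// DOM property, appends it to `parent`, and returns the new element.
function addElement(parent, tagName, props) {
  var el = parent.ownerDocument.createElement(tagName);

  Object.keys(props || {}).forEach(function (key) {
    el[key] = props[key];
  });

  parent.appendChild(el);
  return el;
}
```

Using parent.ownerDocument instead of the global document keeps the helper usable with any document the form belongs to.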

Extracting the data is a matter of selecting all the input, select, and textarea elements in a form, and extracting their name property and their current value. Some special handling is necessary to extract the correct value from check boxes and radio buttons. The main function looks like this:

function extractData(form) {
  return getInputs(form).reduce(function (data, el) {
    var val = getValue[el.tagName.toLowerCase()](el);
    if (val) { data[el.name] = val.trim(); }
    return data;
  }, {});
};

As you can see from this snippet, the extractData() function relies on a getInputs() function. The aim of this support function is to obtain an array of the DOM elements in the form passed as an argument. I’m not going to cover it in this article, because it relies on other small functions and I want to avoid the Inception effect. However, if you want to dig deeper, you can take a look at the GitHub repository I created, which contains all the files from the previous installment and this one.
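For a rough idea of what the two helpers could look like, here is one possible sketch. It is not the code from the repository: the getInputs() and getValue implementations below are assumptions, written only to satisfy the way extractData() calls them.

```javascript
// Hypothetical helper: all named form controls, as a real array.
function getInputs(form) {
  var nodes = form.querySelectorAll('input, select, textarea');
  return Array.prototype.slice.call(nodes).filter(function (el) {
    return el.name;
  });
}

// Hypothetical value readers, keyed by lower-cased tag name.
var getValue = {
  input: function (el) {
    var isToggle = el.type === 'checkbox' || el.type === 'radio';
    // Unchecked check boxes and radio buttons contribute no value.
    return isToggle && !el.checked ? undefined : el.value;
  },
  select: function (el) { return el.value; },
  textarea: function (el) { return el.value; }
};

// extractData() as shown above, repeated so the sketch runs on its own.
function extractData(form) {
  return getInputs(form).reduce(function (data, el) {
    var val = getValue[el.tagName.toLowerCase()](el);
    if (val) { data[el.name] = val.trim(); }
    return data;
  }, {});
}
```

Note that with the if (val) guard in extractData(), empty fields and unchecked boxes are left out of the resulting object altogether.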

Let’s now have a look at how we can report the errors.

Error Reporting

To report errors, we can design a function that accepts a form and an array of errors. However, there is one challenge to solve: in order to avoid duplicate errors in the DOM, the function either needs to keep state, so that it knows which errors it has already rendered, or it needs to assume that every error in the form can be wiped when a new set is rendered. Which solution is suitable will depend on your specific use case.

I will not dive into the details of the rendering implementation, but suggest the following simplified solution:

function renderErrors(form, errors) {
  removeErrors(form);
  errors.forEach(function (error) {
    renderError(form, error);
  });
}

To render an error, we find the input it relates to, and insert an element right before it. We only render the first message for each field. This is a very basic rendering strategy, but it works well:

function renderError(form, error) {
  // Quote the attribute value so names with special characters still match.
  var input = form.querySelector("[name='" + error.id + "']");
  var el = document.createElement("div");
  el.className = "error js-validation-error";
  el.innerHTML = error.messages[0];
  input.parentNode.insertBefore(el, input);
}

In the code above, you can see that I’m assigning two classes to the element: error and js-validation-error. The former is intended for styling purposes only. The latter is intended as an internal mechanism, used by the following removeErrors() function:

function removeErrors(form) {
  var errors = form.querySelectorAll(".js-validation-error");

  for (var i = 0, l = errors.length; i < l; ++i) {
    errors[i].parentNode.removeChild(errors[i]);
  }
}

A basic demonstration of the error reporting system we’ve built in this section is shown by this CodePen.

Wiring It All Together

We now have (one version of) all the pieces: reading from the DOM, validating pure data, and rendering validation results back into the DOM. All we need now is a high-level interface to bind them all together:

validateForm(myForm, [
  required("login", "Please choose a login"),
  email("email", i18n.validation.emailFormat),
  confirmation("password", "password-confirmation", "Passwords don't match")
], {
  success: function (e) {
    alert("Congratulations, it's all correct!");
  }
});

As with the rendering, this high-level wiring can be either stupid simple or rather sophisticated. In the project where much of this code originated, the validateForm() function would not perform validation until the user tried to submit the form the first time. If there were validation errors, it would enter a sort of “smart live validation mode”: errors that were fixed would be removed as quickly as possible (e.g. on keyup), but new ones would only be added on blur. This model struck a good balance between instant feedback and nagging (no one likes to hear that “your email is incorrect” before they have even finished typing).
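The validateForm() function itself is not listed in this article. As a rough sketch of the "stupid simple" variant, which validates on submit only, it might look like the following. The enforceRules() implementation, the {id, msg} validator result, and the createValidateForm() factory are assumptions made for illustration. The factory receives the DOM-facing functions as arguments so that the wiring can be exercised without a browser:

```javascript
// Assumed rules engine: runs every rule and groups the messages by field id,
// producing the {id, messages} shape that renderError() reads.
function enforceRules(rules, data) {
  var byId = {};

  rules.forEach(function (rule) {
    var error = rule(data);
    if (error) {
      byId[error.id] = (byId[error.id] || []).concat(error.msg);
    }
  });

  return Object.keys(byId).map(function (id) {
    return {id: id, messages: byId[id]};
  });
}

// Hypothetical factory: `io.extractData` and `io.renderErrors` are the
// DOM-facing functions from the previous sections.
function createValidateForm(io) {
  return function validateForm(form, rules, options) {
    form.addEventListener('submit', function (e) {
      var errors = enforceRules(rules, io.extractData(form));
      io.renderErrors(form, errors);

      if (errors.length > 0) {
        e.preventDefault(); // keep the invalid form from being submitted
      } else if (options && options.success) {
        options.success(e);
      }
    });
  };
}
```

In a browser, it could be wired up with createValidateForm({extractData: extractData, renderErrors: renderErrors}), after which the returned function is called exactly like the validateForm() example above.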

Now that I’ve completed the description of this last piece, I invite you to take a look at the demo included in the GitHub repository. It includes all the code we’ve discussed, fully fleshed out, along with full test cases.

Conclusion

The strength of this model lies in how the external input/output mechanisms are thoroughly decoupled from the “rules” implementation, which really is the heart of the library. This model could easily be used for other kinds of data validation. The rules engine could also possibly be extended to include information about successfully correcting errors as well (e.g. by returning something like {id: 'name', ok: true}, or with more details) to allow for green checkmarks next to successfully completed elements. Maybe it would also make sense to allow the rules engine to deal with asynchronous operations.

The last two components, the renderer and the validateForm() function, contain the functionality that usually sets various validation libraries apart. It would be trivial to put some more work into making them more flexible, or even to provide alternative implementations for use in different parts of the application, or across applications. This means that the engine that contains all the validation logic can remain very stable, and the less code that needs frequent changes, the lower the chance of introducing new bugs.