The Dark Shadow of The DOM

UPDATE 2015.03.17: The accessibility concerns I expressed in this article are incorrect, and were based on a misunderstanding. In fact, there are no such accessibility issues with Shadow DOM and screenreaders.

Shadow DOM is part of the Web Components specification, and is designed to address the encapsulation problems that plague some kinds of web development.

You know the kind of thing — if you build a custom widget, how do you avoid naming conflicts with other content on the same page? Most significantly, how do you prevent the page’s CSS from affecting your widget?

It’s easy if you control the whole page, but that’s often not the case, and certainly not if you’re making widgets for other people to use. The problem there is that you have no idea what other CSS will be present. You can certainly reduce the likelihood of such problems by defining all your selectors as descendants of something with high specificity:

#mywidget > .mywidget-container
{
}
#mywidget > .mywidget-container > .mywidget-inner
{
}

But that’s only effective until the site defines a rule with higher specificity, such as one with two ID selectors. Or maybe you could use two IDs yourself, but then a rule with three comes along!
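To make that arms race concrete, here is a rough sketch of how specificity decides the winner. It is a deliberately simplified calculator (it only understands IDs, classes, attribute selectors, pseudo-classes, pseudo-elements and type selectors, and ignores things like :not()), and the site rule with #page and #sidebar is a made-up example rather than anything from a real page:

```javascript
// Simplified specificity calculator. Returns [ids, classes, types].
// Assumption: selectors only use IDs, classes, attribute selectors,
// pseudo-classes, pseudo-elements and type selectors.
function specificity(selector) {
  // Count and strip pseudo-elements first, so "::after" is not
  // mistaken for a pseudo-class.
  const pseudoElements = (selector.match(/::[\w-]+/g) || []).length;
  const rest = selector.replace(/::[\w-]+/g, ' ');

  const ids = (rest.match(/#[\w-]+/g) || []).length;
  const classes = (rest.match(/\.[\w-]+|\[[^\]]*\]|:[\w-]+/g) || []).length;

  // Whatever names remain after removing the above are type selectors.
  const stripped = rest.replace(/#[\w-]+|\.[\w-]+|\[[^\]]*\]|:[\w-]+/g, ' ');
  const types = (stripped.match(/[a-zA-Z][\w-]*/g) || []).length;

  return [ids, classes, types + pseudoElements];
}

// Negative if `a` loses to `b`, positive if it wins, 0 if they tie.
function compareSpecificity(a, b) {
  const sa = specificity(a);
  const sb = specificity(b);
  for (let i = 0; i < 3; i++) {
    if (sa[i] !== sb[i]) return sa[i] - sb[i];
  }
  return 0;
}

const widgetRule = '#mywidget > .mywidget-container';
const siteRule = '#page #sidebar div'; // hypothetical rule on the host site

console.log(specificity(widgetRule)); // [ 1, 1, 0 ]
console.log(specificity(siteRule));   // [ 2, 0, 1 ]
console.log(compareSpecificity(widgetRule, siteRule) < 0); // true
```

The ID count is compared first, so the site rule wins even though the widget rule looks more targeted.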

Recently, I’ve been toying with the idea of defining dynamic selectors — the widget script traverses up the DOM and makes a note of every element ID between itself and the document root, then compiles selectors that include all those IDs.
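A minimal sketch of that idea might look like the following. The function names are my own invention for illustration, and it assumes nothing more than nodes with id and parentNode properties, so the usage example runs on plain mock objects instead of a real page. Real IDs containing unusual characters would also need escaping, which this ignores:

```javascript
// Collect the IDs of every ancestor between `element` and the root,
// outermost first. Ancestors without an ID are skipped.
function ancestorIds(element) {
  const ids = [];
  for (let node = element.parentNode; node; node = node.parentNode) {
    if (node.id) ids.unshift(node.id);
  }
  return ids;
}

// Chain every ancestor ID in front of the widget's own ID, then append
// an optional suffix such as "> .mywidget-container".
function buildSelector(element, suffix) {
  const chain = ancestorIds(element)
    .concat(element.id)
    .map(function (id) { return '#' + id; });
  return chain.join(' ') + (suffix ? ' ' + suffix : '');
}

// Mock ancestry standing in for a real document (hypothetical IDs).
const root = { id: '', parentNode: null };
const page = { id: 'page', parentNode: root };
const wrapper = { id: '', parentNode: page };
const sidebar = { id: 'sidebar', parentNode: wrapper };
const widget = { id: 'mywidget', parentNode: sidebar };

console.log(buildSelector(widget, '> .mywidget-container'));
// "#page #sidebar #mywidget > .mywidget-container"
```

In a browser you would pass the widget's real root element, and the loop would simply walk up through parentNode until it runs out of ancestors.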

But even that’s not guaranteed. There’s really nothing we can do to entirely prevent this problem, except to use an <iframe>, but that’s not a good solution — iframes limit the size and shape of the widget, they make an additional server request, and they create a keyboard trap in some browsers (e.g. Opera 12, in which you can’t Tab out of an iframe once you’ve Tabbed into it). So for all those reasons, iframes are best avoided.

Into The Shadow

The Shadow DOM aims to solve this issue. I won’t go into the details of how it works and how to use it (there are other articles that do that), but for the purposes of this article I’ll summarize it like this: the Shadow DOM encapsulates content by creating document fragments. Effectively, the content of a Shadow DOM is a different document, which is merged with the main document to create the overall rendered output.

In fact some browsers already use this to render some of their native widgets. If you open the Developer Tools in Chrome, select Show Shadow DOM from the settings panel (the cog icon bottom-right) and then inspect a "range" input, you’ll see something like this:

<input type="range">
  #document-fragment
    <div>
      <div pseudo="-webkit-slider-runnable-track">
        <div></div>
      </div>
    </div>
</input>

But you can’t get to those elements through the DOM, because they’re hidden from it:

var input = document.querySelector('input[type="range"]');
alert(input.firstChild);    // alerts null

The shadow content is roughly analogous to an iframe document on a different domain — the DOM can see the iframe, but can’t see anything inside it.
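You can see the same separation with a shadow tree you create yourself. This sketch assumes the attachShadow() API found in current browsers, which isn't something this article covers, and describeHost is just a helper name I made up. The demo part only runs where a document exists:

```javascript
// Summarise what ordinary DOM traversal can see on a host element.
function describeHost(host) {
  return {
    lightChildren: host.childNodes.length,
    // `shadowRoot` is null for closed roots and for the browser's own
    // internal ones (such as the range input's).
    shadowChildren: host.shadowRoot ? host.shadowRoot.childNodes.length : null
  };
}

// Browser-only demo; skipped when there is no `document` (e.g. in Node).
if (typeof document !== 'undefined') {
  var host = document.createElement('div');
  var root = host.attachShadow({ mode: 'open' });
  root.appendChild(document.createElement('span'));

  console.log(host.firstChild); // null: the span is not a child of the host
  console.log(describeHost(host));
}
```

The host reports no children of its own, and the span can only be reached by going through the shadow root explicitly.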

So because it’s isolated, users can’t accidentally break it, there’s no possibility of naming conflicts with any classes or IDs you use, and the CSS on the main page won’t affect it at all.

Sounds brilliant, doesn’t it?

Into The Darkness

But hang on … if all that content is not in the DOM, then doesn’t that mean it’s not exposed to accessibility APIs either?

Yes, that’s exactly what it means.

Anything you put in a Shadow DOM is inaccessible to browser-based access technologies, such as screenreaders. It’s not available to search engines either, but that’s always the case with scripted content. However, screenreaders are different: they are script-capable, and so they do have access to scripted content.

But not this content!

Of course the specification is not ignorant of this division. In essence, it assumes a distinction between elements that contain text-content or informational attributes, and those that are simply empty boxes to create visual parts, like the "range" input’s thumb. Let’s refer to these as content elements and utility elements.

So how often do widgets have such a clear distinction between the two? For the "range" input example it’s obvious, but are all sliders built that way? I wrote a slider widget recently, for an accessible video player, and its markup looked like this:

<label for="slider-thumb">
  <button type="button" id="slider-thumb" 
    role="slider" aria-orientation="horizontal"
    aria-valuemin="0" aria-valuemax="120" 
    aria-valuenow="75" aria-valuetext="Time: 01:15">
    <span></span>
  </button>
</label>

The only part of that slider that could be put inside a Shadow DOM is the <span> inside the <button>. The <button> itself is important content, with ARIA attributes that provide dynamic information to screenreaders and other access technologies.

To make that work with Shadow DOM we’d have to move all the ARIA attributes to the outer <label>, give it tabindex, and then use Shadow DOM for the inner elements. But that would be less accessible because we’d lose native semantics (e.g. the label’s for attribute no longer makes a valid association), and it would be less useful because it means the widget can’t submit any form data (so we’d need a separate form control, such as a hidden input).

But even if that were fine — and even if every widget we make has a clear and easy distinction between content and utility elements — the content part of the widget is still not encapsulated; it’s still vulnerable to naming-conflicts and unwanted CSS inheritance.

And we all know that some people won’t understand or respect that distinction anyway. People will use Shadow DOM for content, and use it to produce a whole new generation of inaccessible web applications.

I read a number of other articles about Shadow DOM in researching this one, and they all do the same thing — they all stop to make the point that you shouldn’t put content in a Shadow DOM, and then immediately afterwards they say, but let’s not worry about that.

Brilliant! A whole group of users dismissed in one idle caveat!

But let’s be kinder, hey. Let’s say that article examples can’t be judged in those terms. Let’s assume that everybody who uses Shadow DOM will do so with appropriate consideration, making sure they only use it for utility elements, not for content.

With that requirement, Shadow DOM only provides half a solution; and half a solution is no solution at all.

Into The Light

It seems to me that the entire concept of Shadow DOM is wrong. It’s an over-engineered approach that doesn’t really solve the problem, and any approach that uses document fragments will have the same flaw — as long as it’s necessary to differentiate between accessible and non-accessible elements.

What we really need is the conceptual opposite — a way of defining style-encapsulated subtrees which are still part of the document.

In other words, rather than having multiple documents that only the browser can traverse, we have a single document that only the browser treats as multiple documents.

This could be expressed with a simple element attribute:

<div encapsulated="encapsulated">

The HTML DOM would interpret that no differently — it’s just an element with a non-rendered attribute, same as any other. However the CSS DOM would interpret it as a kind of document fragment, effectively saying that the element and everything inside it does not inherit from higher scopes.

And we can already do the opposite — scoping styles to a subtree — either by using descendant selectors, or if you really must, using <style scoped> (although personally I’d avoid that until it’s available as a <link> attribute, because <style> elements undermine the separation of content and presentation).

To go with that encapsulated attribute, we could still use a better way to manage and template utility elements, but HTML is the wrong place to do that. Really, we shouldn’t have to define empty elements at all — they’re a functional necessity only because we have no other way of defining presentational subtrees — so that capability should be added to CSS.

In other words, it should be possible for a single element to define any number of pseudo-elements, and for pseudo-elements themselves to also define pseudo-elements. Something like this:

#mywidget::after
{
}
#mywidget::after + ::element
{
}
#mywidget::after > ::element
{
}
#mywidget::after > ::element + ::element
{
}

Which would create a virtual subtree like this:

<div id="mywidget" encapsulated="encapsulated">
  Text content
  <after>
    <element></element>
    <element></element>
  </after>
  <element></element>
</div>

Defining that stuff in CSS would imply a clear and innate distinction, that no developer could fail to understand — content goes in HTML, presentation in CSS, just the way it should be.

It remains to be seen whether we’ll ever get anything like what I’m suggesting. But in the meantime, I can only urge you to remember the absolute distinction — don’t use Shadow DOM for anything except empty elements which convey no information. And if you want my best suggestion, don’t bother with it at all.