Feb 25, 2013

MathML in WebKit

I’ve had the opportunity recently to dig into the WebKit MathML code. In the process, I learned a lot about how the requirements of MathML layout differ from those of HTML layout.

CSS, How does it layout?

At a high level, CSS layout is a fairly simple (mostly) top-down, recursive algorithm. To lay out a given node, you do the following:

  1. Compute your width. Your width usually doesn’t depend on the sizes of your children, so you can compute this without laying out your children.

  2. Lay out your children using the width computed in step 1.

  3. Compute your height. Unlike width, height often depends on the size of your children, so you need to lay them out first. Consider the default of height:auto on a display:block node. Its height is the sum of the heights of its children.
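The three steps above can be sketched in a few lines. This is only an illustration under simplifying assumptions: `Box` and `layout` are hypothetical names rather than WebKit classes, every node is display:block, and margins, padding and borders are ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Box:
    """A hypothetical display:block node; not a real WebKit class."""
    children: List["Box"] = field(default_factory=list)
    fixed_height: Optional[float] = None  # None models height:auto
    width: float = 0.0
    height: float = 0.0


def layout(box: Box, available_width: float) -> None:
    # Step 1: the width comes from the containing block, not from the children.
    box.width = available_width
    # Step 2: lay out the children using that width.
    for child in box.children:
        layout(child, box.width)
    # Step 3: height:auto is the sum of the children's heights.
    if box.fixed_height is not None:
        box.height = box.fixed_height
    else:
        box.height = sum(child.height for child in box.children)


root = Box(children=[Box(fixed_height=20), Box(children=[Box(fixed_height=30)])])
layout(root, 800)
print(root.width, root.height)  # 800 50
```

Note that width flows down the tree before any child is visited, while height flows back up afterwards.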

Each step depends on the output of the previous step. But, sometimes a node’s width does depend on its children’s width, for example, if the node is display:inline-block, absolutely positioned or floated. Here’s an example:

<div style="display: inline-block; border: 5px solid salmon">
    <div style="background-color: lightsalmon">How does this size?</div>
</div>

There’s a cycle here. The parent’s width depends on its children’s width, whose width in turn depends on the parent’s width. CSS plays a clever trick to break this cycle by inventing a concept of intrinsic widths. Just to mess with you, CSS2.1 calls these preferred widths, and CSS3 calls them min-content measure and max-content measure. The basic concept is the following:

min-content measure: The width of the longest line taking all possible line breaks.
max-content measure: The width of the longest line taking only hard line breaks.
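As a rough sketch of these two definitions for a single text run, assume every character is one unit wide, soft breaks are possible only at whitespace, and a hard break is a newline character. The function names are made up for this example; real engines measure glyphs and follow much richer line-breaking rules.

```python
def max_content(text: str) -> int:
    # Only hard breaks ("\n") are taken, so each source line stays whole.
    return max(len(line) for line in text.split("\n"))


def min_content(text: str) -> int:
    # Every possible break is taken, so each word ends up on its own line.
    return max((len(word) for word in text.split()), default=0)


text = "How does this size?"
print(min_content(text), max_content(text))  # 5 19
```

Neither function needs to know the available width, which is what lets these measures be computed before layout.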

This part of CSS layout is bottom-up (i.e. you compute your children’s min-content/max-content measure in order to determine your own), but it does not require a full layout of your children. You only need to lay out text runs, which have an intrinsic size (i.e. the size of the actual characters). We’ll need to lay out all the text runs later in order to do a full layout anyway, so it’s not wasted work.

With our min-content/max-content measures in hand, we can do some simple math to figure out the shrink-wrapped width of the parent (step 1) without actually laying out the children (step 2). Sometimes this leads to surprising results, but, for the most part, it pretty much does what you expect.
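For reference, the "simple math" in CSS 2.1 is the shrink-to-fit formula: min(max(preferred minimum width, available width), preferred width). A minimal sketch follows, with the function and parameter names chosen here for illustration:

```python
def shrink_to_fit(min_content: float, max_content: float, available: float) -> float:
    # Never narrower than min-content, never wider than max-content,
    # and otherwise exactly as wide as the space on offer.
    return min(max(min_content, available), max_content)


print(shrink_to_fit(5, 19, 100))  # 19: plenty of room, use max-content
print(shrink_to_fit(5, 19, 12))   # 12: constrained, fill the available width
print(shrink_to_fit(5, 19, 3))    # 5: too narrow, overflow at min-content
```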

For a more general introduction of how layout works in WebKit, I recommend starting with Bem’s great post.

So how’s MathML different?

Unlike HTML layout, MathML layout is a bottom-up algorithm. Consider the following equation:

x = (b + sqrt(b^2 – 4ac)) + 2

To get good MathML layout of the parentheses around the equation, you have to know the height of the contents of the parentheses. Once you know the height of the contents, you can grow the parentheses both vertically and horizontally to look how you’d expect. Look at the examples on the Mozilla MathML torture test.

This is fundamentally at odds with HTML layout. The steps for MathML layout are something like:

  1. Lay out your children.

  2. Determine their height.

  3. Stretch operators vertically and horizontally.

  4. Compute your width.

  5. Lay out your children again now that you know your final width, in case they lay out differently at the new width.

  6. Compute your height.
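Here is a toy sketch of that ordering. `Node`, `Row` and `layout_row` are hypothetical names, not WebKit's, and the rule that a stretched operator's width scales in proportion to its height is an assumption made purely for illustration. Step 5 (the second layout pass) is left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    width: float = 0.0
    height: float = 0.0
    stretchy: bool = False  # e.g. a parenthesis


@dataclass
class Row(Node):
    children: List[Node] = field(default_factory=list)


def layout_row(row: Row) -> None:
    # Steps 1-2: lay out nested rows first, then find the content height.
    for child in row.children:
        if isinstance(child, Row):
            layout_row(child)
    target = max((c.height for c in row.children if not c.stretchy), default=0.0)
    # Step 3: stretch operators; width scales with height (an assumption).
    for c in row.children:
        if c.stretchy and 0 < c.height < target:
            c.width *= target / c.height
            c.height = target
    # Steps 4 and 6: only now are this row's own width and height known.
    row.width = sum(c.width for c in row.children)
    row.height = max((c.height for c in row.children), default=0.0)


left = Node(width=4.0, height=10.0, stretchy=True)
right = Node(width=4.0, height=10.0, stretchy=True)
content = Node(width=30.0, height=25.0)
row = Row(children=[left, content, right])
layout_row(row)
print(row.width, row.height)  # 50.0 25.0
```

The row's width (50.0 rather than 38.0) could not be known until the height of the content was known, which is the reverse of the CSS ordering.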

Computing your min-content/max-content measure now requires that you first do a full layout of your children. Intrinsic sizes in CSS were specifically designed so that you could compute them without doing layout in order to break cycles where your size depends on your children’s size.

This isn’t fundamentally an unsolvable problem, but the current MathML layout code in WebKit is built on top of the CSS layout code. So you have a bottom-up algorithm intertwined, both above and below it, with a top-down algorithm. This complexity leads to bugs and crashes. More importantly, it makes it harder to reason about what the code will do. Code that is hard to reason about is hard to maintain and hard to get right.

How should it work?

I believe the way to make MathML work well in WebKit and not have a significant complexity impact on the rest of the codebase would be to have it be an opaque root as far as the rest of the render tree is concerned, kind of (but not exactly) like SVG. MathML still does its bottom-up layout, but it wouldn’t be built on top of the CSS-based layout code (except it would still share the code used for laying out text runs, which is probably the most complex part of layout anyways).

To simplify this further, I think we could limit the CSS that’s allowed on MathML content. There’s precedent for this both in the web platform and in the WebKit code: some pseudo-elements limit the CSS you can apply. For instance, the use cases for absolute positioning or for styling a first-letter pseudo-element inside a MathML block do not seem worth the complexity. Restricting them would greatly simplify the MathML code without hindering the sorts of things real-world content needs.

The WebKit render tree code does a lot that MathML does not do and will never need to do. As such, I'm not even convinced that it would be more code to do it this way. There may be some code duplication, or some refactoring needed in order to avoid that duplication (e.g. for hit testing). Even if it were two or three times as much code in the end, it would work better and it would keep the codebase maintainable. This is all fallout from the way that the width of operators depends on the height of their contents, breaking a fundamental assumption of CSS-based layout.

Why not just embed TeX?

If we’re treating MathML like an opaque root, it's essentially like embedding TeX in a web page. One advantage of MathML is that it has a DOM that is exposed to JavaScript and CSS. If we were to embed TeX, we’d almost certainly eventually need to design and expose an object model, e.g. for editing equations. At that point, you’re better off starting with something designed to fit well in the web content model.

Unfortunately, the implications of this approach are that much of the current WebKit code cannot be used and that there's a lot of work to do in order to get a high-quality MathML implementation in WebKit. I'd like to see MathML in Chromium someday, but there's a good deal of rendering work that I think is higher priority. In the meantime, there are great libraries like MathJax that will render MathML markup for you. If someone were interested in trying this opaque root approach, I'd recommend starting out on a GitHub fork of WebKit. Once you have the basic scaffolding in place, you can find me on the #chromium IRC channel and I can help make sure you’re heading in the right direction.

Head over to the G+ post for this to leave comments.