In Which Direction Do Selector Engines Read, Exactly

However I've read recently that most CSS selector engines read from right to left, in which case wouldn't the first example actually be slower?

Which way do CSS selector engines read in general? Left to right or right to left? And if they generally read right to left, could someone please offer me an explanation as to why (I can't see how it makes sense to read right to left in terms of a selector engine)?

Frankly, it's nigh impossible to tell which selector will be slower in a given browser, much less across browsers. Performance tends to fluctuate and be unpredictable, especially at such microscopic scales and with unpredictable document structures. Even if we talk about theoretical performance, it ultimately depends on the implementation.

Having said that, as shown in Boris Zbarsky's answer to this other question and in Guffa's answer to yours, a typical browser (this is currently true of all major layout engines) takes an element and evaluates all the candidate selectors to see which ones it matches, rather than finding a set of elements that match a given selector. This is a subtle but very important difference. Boris offers a technical explanation that's not only incredibly detailed, but also authoritative (as he works on Gecko, the engine used by Firefox), so I highly suggest reading it.

But I thought I should address what seems to be another concern in your question:

As the selector engine would simply find every element with a class of name, and then have to identify which of those were divs?

As well as Patrick McElhaney's comment:

The linked question explains why selectors are read right-to-left in general, so #foo ul.round.fancy li.current is read li.current, ul.round.fancy, #foo, but is it really read right-to-left within each element (.current, li, .fancy, .round, ul, #foo)? Should it be?

I have never implemented CSS, nor have I seen how other browsers implement it. We do know from the answers linked above that browsers use right-to-left matching to walk across combinators within selectors, such as the > combinators in this example:

section > div.second > div.third

If an element isn't a div.third, then there is no point checking if its parent is a div.second whose parent is a section.
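To make that early exit concrete, here is a minimal Python sketch of right-to-left matching across > combinators. It is not taken from any browser: the Element class and the matches_child_chain helper are hypothetical names made up for this illustration, and each compound selector is reduced to a (tag, class) pair.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Element:
    tag: str
    classes: frozenset = frozenset()
    parent: Optional["Element"] = None


# A compound selector reduced to (tag, required class or None).
Compound = Tuple[str, Optional[str]]


def matches_compound(el: Element, compound: Compound) -> bool:
    tag, cls = compound
    return el.tag == tag and (cls is None or cls in el.classes)


def matches_child_chain(el: Element, chain: List[Compound]) -> bool:
    """Match an 'a > b > c' style selector, rightmost compound first."""
    current: Optional[Element] = el
    for compound in reversed(chain):
        if current is None or not matches_compound(current, compound):
            return False  # give up at the first failed step
        current = current.parent  # '>' only ever looks at the parent
    return True


# section > div.second > div.third
chain = [("section", None), ("div", "second"), ("div", "third")]

section = Element("section")
second = Element("div", frozenset({"second"}), parent=section)
third = Element("div", frozenset({"third"}), parent=second)
paragraph = Element("p", parent=second)

print(matches_child_chain(third, chain))      # True
print(matches_child_chain(paragraph, chain))  # False, rejected at the key
```

The p element is thrown out after a single comparison; its parent and grandparent are never looked at.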

However, I don't believe that this right-to-left order drills all the way down to the simple selector level. In other words, I don't believe that browsers use right-to-left evaluation for each part of a simple selector sequence (also known as a compound selector) within the right-to-left evaluation across a series of compound selectors separated by combinators.

For example, consider this contrived and highly exaggerated selector:

div.name[data-foo="bar"]:nth-child(5):hover::after

Now, there's no guarantee a browser will necessarily check these conditions for an element in the following order:

  1. Is the pointer over this element?
  2. Is this element the 5th child of its parent?
  3. Does this element have a data-foo attribute with the value bar?
  4. Does this element have a name class?
  5. Is this a div element?

Nor would this selector, which is functionally identical to the above except with its simple selectors jumbled around, necessarily be evaluated in the following order:

div:hover[data-foo="bar"].name:nth-child(5)::after

  1. Is this element the 5th child of its parent?
  2. Does this element have a name class?
  3. Does this element have a data-foo attribute with the value bar?
  4. Is the pointer over this element?
  5. Is this a div element?

There is simply no reason that such an order would be enforced for performance reasons. In fact, I'd think that performance would be enhanced by picking at certain kinds of simple selectors first, no matter where they are in a sequence. (You'll also notice that the ::after is not accounted for — that's because pseudo-elements are not simple selectors and never even enter into the matching equation.)
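Purely as a thought experiment, this is what "picking at certain kinds of simple selectors first" could look like. The cost ranking below is an assumption invented for this sketch (no engine is known to use these exact numbers), and evaluation_order is a hypothetical helper. The point is only that both spellings of the selector end up with the same evaluation order once source position is ignored.

```python
# Assumed cost ranking, cheapest first. Invented for illustration only.
COST = {"type": 0, "id": 1, "class": 2, "attribute": 3, "pseudo-class": 4}


def evaluation_order(simple_selectors):
    """Sort (kind, text) pairs by assumed cost, ignoring their source order."""
    return sorted(simple_selectors, key=lambda s: (COST[s[0]], s[1]))


# div.name[data-foo="bar"]:nth-child(5):hover
first = [
    ("type", "div"),
    ("class", "name"),
    ("attribute", 'data-foo="bar"'),
    ("pseudo-class", "nth-child(5)"),
    ("pseudo-class", "hover"),
]

# div:hover[data-foo="bar"].name:nth-child(5)
second = [
    ("type", "div"),
    ("pseudo-class", "hover"),
    ("attribute", 'data-foo="bar"'),
    ("class", "name"),
    ("pseudo-class", "nth-child(5)"),
]

print(evaluation_order(first) == evaluation_order(second))  # True
```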

For example, it's very well-known that ID selectors are the fastest. Well, Boris says this in the last paragraph of his answer to the linked question:

Note also that there are other optimizations browsers already do to avoid even trying to match rules that definitely won't match. For example, if the rightmost selector has an id and that id doesn't match the element's id, then there will be no attempt to match that selector against that element at all in Gecko: the set of "selectors with IDs" that are attempted comes from a hashtable lookup on the element's ID. So this is 70% of the rules which have a pretty good chance of matching that still don't match after considering just the tag/class/id of the rightmost selector.

In other words, whether you have a selector that looks like this:

div#foo.bar:first-child

Or this:

div.bar:first-child#foo

Gecko will always check the ID and the class first, regardless of where they are positioned in the sequence. If the element doesn't have an ID and a class that match the selector, then it's instantly discarded. Pretty darn quick if you ask me.
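The hashtable lookup Boris describes can be approximated with a small Python sketch. This is my own simplification, not Gecko's actual data structure: build_rule_index and candidate_rules are hypothetical names, and each rule is represented only by its selector text plus the ID (if any) found in its rightmost compound selector.

```python
from collections import defaultdict


def build_rule_index(rules):
    """Bucket rules by the ID in their rightmost compound selector.

    rules: list of (selector_text, key_id or None).
    """
    by_id = defaultdict(list)
    without_id = []
    for selector, key_id in rules:
        if key_id is None:
            without_id.append(selector)
        else:
            by_id[key_id].append(selector)
    return by_id, without_id


def candidate_rules(element_id, by_id, without_id):
    """Rules worth trying for an element; other ID buckets are never touched."""
    # .get() avoids creating empty buckets in the defaultdict
    return by_id.get(element_id, []) + without_id


rules = [
    ("div#foo.bar:first-child", "foo"),
    ("div.bar:first-child#foo", "foo"),  # same bucket despite the ordering
    ("ul#menu", "menu"),
    ("#nav li", None),                   # the key compound 'li' has no ID
]
by_id, without_id = build_rule_index(rules)

print(candidate_rules("foo", by_id, without_id))
print(candidate_rules("sidebar", by_id, without_id))
```

For an element with id="sidebar", neither of the #foo rules nor the #menu rule is ever attempted; only the rule whose key has no ID remains a candidate.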

That was just Gecko as an example. This may differ between implementations as well (e.g. Gecko and WebKit may do it differently from Trident or even Presto). There are strategies and approaches that are generally agreed upon by vendors, of course (there isn't likely to be a difference in checking IDs first), but the little details may differ.

Why do browsers match CSS selectors from right to left?

Keep in mind that when a browser is doing selector matching it has one element (the one it's trying to determine style for) and all your rules and their selectors and it needs to find which rules match the element. This is different from the usual jQuery thing, say, where you only have one selector and you need to find all the elements that match that selector.

If you only had one selector and only one element to compare against that selector, then left-to-right makes more sense in some cases. But that's decidedly not the browser's situation. The browser is trying to render Gmail or whatever and has the one <span> it's trying to style and the 10,000+ rules Gmail puts in its stylesheet (I'm not making that number up).

In particular, in the situation the browser is looking at, most of the selectors it's considering don't match the element in question. So the problem becomes one of deciding that a selector doesn't match as fast as possible; if that requires a bit of extra work in the cases that do match, you still win due to all the work you save in the cases that don't match.

If you start by just matching the rightmost part of the selector against your element, then chances are it won't match and you're done. If it does match, you have to do more work, but only proportional to your tree depth, which is not that big in most cases.

On the other hand, if you start by matching the leftmost part of the selector... what do you match it against? You have to start walking the DOM, looking for nodes that might match it. Just discovering that there's nothing matching that leftmost part might take a while.

So browsers match from the right; it gives an obvious starting point and lets you get rid of most of the candidate selectors very quickly. You can see some data at http://groups.google.com/group/mozilla.dev.tech.layout/browse_thread/thread/b185e455a0b3562a/7db34de545c17665 (though the notation is confusing), but the upshot is that for Gmail in particular two years ago, for 70% of the (rule, element) pairs you could decide that the rule does not match after just examining the tag/class/id parts of the rightmost selector for the rule. The corresponding number for Mozilla's pageload performance test suite was 72%. So it's really worth trying to get rid of those 2/3 of all rules as fast as you can and then only worry about matching the remaining 1/3.
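To show what "deciding after just the tag/class/id parts of the rightmost selector" means for (rule, element) pairs, here is a toy Python count. The elements and key selectors are made up for this sketch, so the resulting fraction says nothing about Gmail or any real page; rejected_by_key is a hypothetical helper.

```python
def rejected_by_key(element, key):
    """True if the element fails the tag/class/id part of a key selector."""
    tag, cls, id_ = key
    if tag is not None and element["tag"] != tag:
        return True
    if cls is not None and cls not in element["classes"]:
        return True
    if id_ is not None and element["id"] != id_:
        return True
    return False


# Made-up document and made-up key selectors, as (tag, class, id) triples.
elements = [
    {"tag": "span", "classes": {"label"}, "id": None},
    {"tag": "div", "classes": {"box"}, "id": "main"},
    {"tag": "a", "classes": set(), "id": None},
]
keys = [
    ("span", None, None),   # ... span
    (None, "box", None),    # ... .box
    (None, None, "main"),   # ... #main
    ("li", "item", None),   # ... li.item
]

pairs = [(el, key) for el in elements for key in keys]
rejected = sum(rejected_by_key(el, key) for el, key in pairs)
print(f"{rejected} of {len(pairs)} pairs rejected by the key alone")
# 9 of 12 pairs rejected by the key alone
```

Only the three surviving pairs would need any walking to the left of the key.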

Note also that there are other optimizations browsers already do to avoid even trying to match rules that definitely won't match. For example, if the rightmost selector has an id and that id doesn't match the element's id, then there will be no attempt to match that selector against that element at all in Gecko: the set of "selectors with IDs" that are attempted comes from a hashtable lookup on the element's ID. So this is 70% of the rules which have a pretty good chance of matching that still don't match after considering just the tag/class/id of the rightmost selector.

CSS combinator precedence?

No, there is no notion of precedence in combinators. However, there is a notion of order of elements in a complex selector.

Any complex selector can be read in any direction that makes sense to you, but this does not imply that combinators are distributive or commutative, as they indicate a relationship between two elements, e.g. ancestor descendant and previous + next. This is why the order of elements is what matters.

According to Google, however, browsers implement their selector engines such that they evaluate complex selectors from right to left:

The engine [Gecko] evaluates each rule from right to left, starting from the rightmost selector (called the "key") and moving through each selector until it finds a match or discards the rule.

Mozilla's article, Writing Efficient CSS for use in the Mozilla UI has a section that describes how their CSS engine evaluates selectors. This is XUL-specific, but the same layout engine is used both for Firefox's UI and pages that display in Firefox's viewport. (dead link)

As described by Google in the above quote, the key selector simply refers to the right-most simple selector sequence, so again it's from right to left:

The style system matches rules by starting with the key selector, then moving to the left (looking for any ancestors in the rule’s selector). As long as the selector’s subtree continues to check out, the style system continues moving to the left until it either matches the rule, or abandons because of a mismatch.

Bear in mind two things:

  • These are documented based on implementation details; at heart, a selector is a selector, and all it is intended to do is to match an element that satisfies a certain condition (laid out by the components of the selector). In which direction it is read is up to the implementation; as pointed out by another answer, the spec does not say anything about what order to evaluate a selector in or about combinator precedence.

  • Neither article implies that each simple selector is evaluated from left to right within its simple selector sequence (see this answer for why I believe this isn't the case). What the articles are saying is that a browser engine will evaluate the key selector sequence to figure out if its working DOM element matches it, then if it does, progress onto the next selector sequence by following the combinator and check for any elements that match that sequence, then rinse and repeat until either completion or failure.


With all that said, if you were to ask me to read selectors and describe what they select in plain English, I would read them from right to left too (not that I'm certain whether this is relevant to implementation details though!).

So, the selector:

a > b ~ c d

would mean:

Select any d element

that is a descendant of a c element

that is a sibling of, and comes after, a b element

that is a child (direct descendant) of an a element.
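The same right-to-left reading can be written down as a small recursive matcher. This is only a sketch under simplifying assumptions: compound selectors are bare tag names, and Element, ancestors, preceding_siblings and matches are hypothetical names of my own, not anything from a real engine.

```python
class Element:
    """A bare-bones tree node; only the tag name is modelled."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def ancestors(el):
    node, out = el.parent, []
    while node is not None:
        out.append(node)
        node = node.parent
    return out  # parent first, root last


def preceding_siblings(el):
    if el.parent is None:
        return []
    siblings = el.parent.children
    index = next(i for i, s in enumerate(siblings) if s is el)
    return siblings[:index][::-1]  # nearest sibling first


def matches(el, compounds, combinators):
    """compounds[k] and compounds[k + 1] are joined by combinators[k]."""
    if el.tag != compounds[-1]:
        return False
    if len(compounds) == 1:
        return True
    combinator = combinators[-1]
    if combinator == ">":
        candidates = ancestors(el)[:1]
    elif combinator == " ":
        candidates = ancestors(el)
    elif combinator == "~":
        candidates = preceding_siblings(el)
    elif combinator == "+":
        candidates = preceding_siblings(el)[:1]
    else:
        raise ValueError(f"unknown combinator: {combinator!r}")
    # Try every candidate, since the first ancestor or sibling that matches
    # the next compound is not necessarily the one that completes the chain.
    return any(matches(c, compounds[:-1], combinators[:-1]) for c in candidates)


# a > b ~ c d
compounds, combinators = ["a", "b", "c", "d"], [">", "~", " "]

a = Element("a")
b = Element("b", a)
c = Element("c", a)      # comes after its sibling b
x = Element("x", c)
d = Element("d", x)      # descendant of c, not a direct child
stray = Element("d", a)  # a d element with no c ancestor

print(matches(d, compounds, combinators))      # True
print(matches(stray, compounds, combinators))  # False
```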

Tag name in CSS selector (e.g. div#id): how is it read? (Left to right or right to left?)

Ok, I think you've gotten a little confused.

In your example, you use:

div table a

So I'll use that.

That could look like this in your HTML:

<div>
<table>
<a>
<!-- styling applied here -->
</a>
</table>
</div>

or something else like

<div>
<div></div>
<table>
<tr>
<th>hi there</th>
<th>
<a>i'm an a tag!</a>
</th>
</tr>
</table>
</div>

So, looking at that:

div table a

will be

div table a
 ^    ^   ^
 |    |   |
 |    |   a child
 |    |
 |    parent
 |
 grandparent

This means that you'll be styling any 'a' element that is a child/descendant of a table, which, in turn, is a descendant of a div element.

So, in your other example:

div#div_id

you would be styling the div element whose id is div_id. The tag name and the ID both apply to the same element here; no parent is involved.


BTW looking at your example, I would like to point out that (in case you didn't know):

  • the id attribute should be unique
  • an <a> element shouldn't be used directly within a <table> element (instead nest it within a th or td tag)
  • If you wish to style multiple elements (of varying types), it would be more efficient to create a class, and use that instead


Answer after Clarification:


Your

 div#div_id

In HTML, since the ID is meant to be unique, the engine will look up any element with the specified ID.

It will then check whether that element is a div.

This seems to be a bad example, as some (older) browsers will apparently only look for the first element with that ID and return it, instead of checking the whole page for any 'duplicate' IDs.

With your IDs being unique, you could then drop the tag name, as it would be redundant.



Summary


So, an example of this extended conversation in the comments:

If I wanted to style a single div (and still know it was a div that I was adding styling to), I would use the naming convention of:

<div id="my-div-to-style">
             ^
             |
[the word 'div' here could be anything]

In my CSS I would write:

   _ this word must match the
  /  ID I used above
 |
#my-div-to-style {
/* styling... */
}

If I wanted to add the same styling to multiple div elements (with the scope to add it to others), I would instead use a class:

<div class="myDivStyle">

and then use:

.myDivStyle {
/* styling... */
}

In this last example, I would not be restricted to just styling divs, so I wouldn't include this in my naming:

<div class="myStyle">
<a class="myStyle">
<table class="myStyle">

.myStyle {
/* styling for any element I want */
}

CSS Performance with Compound Selectors and Right-to-Left Parsing

Compound selectors are not necessarily evaluated in any specific order. For example, most if not all implementations optimize for ID, class and type selectors to match fast or fail fast (at least Gecko does according to Boris Zbarsky), then evaluate attribute selectors and pseudo-classes as necessary.

It's not feasible to predict how exactly any given browser, let alone all of them, will evaluate a compound selector, let alone each compound selector in a complex selector containing more than one, but what we do know is that right-to-left matching starts from the rightmost compound selector and steps leftward until matching fails.

It's important to note that this is merely an implementation detail that's agreed upon by vendors — you could implement selector matching however you like, but so long as you match the right elements with the right selectors, your implementation will be standards-compliant.

But what's most important is that, in the real world, none of this is likely to matter. Write selectors that are readable and meaningful, don't unnecessarily overqualify them, avoid specificity hacks where possible, and you should be good.

Is a dynamic pseudo-class evaluated before the rest of the selector?

Unfortunately, the order is:

* => :hover => div

The universal selector is evaluated first, which means it looks at every element in the DOM, then checks to see if it's in a :hover state. Finally, for any matching elements, it then checks for a parent div.

Choosing efficient selectors based on computational complexity

At runtime an HTML document is parsed into a DOM tree containing N elements with an average depth D. There is also a total of S CSS rules in the stylesheets applied.

  1. Elements' styles are applied individually, meaning there is a direct relationship between N and overall complexity. It's worth noting that this can be somewhat offset by browser logic such as reference caching and recycling styles from identical elements. For instance, the following list items will have the same CSS properties applied (assuming no pseudo-classes such as :nth-child are applied):

    <ul class="sample">
    <li>one</li>
    <li>two</li>
    <li>three</li>
    </ul>

  2. Selectors are matched right-to-left for individual rule eligibility, i.e. if the right-most key does not match a particular element, there is no need to process the selector further and it is discarded. This means that the right-most key should match as few elements as possible. Below, the p key in the first rule will match more elements, including paragraphs outside the target container (to which the rule will of course not apply, but which still cost extra eligibility checks for that selector):

    .custom-container p {}
    .container .custom-paragraph {}

  3. Relationship selectors: a descendant selector requires up to D elements to be iterated over. For instance, successfully matching .container .content may only require one step if the elements are in a parent-child relationship, but the DOM tree will need to be traversed all the way up to html before an element can be confirmed a mismatch and the rule safely discarded. This applies to chained descendant selectors as well, with some allowances.

    On the other hand, a > child combinator, a + adjacent sibling combinator or :first-child still requires an additional element to be evaluated, but has an implied depth of one and will never require further tree traversal.

  4. The behavior definition of pseudo-elements such as :before and :after implies they are not part of the RTL paradigm. The logic behind this assumption is that there is no pseudo-element per se until a rule instructs for it to be inserted before or after an element's content (which in turn requires extra DOM manipulation, but no additional computation is required to match the selector itself).

  5. I couldn't find any information on pseudo-classes such as :nth-child() or :disabled. Verifying an element state would require additional computation, but from the rule parsing perspective it would only make sense for them to be excluded from RTL processing.
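The difference between the descendant and child cases in point 3 can be counted with a short Python sketch. It is a simplified model of my own rather than anything measured: ancestor_checks is a hypothetical helper, levels are numbered from the root (level 0), and the key selector is assumed to have matched already.

```python
def ancestor_checks(element_depth, combinator, ancestor_matches_at=None):
    """Count ancestors examined for 'X <combinator> key' after the key matched.

    element_depth: number of ancestors the element has (root is level 0).
    ancestor_matches_at: level of the one ancestor matching X, or None.
    Returns (ancestors_examined, matched).
    """
    checks = 0
    for level in range(element_depth - 1, -1, -1):  # parent first, root last
        checks += 1
        if ancestor_matches_at == level:
            return checks, True
        if combinator == ">":
            break  # the child combinator only ever looks at the parent
    return checks, False


print(ancestor_checks(10, " "))     # (10, False): walked all the way to the root
print(ancestor_checks(10, ">"))     # (1, False): rejected after the parent
print(ancestor_checks(10, " ", 9))  # (1, True): the parent itself matched
```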

Given these relationships, computational complexity O(N*D*S) should be lowered primarily by minimizing the depth of CSS selectors and addressing point 2 above. This will result in quantifiably stronger improvements when compared to minimizing the number of CSS rules or HTML elements alone^
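As a back-of-the-envelope check on point 2, the sketch below counts worst-case checks for the two selectors from that point over a made-up document. The model is an assumption of mine (one key check per element, plus a full ancestor walk whenever the key matches), worst_case_checks is a hypothetical helper, and the element counts are invented.

```python
def key_matches(el, key):
    kind, name = key
    if kind == "tag":
        return el["tag"] == name
    return name in el["classes"]  # kind == "class"


def worst_case_checks(elements, key):
    """One key check per element, plus a full ancestor walk per key match."""
    total = 0
    for el in elements:
        total += 1                # the key selector is always tested
        if key_matches(el, key):
            total += el["depth"]  # worst case: walk every ancestor
    return total


# Invented document: 55 paragraphs (5 of them tagged) and 45 divs.
elements = (
    [{"tag": "p", "classes": set(), "depth": 6} for _ in range(50)]
    + [{"tag": "p", "classes": {"custom-paragraph"}, "depth": 6} for _ in range(5)]
    + [{"tag": "div", "classes": set(), "depth": 4} for _ in range(45)]
)

# .custom-container p {}  -> key is the type selector p
print(worst_case_checks(elements, ("tag", "p")))                   # 430
# .container .custom-paragraph {}  -> key is the class selector
print(worst_case_checks(elements, ("class", "custom-paragraph")))  # 130
```

Under these assumptions the narrower key does roughly a third of the work, because far fewer elements survive the first check and trigger the ancestor walk.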

Shallow, preferably one-level, specific selectors are processed faster. Google takes this to a whole new level (programmatically, not by hand!): there is rarely a three-key selector, and most of the rules in search results look like

#gb {}
#gbz, #gbg {}
#gbz {}
#gbg {}
#gbs {}
.gbto #gbs {}
#gbx3, #gbx4 {}
#gbx3 {}
#gbx4 {}
/*...*/

^ - while this is true from a rendering engine performance standpoint, there are always additional factors such as traffic overhead and DOM parsing etc.


Ordering simple selectors in a sequence

5.2 Selector syntax

A simple selector is either a type selector or universal selector
followed immediately by zero or more attribute selectors, ID
selectors, or pseudo-classes, in any order. The simple selector
matches if all of its components match.

Note: the terminology used here in CSS 2.1 is different from what is
used in CSS3. For example, a "simple selector" refers to a smaller
part of a selector in CSS3 than in CSS 2.1. See the CSS3 Selectors
module [CSS3SEL].

A selector is a chain of one or more simple selectors separated by
combinators. Combinators are: white space, ">", and "+". White space
may appear between a combinator and the simple selectors around it.

The elements of the document tree that match a selector are called
subjects of the selector. A selector consisting of a single simple
selector matches any element satisfying its requirements. Prepending a
simple selector and combinator to a chain imposes additional matching
constraints, so the subjects of a selector are always a subset of the
elements matching the last simple selector.

One pseudo-element may be appended to the last simple selector in a
chain, in which case the style information applies to a subpart of
each subject.

From: w3.org
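The "in any order" and "matches if all of its components match" parts of that passage can be checked mechanically. The Python sketch below is only an illustration: the element is a plain dict, and components and matches_in_order are hypothetical names. It evaluates the same four components in every possible order and collects the distinct results.

```python
from itertools import permutations

# A made-up element and the components of div#foo.bar:first-child.
element = {"tag": "div", "id": "foo", "classes": {"bar"}, "first_child": True}

components = {
    "div": lambda el: el["tag"] == "div",
    "#foo": lambda el: el["id"] == "foo",
    ".bar": lambda el: "bar" in el["classes"],
    ":first-child": lambda el: el["first_child"],
}


def matches_in_order(el, order):
    """Evaluate the components in the given order; all of them must match."""
    return all(components[name](el) for name in order)


# All 24 orderings of the four components give the same answer.
results = {matches_in_order(element, order) for order in permutations(components)}
print(results)  # {True}
```

The order only changes how soon a mismatch is noticed, never whether the element matches.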


