What Are the Rules for JavaScript's Automatic Semicolon Insertion (ASI)

What are the rules for JavaScript's automatic semicolon insertion (ASI)?

First of all you should know which statements are affected by the automatic semicolon insertion (also known as ASI for brevity):

  • empty statement
  • var statement
  • expression statement
  • do-while statement
  • continue statement
  • break statement
  • return statement
  • throw statement

The concrete rules of ASI are described in the specification, §11.9.1 Rules of Automatic Semicolon Insertion.

Three cases are described:

  1. When an offending token is encountered that is not allowed by the grammar, a semicolon is inserted before it if at least one of the following is true:
     • The token is separated from the previous token by at least one LineTerminator.
     • The token is }.

e.g.:

    { 1
    2 } 3

is transformed to

    { 1
    ;2 ;} 3;

The NumericLiteral 1 meets the first condition: the following token, 2, is separated from it by a line terminator.

The 2 meets the second condition: the following token is }.

The final semicolon after 3 is inserted at the end of the input.


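  2. When the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete Program, then a semicolon is automatically inserted at the end of the input stream.

e.g.:

    a = b
    ++c

is transformed to:

    a = b;
    ++c;

Here the semicolon after ++c is the one inserted at the end of the input. The one after b is inserted because postfix ++ is a restricted production and cannot be separated from its operand by a line terminator.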

  3. When a token is allowed by some production of the grammar, but the production is a restricted production and the restricted token is separated from the previous token by at least one LineTerminator, a semicolon is automatically inserted before the restricted token.

Restricted productions:

    UpdateExpression :
        LeftHandSideExpression [no LineTerminator here] ++
        LeftHandSideExpression [no LineTerminator here] --

    ContinueStatement :
        continue ;
        continue [no LineTerminator here] LabelIdentifier ;

    BreakStatement :
        break ;
        break [no LineTerminator here] LabelIdentifier ;

    ReturnStatement :
        return ;
        return [no LineTerminator here] Expression ;

    ThrowStatement :
        throw [no LineTerminator here] Expression ;

    ArrowFunction :
        ArrowParameters [no LineTerminator here] => ConciseBody

    YieldExpression :
        yield [no LineTerminator here] * AssignmentExpression
        yield [no LineTerminator here] AssignmentExpression

The classic example, with the ReturnStatement:

    return
    "something";

is transformed to

    return;
    "something";
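A minimal sketch of how these restricted productions behave at runtime. The function and variable names (withLineBreak, onOneLine, brokenResult, and so on) are made up for illustration only:

```javascript
// A line break directly after `return` ends the statement,
// so the string on the next line is never returned.
function withLineBreak() {
  return
  "something";
}

// Without the line break, the expression belongs to the return statement.
function onOneLine() {
  return "something";
}

// Postfix ++ may not be separated from its operand by a line break,
// so the ++ below is read as a prefix operator on c, not a postfix on b.
let a = 0;
let b = 1;
let c = 10;
a = b
++c;

const brokenResult = withLineBreak(); // undefined
const intactResult = onOneLine();     // "something"
console.log(brokenResult, intactResult, a, b, c);
```

After this runs, b is unchanged and c has been incremented, which shows the two lines were treated as separate statements.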

Javascript automatic semicolon insertion for do-while statements

I'm pretty sure that the do-while "case" added in ES2015 is only there to standardize rules which browsers had already implemented in order to be compatible with terribly written (or weirdly minified) scripts. It wasn't exactly a new feature so much as a tweak of the specification to bring it in line with what browsers were already doing.

For example, your snippet runs in IE11, which was released in 2013:

    do {} while (false) var a = 42;console.log('no parse errors');
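The same thing can be tried in any current engine. This is only a sketch, and the variable name answer is an arbitrary choice:

```javascript
// No semicolon and no line break after the do-while statement.
// The parser inserts the terminating semicolon after `)` on its own.
do {} while (false) var answer = 42;

console.log(answer);
```

If the engine did not apply the do-while rule, the first statement would be rejected as a syntax error before anything ran.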

What are the criteria for automatic semicolon insertion?

The ECMA specification (ch. 7.9.1, page 26) states:

There are three basic rules of semicolon insertion:

  1. When, as the program is parsed from left to right, a token (called the offending token) is encountered that is not allowed by any production of the grammar, then a semicolon is automatically inserted before the offending token if one or more of the following conditions is true:
     • The offending token is separated from the previous token by at least one LineTerminator.
     • The offending token is }.
  2. When, as the program is parsed from left to right, the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete ECMAScript Program, then a semicolon is automatically inserted at the end of the input stream.
  3. When, as the program is parsed from left to right, a token is encountered that is allowed by some production of the grammar, but the production is a restricted production and the token would be the first token for a terminal or nonterminal immediately following the annotation “[no LineTerminator here]” within the restricted production (and therefore such a token is called a restricted token), and the restricted token is separated from the previous token by at least one LineTerminator, then a semicolon is automatically inserted before the restricted token.

I think this behaviour has to do with the second point, where:

    var x = 1 + 2
    -3 + 3 == 0 ? alert('0') : alert('3')

can be parsed as a single complete ECMAScript Program, so no semicolon is inserted between the two lines.

Because it's not always clear how the parser will insert semi-colons, it's advisable to not leave it to the parser (i.e. always insert the semi-colons yourself).
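To see how the two-line snippet above is actually read, here is a small sketch that swaps alert for a stub so it runs outside a browser. The names alertStub and shown are assumptions for this example, and the stub returns its argument, which the real alert does not:

```javascript
// Records messages instead of opening a dialog (stand-in for alert).
const shown = [];
const alertStub = (message) => {
  shown.push(message);
  return message;
};

// No semicolon is inserted after `2`: the second line continues the expression,
// so this reads as  var x = (1 + 2 - 3 + 3 == 0) ? alertStub('0') : alertStub('3')
var x = 1 + 2
-3 + 3 == 0 ? alertStub('0') : alertStub('3');

console.log(x, shown);
```

Because 1 + 2 - 3 + 3 is 3, the comparison with 0 is false and the '3' branch runs. With the stub, x ends up holding '3' rather than the 3 one might expect from the first line alone.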

In ch. 7.9.2 (Examples of Automatic Semicolon Insertion) of the same specs this example looks like your situation:

The source

    a = b + c
    (d + e).print()

is not transformed by automatic semicolon insertion, because the parenthesised expression that begins the second line can be interpreted as an argument list for a function call:

    a = b + c(d + e).print()
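A runnable sketch of that spec example. The definitions of b, c, d and e are invented here so that the call can be observed; the spec itself does not define them:

```javascript
// c is made a function that records its argument so the call is visible.
const callArguments = [];
const c = (value) => {
  callArguments.push(value);
  return { print() { return 10; } };
};
const b = 1;
const d = 2;
const e = 3;
let a;

// No semicolon is inserted after `c`: the parenthesised expression on the
// next line is taken as the argument list of a call to c.
a = b + c
(d + e).print();

console.log(a, callArguments);
```

Here c is called with 5 and a becomes 1 + 10, so the two lines were joined into a single statement.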

Automatic Semicolon Insertion (ASI) for arrow function expression vs normal function expression

The specification says:

When, as the source text is parsed from left to right, a token (called the offending token) is encountered that is not allowed by any production of the grammar, then a semicolon is automatically inserted before the offending token if one or more of the following conditions is true: [...]

So for ASI to happen, the syntax must be an invalid production without it. And actually it is:

  () => {}[0] // syntax error

So the semicolon is needed here. Now why is that a syntax error, while function () { }[0] is not? Well, member access is defined as

    MemberExpression:
        MemberExpression [ Expression ]
        PrimaryExpression
        //...

    PrimaryExpression:
        FunctionExpression
        // no arrow function here

Why the syntax is the way it is, I don't know.
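The difference can be checked without crashing the script by handing each snippet to the Function constructor, which only has to parse it. This is a sketch; the helper name parses is made up for this example:

```javascript
// Returns true if the source text parses as a function body, false on a SyntaxError.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// A function expression is a PrimaryExpression, so [0] is a member access.
const functionSameLine = parses("var f = function () {}[0]");

// An arrow function cannot be followed by [0] on the same line.
const arrowSameLine = parses("var g = () => {}[0]");

// With a line break, [ is an offending token after a LineTerminator, so ASI applies.
const arrowNextLine = parses("var g = () => {}\n[0]");

console.log(functionSameLine, arrowSameLine, arrowNextLine);
```

Only the middle snippet should fail to parse, since it has no line break before the offending token.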

Automatic semicolon insertion & return statements

The JavaScript interpreter/compiler is smart enough to insert automatic semicolons only if valid JavaScript follows.

Your code works because && b on its own is not a valid expression. That's why no semicolon gets inserted after the return a, resulting in:

    return a && b && c;

However:

    return (undefined); // implicitly inserted
    {
    ....
    }

is perfectly valid, and that's why a semicolon gets inserted.

For completeness' sake, here is the reference to the spec: automatic semicolon insertion. The examples are worth reading through.
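Both situations side by side, as a rough sketch. The function names and the contents of the braces are placeholders chosen for this example:

```javascript
// The operators at the start of the following lines continue the expression,
// so the whole thing is returned as a && b && c.
function operatorsOnNextLines(a, b, c) {
  return a
    && b
    && c;
}

// The line break after `return` ends the statement. What follows is parsed
// as a block containing a labelled statement and is never reached.
function braceOnNextLine() {
  return
  {
    value: 1
  }
}

const joined = operatorsOnNextLines(1, 2, 3); // 3
const dropped = braceOnNextLine();            // undefined
console.log(joined, dropped);
```

Putting the opening brace on the same line as return is what makes it an object literal instead of a block.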

Why was automatic semi-colon insertion (ASI) added to javascript?

Great question!

Brendan Eich designed the JavaScript programming language originally, and I think it is fair to say that he would agree that automatic semicolon insertion is a design flaw in the language.

We shouldn't blame him. He designed the language in a period of just 10 days in 1995, having no idea that 20 years later it would become (probably) the most important computer language on the planet.

In the following post he says "I wish I had made newlines more significant in JS back in those ten days in May, 1995."

https://brendaneich.com/2012/04/the-infernal-semicolon/

Read on... :)

Are there semicolon insertion dangers with continuing operators on next line?

If you take some syntactically valid line and punctuate it with line breaks, automatic semicolon insertion will not apply (except in the narrow case of return, throw and very few other statements, listed below). ASI only occurs when there is absolutely no other way to interpret the code. Certainly, there is a way to interpret your multiline code as a single statement, because it is valid as a single line. In short, ASI is generally a tool of last resort in the parser's attempt to understand the program.

To cite ES5, the first case of ASI detailed in the spec occurs...

  1. When, as the program is parsed from left to right, a token (called the offending token) is encountered that is not allowed by any production of the grammar...

But that case is naturally eliminated, because you had a grammatically valid line before you injected a newline into it. Thus, this case of ASI cannot apply to your case because it depends upon a span of code that is not syntactically valid without semicolons. You don't have that here.

(The other two cases don't apply either; the second case applies to the end of a program and the third case applies to continue, break, return, throw, and postfix ++/-- operators.)

The common problem people have with ASI occurs when an author has two lines which they expect to stand separately, but those two lines happen to cause no grammatical problem when understood as a single line. That case starts with two lines that accidentally become one. Your case is the inverse: you start with one line, and it does not accidentally become two.


