This page describes JavaScript's lexical grammar. The source text of ECMAScript scripts is scanned from left to right and converted into a sequence of input elements: tokens, control characters, line terminators, comments, or white space. ECMAScript also defines certain keywords and literals and has rules for automatic insertion of semicolons to end statements.
Control characters have no visual representation but are used to control the interpretation of the text.
Code point | Name | Abbreviation | Description |
---|---|---|---|
U+200C | Zero width non-joiner | <ZWNJ> | Placed between characters to prevent them from being connected into ligatures in certain languages (Wikipedia). |
U+200D | Zero width joiner | <ZWJ> | Placed between characters that would not normally be connected in order to cause the characters to be rendered using their connected form in certain languages (Wikipedia). |
U+FEFF | Byte order mark | <BOM> | Used at the start of the script to mark it as Unicode and to indicate the text's byte order (Wikipedia). |
White space characters improve the readability of source text and separate tokens from each other. These characters are usually unnecessary for the functionality of the code. Minification tools are often used to remove whitespace in order to reduce the amount of data that needs to be transferred.
Code point | Name | Abbreviation | Description | Escape sequence |
---|---|---|---|---|
U+0009 | Character tabulation | <HT> | Horizontal tabulation | \t |
U+000B | Line tabulation | <VT> | Vertical tabulation | \v |
U+000C | Form feed | <FF> | Page breaking control character (Wikipedia). | \f |
U+0020 | Space | <SP> | Normal space | |
U+00A0 | No-break space | <NBSP> | Normal space, but no point at which a line may break | |
Others | Other Unicode space characters | <USP> | Spaces in Unicode on Wikipedia | |
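As a rough illustration of the table above, the following sketch separates the tokens of a small expression with several of the listed white space characters and evaluates the result. The variable names are hypothetical, and the `Function` constructor is used only so that the unusual characters can be written as visible escapes.

```javascript
// Separate the tokens of `1 + 2 * 3` with different white space characters:
// no-break space, tab, line tabulation, form feed, and a regular space.
const spacedSource = "return 1\u00A0+\t2\u000B*\u000C3 ;";

// The Function constructor parses the string as a function body.
const spacedResult = new Function(spacedSource)();

console.log(spacedResult); // 7
```

The white space only separates tokens here; the expression evaluates exactly as `1+2*3` would.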
In addition to white space characters, line terminator characters are used to improve the readability of the source text. However, in some cases, line terminators can influence the execution of JavaScript code as there are a few places where they are forbidden. Line terminators also affect the process of automatic semicolon insertion. Line terminators are matched by the \s class in regular expressions.
Only the following Unicode code points are treated as line terminators in ECMAScript; other line breaking characters are treated as white space (for example, Next Line, NEL, U+0085 is considered as white space).
Code point | Name | Abbreviation | Description | Escape sequence |
---|---|---|---|---|
U+000A | Line Feed | <LF> | New line character in UNIX systems. | \n |
U+000D | Carriage Return | <CR> | New line character in Commodore and early Mac systems. | \r |
U+2028 | Line Separator | <LS> | Wikipedia | |
U+2029 | Paragraph Separator | <PS> | Wikipedia | |
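A minimal sketch of how the four line terminators behave in regular expressions. The `\s` behavior is stated above; the check against `.` is added here for contrast, and the variable names are hypothetical.

```javascript
const lineTerminators = ["\n", "\r", "\u2028", "\u2029"];

// Every line terminator is matched by the \s class.
const allMatchWhitespaceClass = lineTerminators.every((ch) => /\s/.test(ch));

// None of them is matched by `.` when the `s` flag is not set.
const anyMatchesDot = lineTerminators.some((ch) => /./.test(ch));

console.log(allMatchWhitespaceClass); // true
console.log(anyMatchesDot); // false
```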
Comments are used to add hints, notes, suggestions, or warnings to JavaScript code. This can make it easier to read and understand. They can also be used to disable code to prevent it from being executed; this can be a valuable debugging tool.
JavaScript has two ways of adding comments to its code.
The first way is the // comment; this makes all text following it on the same line into a comment. For example:
function comment() {
  // This is a one line JavaScript comment
  console.log('Hello world!');
}
comment();
The second way is the /* */ style, which is much more flexible.
For example, you can use it on a single line:
function comment() {
  /* This is a one line JavaScript comment */
  console.log('Hello world!');
}
comment();
You can also make multiple-line comments, like this:
function comment() {
  /* This comment spans multiple lines. Notice
     that we don't need to end the comment until we're done. */
  console.log('Hello world!');
}
comment();
You can also use it in the middle of a line if you wish, although this can make your code harder to read, so it should be used with caution:
function comment(x) {
  console.log('Hello ' + x /* insert the value of x */ + ' !');
}
comment('world');
In addition, you can use it to disable code to prevent it from running, by wrapping code in a comment, like this:
function comment() {
  /* console.log('Hello world!'); */
}
comment();
In this case, the console.log() call is never issued, since it's inside a comment. Any number of lines of code can be disabled this way.
break
case
catch
class
const
continue
debugger
default
delete
do
else
export
extends
finally
for
function
if
import
in
instanceof
new
return
super
switch
this
throw
try
typeof
var
void
while
with
yield
The following are reserved as future keywords by the ECMAScript specification. They have no special functionality at present, but they might at some future time, so they cannot be used as identifiers.
These are always reserved:
enum
The following are only reserved when they are found in strict mode code:
implements
interface
let
package
private
protected
public
static
The following are only reserved when they are found in module code:
await
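The following sketch shows how the context decides whether these words can be used as identifiers. `parsesAsFunctionBody` is a hypothetical helper, not part of any API; it relies on the `Function` constructor, whose bodies are parsed as non-strict, non-module code unless they opt in with a `'use strict'` directive.

```javascript
// Hypothetical helper: reports whether `source` parses as a function body.
function parsesAsFunctionBody(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// `enum` is always reserved.
console.log(parsesAsFunctionBody("var enum = 1;")); // false

// `implements` is only reserved in strict mode code.
console.log(parsesAsFunctionBody("var implements = 1;")); // true
console.log(parsesAsFunctionBody("'use strict'; var implements = 1;")); // false

// `await` is only reserved in module code, which this body is not.
console.log(parsesAsFunctionBody("var await = 1;")); // true
```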
The following are reserved as future keywords by older ECMAScript specifications (ECMAScript 1 through 3).
abstract
boolean
byte
char
double
final
float
goto
int
long
native
short
synchronized
throws
transient
volatile
Additionally, the literals null, true, and false cannot be used as identifiers in ECMAScript.
Reserved words actually only apply to Identifiers (as opposed to IdentifierNames). As described in es5.github.com/#A.1, the following are all IdentifierNames, which do not exclude ReservedWords.
a.import
a['import']
a = { import: 'test' }
On the other hand, the following is illegal because it's an Identifier, which is an IdentifierName without the reserved words. Identifiers are used for FunctionDeclaration, FunctionExpression, VariableDeclaration and so on. IdentifierNames are used for MemberExpression, CallExpression and so on.
function import() {} // Illegal.
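A small sketch of the distinction, assuming an ES5 or later engine. The names `record`, `viaDot`, `viaBracket`, and `declarationError` are hypothetical; the illegal declaration is passed to the `Function` constructor as a string so that the rest of the example still runs.

```javascript
// Reserved words are fine as property names (IdentifierNames).
const record = { import: "test" };
const viaDot = record.import;
const viaBracket = record["import"];

// Using a reserved word as a function name (an Identifier) is a SyntaxError.
let declarationError = null;
try {
  new Function("function import() {}");
} catch (error) {
  declarationError = error;
}

console.log(viaDot, viaBracket); // test test
console.log(declarationError instanceof SyntaxError); // true
```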
See also null for more information.
null
See also Boolean for more information.
true
false
1234567890
42

// Caution when using with a leading zero:
0888 // 888 parsed as decimal
0777 // parsed as octal, 511 in decimal
Note that decimal literals can start with a zero (0) followed by another decimal digit, but if all digits after the leading 0 are smaller than 8, the number is interpreted as an octal number. This won't throw in non-strict JavaScript code; see bug 957513. See also the page about parseInt().
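The leading-zero behavior can be sketched as follows. The literals are evaluated through the `Function` constructor because its bodies are non-strict by default; the strict mode check goes beyond what the text above covers and is included as an extra caution. All variable names are hypothetical.

```javascript
// Non-strict function bodies accept literals with a leading zero.
const legacyOctal = new Function("return 0777;")();
const decimalWithLeadingZero = new Function("return 0888;")();

// In strict mode the same literal is rejected when the body is parsed.
let strictModeError = null;
try {
  new Function("'use strict'; return 0777;");
} catch (error) {
  strictModeError = error;
}

console.log(legacyOctal); // 511
console.log(decimalWithLeadingZero); // 888
console.log(strictModeError instanceof SyntaxError); // true

// parseInt() with an explicit radix avoids the ambiguity for string input.
console.log(parseInt("0777", 10)); // 777
```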
Binary number syntax uses a leading zero followed by a lowercase or uppercase Latin letter "B" (0b or 0B). Because this syntax is new in ECMAScript 2015, see the browser compatibility table below. If the digits after the 0b are not 0 or 1, the following SyntaxError is thrown: "Missing binary digits after 0b".
var FLT_SIGNBIT  = 0b10000000000000000000000000000000; // 2147483648
var FLT_EXPONENT = 0b01111111100000000000000000000000; // 2139095040
var FLT_MANTISSA = 0B00000000011111111111111111111111; // 8388607
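A brief sketch relating binary literals to their string forms, assuming an ES2015 engine. The name `flags` is hypothetical, and the invalid literal is parsed from a string so that the error can be caught; the exact error message varies between engines.

```javascript
const flags = 0b1010;

console.log(flags); // 10
console.log(flags.toString(2)); // "1010"
console.log(Number.parseInt("1010", 2)); // 10

// A digit other than 0 or 1 after 0b is rejected at parse time.
let invalidBinaryError = null;
try {
  new Function("return 0b2;");
} catch (error) {
  invalidBinaryError = error;
}
console.log(invalidBinaryError instanceof SyntaxError); // true
```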
Octal number syntax uses a leading zero followed by a lowercase or uppercase Latin letter "O" (0o or 0O). Because this syntax is new in ECMAScript 2015, see the browser compatibility table below. If the digits after the 0o are outside the range (01234567), the following SyntaxError is thrown: "Missing octal digits after 0o".
var n = 0O755; // 493
var m = 0o644; // 420

// Also possible with just a leading zero (see note about decimals above)
0755
0644
Hexadecimal number syntax uses a leading zero followed by a lowercase or uppercase Latin letter "X" (0x or 0X). If the digits after 0x are outside the range (0123456789ABCDEF), the following SyntaxError is thrown: "Identifier starts immediately after numeric literal".
0xFFFFFFFFFFFFFFFFF // 295147905179352830000
0x123456789ABCDEF   // 81985529216486900
0XA                 // 10
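The rounded values in the comments above come from the precision of the Number type rather than from the hexadecimal syntax. A minimal sketch, with hypothetical variable names:

```javascript
const smallHex = 0XA;
const largeHex = 0x123456789ABCDEF;

console.log(smallHex); // 10

// The large literal is beyond the safe integer range, so it is stored as
// the nearest representable double.
console.log(Number.isSafeInteger(largeHex)); // false
console.log(largeHex === 0x123456789ABCDF0); // true
console.log(String(largeHex)); // "81985529216486900"
```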
See also Object and Object initializer for more information.
var o = { a: 'foo', b: 'bar', c: 42 };

// shorthand notation. New in ES2015
var a = 'foo', b = 'bar', c = 42;
var o = {a, b, c};

// instead of
var o = { a: a, b: b, c: c };
See also Array for more information.
[1954, 1974, 1990, 2014]
A string literal is zero or more Unicode code points enclosed in single or double quotes. Unicode code points may also be represented by an escape sequence. All code points may appear literally in a string literal except for the closing quote code point, U+005C (backslash), U+000D <CR>, and U+000A <LF>.
Prior to the proposal to make all JSON text valid ECMA-262, U+2028 <LS> and U+2029 <PS> were also disallowed from appearing unescaped in string literals.
Any code points may appear in the form of an escape sequence. String literals evaluate to ECMAScript String values. When generating these String values, Unicode code points are UTF-16 encoded.
'foo'
"bar"
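The rules about which code points may appear unescaped can be probed with a sketch like the one below. `stringLiteralParses` is a hypothetical helper; the last check assumes an engine that implements the JSON superset proposal mentioned above (ES2019 or later).

```javascript
// Hypothetical helper: reports whether `body` parses as a function body.
function stringLiteralParses(body) {
  try {
    new Function(body);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// An escaped line feed inside the literal is fine.
console.log(stringLiteralParses("return 'a\\nb';")); // true

// A raw line feed inside the literal is not.
console.log(stringLiteralParses("return 'a\nb';")); // false

// A raw U+2028 inside the literal is accepted by ES2019+ engines.
console.log(stringLiteralParses("return 'a\u2028b';"));
```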
Hexadecimal escape sequences consist of \x followed by exactly two hexadecimal digits representing a code unit or code point in the range 0x0000 to 0x00FF.
'\xA9' // "©"
A Unicode escape sequence consists of exactly four hexadecimal digits following \u. It represents a code unit in the UTF-16 encoding. For code points U+0000 to U+FFFF, the code unit is equal to the code point. Code points U+10000 to U+10FFFF require two escape sequences representing the two code units (a surrogate pair) used to encode the character; the surrogate pair is distinct from the code point.
See also String.fromCharCode() and String.prototype.charCodeAt().
'\u00A9' // "©" (U+A9)
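For code points up to U+00FF, the hexadecimal and Unicode escape forms produce the same string. A short sketch with hypothetical variable names:

```javascript
const viaHexEscape = "\xA9";
const viaUnicodeEscape = "\u00A9";

console.log(viaHexEscape === viaUnicodeEscape); // true
console.log(viaHexEscape.length); // 1
console.log(viaHexEscape.charCodeAt(0).toString(16)); // "a9"
console.log(String.fromCharCode(0x00a9) === viaUnicodeEscape); // true
```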
A Unicode code point escape consists of \u{, followed by a code point in hexadecimal base, followed by }. The value of the hexadecimal digits must be in the range 0 to 0x10FFFF inclusive. Code points in the range U+10000 to U+10FFFF do not need to be represented as a surrogate pair. Code point escapes were added to JavaScript in ECMAScript 2015 (ES6).
See also String.fromCodePoint() and String.prototype.codePointAt().
'\u{2F804}' // CJK COMPATIBILITY IDEOGRAPH-2F804 (U+2F804)

// the same character represented as a surrogate pair
'\uD87E\uDC04'
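The equivalence between the two notations can be checked directly, assuming an ES2015 engine. The variable names are hypothetical.

```javascript
const viaCodePointEscape = "\u{2F804}";
const viaSurrogatePair = "\uD87E\uDC04";

// Both notations produce the same two UTF-16 code units.
console.log(viaCodePointEscape === viaSurrogatePair); // true
console.log(viaCodePointEscape.length); // 2

// codePointAt() reads the whole code point; charCodeAt() reads one code unit.
console.log(viaCodePointEscape.codePointAt(0).toString(16)); // "2f804"
console.log(viaCodePointEscape.charCodeAt(0).toString(16)); // "d87e"
console.log(String.fromCodePoint(0x2f804) === viaCodePointEscape); // true
```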
See also RegExp for more information.
/ab+c/g

// An "empty" regular expression literal
// The empty non-capturing group is necessary
// to avoid ambiguity with single-line comments.
/(?:)/
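A small sketch of both literals in use. The variable names and the sample input string are hypothetical.

```javascript
// The empty pattern matches any string, including the empty one.
const emptyPattern = /(?:)/;
console.log(emptyPattern.test("anything")); // true
console.log(emptyPattern.test("")); // true

// An empty pattern built with the constructor reports the same source.
console.log(new RegExp("").source); // "(?:)"

// The g flag makes match() return every match.
const globalMatches = "abbc abc".match(/ab+c/g);
console.log(globalMatches); // ["abbc", "abc"]
```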
See also template strings for more information.
`string text`

`string text line 1
 string text line 2`

`string text ${expression} string text`

tag `string text ${expression} string text`
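A minimal sketch of what the forms above evaluate to, assuming an ES2015 engine. The `tag` function here is a hypothetical example that wraps each substituted value in square brackets; a real tag function can return anything.

```javascript
const expression = 6 * 7;

// Substitutions are evaluated and converted to strings.
const interpolated = `string text ${expression} string text`;

// A line break inside the backticks becomes part of the string.
const multiLine = `string text line 1
string text line 2`;

// Hypothetical tag: receives the literal chunks and the substituted values.
function tag(strings, ...values) {
  return strings
    .map((chunk, i) => chunk + (i < values.length ? `[${values[i]}]` : ""))
    .join("");
}
const tagged = tag`string text ${expression} string text`;

console.log(interpolated); // "string text 42 string text"
console.log(multiLine.includes("\n")); // true
console.log(tagged); // "string text [42] string text"
```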
Some JavaScript statements must be terminated with semicolons and are therefore affected by automatic semicolon insertion (ASI):
let, const, variable statement
import, export, module declaration
debugger
continue, break, throw
return
The ECMAScript specification mentions three rules of semicolon insertion.
1. A semicolon is inserted before a token that is not allowed by the grammar when that token is preceded by a line terminator or is "}".
{ 1
2 } 3

// is transformed by ASI into

{ 1
;2 ;} 3;
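The line break between `1` and `2` matters for this rule, which the following sketch tries to show. `parsesWithAsi` is a hypothetical helper that hands the source to the `Function` constructor as a function body.

```javascript
// Hypothetical helper: reports whether `source` parses as a function body.
function parsesWithAsi(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// With a line terminator before `2`, semicolons can be inserted.
console.log(parsesWithAsi("{ 1\n2 } 3")); // true

// On a single line, `2` is an offending token with no line terminator
// before it, so no semicolon is inserted and parsing fails.
console.log(parsesWithAsi("{ 1 2 } 3")); // false
```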
2. A semicolon is inserted at the end, when the end of the input stream of tokens is detected and the parser is unable to parse the single input stream as a complete program.
Here ++ is not treated as a postfix operator applying to variable b, because a line terminator occurs between b and ++.
a = b
++c

// is transformed by ASI into

a = b;
++c;
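Running the example with concrete values makes the effect visible. The initial values below are assumptions chosen for illustration.

```javascript
let a = 0;
let b = 1;
let c = 2;

// Parsed as `a = b; ++c;` because of the line terminator before ++.
a = b
++c

console.log(a, b, c); // 1 1 3
```

If `++` had applied to `b` as a postfix operator, `b` would be 2 and `c` would still be 2.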
3. A semicolon is inserted at the end, when a statement with restricted productions in the grammar is followed by a line terminator. These statements with "no LineTerminator here" rules are:
Postfix expressions (++ and --)
continue
break
return
yield, yield*
module
return
a + b

// is transformed by ASI into

return;
a + b;
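The practical consequence for `return` can be sketched as follows. Both function names are hypothetical; wrapping the expression in parentheses is one common way to keep a multi-line return value attached to the statement.

```javascript
// A semicolon is inserted right after `return`, so the sum is never returned.
function sumWithLineBreak(x, y) {
  return
  x + y;
}

// The opening parenthesis stays on the same line as `return`.
function sumWithParentheses(x, y) {
  return (
    x + y
  );
}

console.log(sumWithLineBreak(1, 2)); // undefined
console.log(sumWithParentheses(1, 2)); // 3
```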
Specification | Status | Comment |
---|---|---|
ECMAScript 1st Edition (ECMA-262) | Standard | Initial definition. |
ECMAScript 5.1 (ECMA-262) The definition of 'Lexical Conventions' in that specification. | Standard | |
ECMAScript 2015 (6th Edition, ECMA-262) The definition of 'Lexical Grammar' in that specification. | Standard | Added: Binary and Octal Numeric literals, Unicode code point escapes, Templates |
ECMAScript Latest Draft (ECMA-262) The definition of 'Lexical Grammar' in that specification. | Draft | |
Desktop | Chrome | Edge | Firefox | Internet Explorer | Opera | Safari |
---|---|---|---|---|---|---|
Array literals ([1, 2, 3]) | Yes | Yes | 1 | Yes | Yes | Yes |
Binary numeric literals (0b) | 41 | 12 | 25 | No | 28 | 9 |
Boolean literals (true/false) | Yes | Yes | 1 | Yes | Yes | Yes |
Decimal numeric literals (1234567890) | Yes | Yes | 1 | Yes | Yes | Yes |
Hexadecimal escape sequences ('\xA9') | Yes | Yes | 1 | Yes | Yes | Yes |
Hexadecimal numeric literals (0xAF) | Yes | Yes | 1 | Yes | Yes | Yes |
Null literal (null) | Yes | Yes | 1 | Yes | Yes | Yes |
Octal numeric literals (0o) | 41 | 12 | 25 | Yes | 28 | 9 |
Regular expression literals (/ab+c/g) | Yes | Yes | 1 | Yes | Yes | Yes |
String literals ('Hello world') | Yes | Yes | 1 | Yes | Yes | Yes |
Unicode escape sequences ('\u00A9') | Yes | Yes | 1 | Yes | Yes | Yes |
Unicode code point escapes (\u{}) | 44 | 12 | 40 | No | 31 | 9 |
Shorthand notation for object literals | 43 | 12 | 33 | No | 30 | 9 |
Template literals | 41 | 12 | 34 | No | 28 | 9 |
Trailing commas | Yes | Yes | 1 | Yes | Yes | Yes |
Mobile | Android webview | Chrome for Android | Edge Mobile | Firefox for Android | Opera for Android | iOS Safari | Samsung Internet |
---|---|---|---|---|---|---|---|
Array literals ([1, 2, 3]) | Yes | Yes | Yes | 4 | Yes | Yes | Yes |
Binary numeric literals (0b) | 41 | 41 | 12 | 25 | 28 | Yes | 4.0 |
Boolean literals (true/false) | Yes | Yes | Yes | 4 | Yes | Yes | Yes |
Decimal numeric literals (1234567890) | Yes | Yes | Yes | 4 | Yes | Yes | Yes |
Hexadecimal escape sequences ('\xA9') | Yes | Yes | Yes | 4 | Yes | Yes | Yes |
Hexadecimal numeric literals (0xAF) | Yes | Yes | Yes | 4 | Yes | Yes | Yes |
Null literal (null) | Yes | Yes | Yes | 4 | Yes | Yes | Yes |
Octal numeric literals (0o) | 41 | 41 | 12 | 25 | 28 | Yes | 4.0 |
Regular expression literals (/ab+c/g) | Yes | Yes | Yes | 4 | Yes | Yes | Yes |
String literals ('Hello world') | Yes | Yes | Yes | 4 | Yes | Yes | Yes |
Unicode escape sequences ('\u00A9') | Yes | Yes | Yes | 4 | Yes | Yes | Yes |
Unicode code point escapes (\u{}) | 44 | 44 | 12 | 40 | 31 | ? | 4.0 |
Shorthand notation for object literals | 43 | 43 | 12 | 33 | 30 | ? | 4.0 |
Template literals | 41 | 41 | 12 | 34 | 28 | ? | 4.0 |
Trailing commas | Yes | Yes | Yes | 4 | Yes | Yes | Yes |
Server | Node.js |
---|---|
Array literals ([1, 2, 3]) | Yes |
Binary numeric literals (0b) | 4.0.0 |
Boolean literals (true/false) | Yes |
Decimal numeric literals (1234567890) | Yes |
Hexadecimal escape sequences ('\xA9') | Yes |
Hexadecimal numeric literals (0xAF) | Yes |
Null literal (null) | Yes |
Octal numeric literals (0o) | Yes |
Regular expression literals (/ab+c/g) | Yes |
String literals ('Hello world') | Yes |
Unicode escape sequences ('\u00A9') | Yes |
Unicode code point escapes (\u{}) | Yes |
Shorthand notation for object literals | Yes |
Template literals | 4.0.0 |
Trailing commas | Yes |
Boolean
Number
RegExp
String
© 2005–2018 Mozilla Developer Network and individual contributors.
Licensed under the Creative Commons Attribution-ShareAlike License v2.5 or later.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Lexical_grammar