Proposal proposal-json-superset

Stage 4 Draft / May 22, 2018

Subsume JSON

1 String Literals

Note 1

A string literal is zero or more Unicode code points enclosed in single or double quotes. Unicode code points may also be represented by an escape sequence. All code points may appear literally in a string literal except for the closing quote code points, U+005C (REVERSE SOLIDUS), U+000D (CARRIAGE RETURN), and U+000A (LINE FEED). Any code points may appear in the form of an escape sequence. String literals evaluate to ECMAScript String values. When generating these String values, Unicode code points are UTF-16 encoded as defined in 10.1.1. Code points belonging to the Basic Multilingual Plane are encoded as a single code unit element of the string. All other code points are encoded as two code unit elements of the string.
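The UTF-16 encoding rule in this note can be observed directly. The following is a small illustrative sketch; the helper name codeUnits and the sample code points are chosen here for illustration and do not come from the specification text.

```javascript
// A code point in the Basic Multilingual Plane: one code unit.
const bmp = "\u00E9"; // U+00E9

// A code point outside the BMP: two code units (a surrogate pair).
const astral = "\u{1F600}"; // U+1F600

// Hypothetical helper: list the code unit elements of a String value.
const codeUnits = (s) =>
  Array.from({ length: s.length }, (_, i) => s.charCodeAt(i));

const bmpUnits = codeUnits(bmp);       // one element: 0x00E9
const astralUnits = codeUnits(astral); // two elements: 0xD83D, 0xDE00

console.log(bmpUnits.length, astralUnits.length);
```

The length of a String value counts code units, so astral.length is 2 even though the literal denotes a single code point.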

Syntax

StringLiteral ::
    " DoubleStringCharacters_opt "
    ' SingleStringCharacters_opt '

DoubleStringCharacters ::
    DoubleStringCharacter DoubleStringCharacters_opt

SingleStringCharacters ::
    SingleStringCharacter SingleStringCharacters_opt

DoubleStringCharacter ::
    SourceCharacter but not one of " or \ or LineTerminator
    <LS>
    <PS>
    \ EscapeSequence
    LineContinuation

SingleStringCharacter ::
    SourceCharacter but not one of ' or \ or LineTerminator
    <LS>
    <PS>
    \ EscapeSequence
    LineContinuation

LineContinuation ::
    \ LineTerminatorSequence

EscapeSequence ::
    CharacterEscapeSequence
    0 [lookahead ∉ DecimalDigit]
    HexEscapeSequence
    UnicodeEscapeSequence

A conforming implementation, when processing strict mode code, must not extend the syntax of EscapeSequence to include LegacyOctalEscapeSequence as described in B.1.2.
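The <LS> and <PS> alternatives in the grammar above can be exercised by evaluating source text that contains those code points unescaped. This sketch assumes an engine that implements this proposal (it was incorporated into ECMAScript 2019); the variable names are illustrative only.

```javascript
const LS = String.fromCharCode(0x2028); // LINE SEPARATOR
const PS = String.fromCharCode(0x2029); // PARAGRAPH SEPARATOR

// Source text for string literals that contain the raw code points.
const doubleQuotedSource = '"a' + LS + 'b"';
const singleQuotedSource = "'a" + PS + "b'";

// Indirect eval parses each source text as a Script.
const fromDouble = (0, eval)(doubleQuotedSource);
const fromSingle = (0, eval)(singleQuotedSource);

// EscapeSequence :: 0 [lookahead not in DecimalDigit] denotes U+0000.
const nul = (0, eval)('"\\0"');

console.log(fromDouble.length, fromSingle.length, nul.charCodeAt(0));
```

On an engine that predates the proposal, the first two evaluations would be expected to throw a SyntaxError instead.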

CharacterEscapeSequence ::
    SingleEscapeCharacter
    NonEscapeCharacter

SingleEscapeCharacter :: one of
    ' " \ b f n r t v

NonEscapeCharacter ::
    SourceCharacter but not one of EscapeCharacter or LineTerminator

EscapeCharacter ::
    SingleEscapeCharacter
    DecimalDigit
    x
    u

HexEscapeSequence ::
    x HexDigit HexDigit

UnicodeEscapeSequence ::
    u Hex4Digits
    u{ HexDigits }

Hex4Digits ::
    HexDigit HexDigit HexDigit HexDigit

The definition of the nonterminal HexDigit is given in 11.8.3. SourceCharacter is defined in 10.1.
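As a brief illustration of the escape sequence productions, the following literals all denote the same one-element String value. The variable names are illustrative.

```javascript
const viaHex = "\x41";         // HexEscapeSequence :: x HexDigit HexDigit
const viaUnicode4 = "\u0041";  // UnicodeEscapeSequence :: u Hex4Digits
const viaCodePoint = "\u{41}"; // UnicodeEscapeSequence :: u{ HexDigits }
const viaNonEscape = "\A";     // NonEscapeCharacter: the backslash contributes nothing

console.log(viaHex, viaUnicode4, viaCodePoint, viaNonEscape);
```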

Note 2

<LF> and <CR> cannot appear in a string literal, except as part of a LineContinuation to produce the empty code points sequence. The proper way to include either in the String value of a string literal is to use an escape sequence such as \n or \u000A.
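The following sketch illustrates this note: a LineContinuation contributes nothing to the String value, escape sequences put a line feed into the value, and an unescaped line feed or carriage return inside a literal is rejected. The helper isSyntaxError is hypothetical and exists only for this example.

```javascript
// A LineContinuation (backslash followed by a line terminator sequence)
// produces the empty code points sequence.
const continued = "ab\
cd";

// Escape sequences are the way to include a line feed in the value.
const withEscape = "ab\ncd";
const withUnicodeEscape = "ab\u000Acd";

// Hypothetical helper: does evaluating this source text raise a SyntaxError?
function isSyntaxError(source) {
  try {
    (0, eval)(source);
    return false;
  } catch (e) {
    return e instanceof SyntaxError;
  }
}

// The strings passed below contain a raw <LF> or <CR> between the quotes.
const rawLfRejected = isSyntaxError('"ab\ncd"');
const rawCrRejected = isSyntaxError('"ab\rcd"');

console.log(continued, rawLfRejected, rawCrRejected);
```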

1.1 Static Semantics: Early Errors

UnicodeEscapeSequence :: u{ HexDigits }
  • It is a Syntax Error if the MV of HexDigits > 0x10FFFF.
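This early error can be checked by compiling source text without running it. In the sketch below, the helper name parses is an assumption made for illustration; it relies on the Function constructor reporting early errors as SyntaxError exceptions.

```javascript
// Hypothetical helper: true if the source text parses as a function body.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

const atLimit = parses('"\\u{10FFFF}"');        // MV equals 0x10FFFF: allowed
const overLimit = parses('"\\u{110000}"');      // MV exceeds 0x10FFFF: early error
const leadingZeros = parses('"\\u{00000041}"'); // only the MV matters, not the digit count

console.log(atLimit, overLimit, leadingZeros);
```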

1.2 Static Semantics: StringValue

StringLiteral ::
    " DoubleStringCharacters_opt "
    ' SingleStringCharacters_opt '
  1. Return the String value whose elements are the SV of this StringLiteral.

1.3 Static Semantics: SV

A string literal stands for a value of the String type. The String value (SV) of the literal is described in terms of code unit values contributed by the various parts of the string literal. As part of this process, some Unicode code points within the string literal are interpreted as having a mathematical value (MV), as described below or in 11.8.3.

Table 1: String Single Character Escape Sequences
Escape Sequence   Code Unit Value   Unicode Character Name   Symbol
\b                0x0008            BACKSPACE                <BS>
\t                0x0009            CHARACTER TABULATION     <HT>
\n                0x000A            LINE FEED (LF)           <LF>
\v                0x000B            LINE TABULATION          <VT>
\f                0x000C            FORM FEED (FF)           <FF>
\r                0x000D            CARRIAGE RETURN (CR)     <CR>
\"                0x0022            QUOTATION MARK           "
\'                0x0027            APOSTROPHE               '
\\                0x005C            REVERSE SOLIDUS          \
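The code unit values in Table 1 can be confirmed with a short check. The array singleEscapes below simply restates the table; its name and shape are choices made for this sketch.

```javascript
// Each entry pairs a literal written with a single character escape
// with the code unit value listed in Table 1.
const singleEscapes = [
  ["\b", 0x0008],
  ["\t", 0x0009],
  ["\n", 0x000a],
  ["\v", 0x000b],
  ["\f", 0x000c],
  ["\r", 0x000d],
  ["\"", 0x0022],
  ["\'", 0x0027],
  ["\\", 0x005c],
];

// Collect any entry whose String value is not exactly that one code unit.
const mismatches = singleEscapes.filter(
  ([literal, codeUnit]) =>
    literal.length !== 1 || literal.charCodeAt(0) !== codeUnit
);

console.log(mismatches.length);
```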

2 JSON.parse ( text [ , reviver ] )

The parse function parses a JSON text (a JSON-formatted String) and produces an ECMAScript value. The JSON format represents literals, arrays, and objects with a syntax similar to the syntax for ECMAScript literals, Array Initializers, and Object Initializers. After parsing, JSON objects are realized as ECMAScript objects. JSON arrays are realized as ECMAScript Array instances. JSON strings, numbers, booleans, and null are realized as ECMAScript Strings, Numbers, Booleans, and null.

The optional reviver parameter is a function that takes two parameters, key and value. It can filter and transform the results. It is called with each of the key/value pairs produced by the parse, and its return value is used instead of the original value. If it returns what it received, the structure is not modified. If it returns undefined then the property is deleted from the result.
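A short example of the three reviver outcomes described above (delete, transform, leave unchanged). The JSON text and property names are invented for illustration.

```javascript
const text =
  '{"name":"widget","price":"12.50","internal":true,"tags":["a","b"]}';

const visited = []; // records the order in which keys reach the reviver

const result = JSON.parse(text, function (key, value) {
  visited.push(key);
  if (key === "internal") return undefined;  // property is deleted
  if (key === "price") return Number(value); // value is transformed
  return value;                              // structure is left as-is
});

console.log(JSON.stringify(result));
console.log(visited);
```

Nested values are visited before the object that holds them, and the final call receives the empty String as its key, with the whole result as its value.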

  1. Let JText be ? ToString(text).
  2. Parse JText interpreted as UTF-16 encoded Unicode points (6.1.4) as a JSON text as specified in ECMA-404. Throw a SyntaxError exception if JText is not a valid JSON text as defined in that specification.
  3. Let scriptText be the result of concatenating "(", JText, and ");".
  4. Let completion be the result of parsing and evaluating scriptText as if it was the source text of an ECMAScript Script, but using the alternative definition of DoubleStringCharacter provided below. The extended PropertyDefinitionEvaluation semantics defined in B.3.1 must not be used during the evaluation.
  5. Let unfiltered be completion.[[Value]].
  6. Assert: unfiltered is either a String, Number, Boolean, Null, or an Object that is defined by either an ArrayLiteral or an ObjectLiteral.
  7. If IsCallable(reviver) is true, then
    1. Let root be ObjectCreate(%ObjectPrototype%).
    2. Let rootName be the empty String.
    3. Let status be CreateDataProperty(root, rootName, unfiltered).
    4. Assert: status is true.
    5. Return ? InternalizeJSONProperty(root, rootName).
  8. Else,
    1. Return unfiltered.

The length property of the parse function is 2.
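The following is a rough sketch of steps 1 through 5 and step 8 of the algorithm above, not an implementation of it. The function name parseViaScript is hypothetical. It uses the built-in JSON.parse as a stand-in for the ECMA-404 validation in step 2, omits the reviver branch in step 7, and does not suppress the B.3.1 semantics, so a "__proto__" key would behave differently than in JSON.parse.

```javascript
// Hypothetical helper, for illustration only.
function parseViaScript(text) {
  // Step 1: a template literal applies ToString to its substitution.
  const jText = `${text}`;

  // Step 2 (stand-in): reject anything that is not valid JSON text.
  JSON.parse(jText); // throws SyntaxError on invalid input

  // Step 3: wrap the text so that it parses as an expression.
  const scriptText = "(" + jText + ");";

  // Steps 4 and 5: parse and evaluate as a Script (indirect eval), and
  // take the resulting value. Unlike the specification, this does not
  // disable the B.3.1 handling of "__proto__" keys.
  const unfiltered = (0, eval)(scriptText);

  // Step 8: with no reviver, the value is returned directly.
  return unfiltered;
}

// The JSON text below contains a raw U+2028 inside a string.
const sample = '{"a":[1,2,"x\u2028y"],"b":null}';
const viaScript = parseViaScript(sample);
const viaBuiltin = JSON.parse(sample);

console.log(JSON.stringify(viaScript) === JSON.stringify(viaBuiltin));
console.log(JSON.parse.length);
```

The evaluation in steps 4 and 5 accepts the raw U+2028 only on an engine whose string literal grammar permits it, which is the case this proposal addresses.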

JSON allows Unicode code units 0x2028 (LINE SEPARATOR) and 0x2029 (PARAGRAPH SEPARATOR) to directly appear in String literals without using an escape sequence. This is enabled by using the following alternative definition of DoubleStringCharacter when parsing scriptText in step 4:

DoubleStringCharacter ::
    SourceCharacter but not one of " or \ or U+0000 through U+001F
    \ EscapeSequence

Note

Valid JSON text is a subset of the ECMAScript PrimaryExpression syntax as modified by step 4 above. Step 2 verifies that JText conforms to that subset, and step 6 verifies that the parsing and evaluation returns a value of an appropriate type.
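The set of code points that may appear unescaped in a JSON string can be probed with JSON.parse. In this sketch the helper jsonRejects is hypothetical; the behavior shown follows from the exclusion of U+0000 through U+001F and the acceptance of U+2028 and U+2029 described above.

```javascript
// Hypothetical helper: does JSON.parse reject this text with a SyntaxError?
function jsonRejects(text) {
  try {
    JSON.parse(text);
    return false;
  } catch (e) {
    return e instanceof SyntaxError;
  }
}

// Raw U+2028 and U+2029 are accepted inside JSON strings.
const rawLS = JSON.parse('"\u2028"');
const rawPS = JSON.parse('"\u2029"');

// Raw control characters in U+0000 through U+001F are rejected.
const rawLineFeedRejected = jsonRejects('"\n"');
const rawTabRejected = jsonRejects('"\t"');

// The escaped forms are accepted.
const escapedLineFeed = JSON.parse('"\\n"');

console.log(rawLS.length, rawLineFeedRejected, rawTabRejected);
```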

A Copyright & Software License

Copyright Notice

© 2018 Richard Gibson

Software License

All Software contained in this document ("Software") is protected by copyright and is being made available under the "BSD License", included below. This Software may be subject to third party rights (rights from parties other than Ecma International), including patent rights, and no licenses under such third party rights are granted under this license even if the third party concerned is a member of Ecma International. SEE THE ECMA CODE OF CONDUCT IN PATENT MATTERS AVAILABLE AT http://www.ecma-international.org/memento/codeofconduct.htm FOR INFORMATION REGARDING THE LICENSING OF PATENT CLAIMS THAT ARE REQUIRED TO IMPLEMENT ECMA INTERNATIONAL STANDARDS.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
  3. Neither the name of the authors nor Ecma International may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE ECMA INTERNATIONAL "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL ECMA INTERNATIONAL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.