YAML as a JSON superset and TAB characters

I can't find a reference to this exact error. YAML 1.2 describes itself as a JSON superset, yet when I use tab characters in a JSON document, the YAML parser treats them as an error.

For example:

"root": {
        "key": "value"
}

(An online validator reports: found character '\t' that cannot start any token.)

I know why YAML historically disallows tabs, but how can I interpret this in the context of JSON-superset?

(E.g., is YAML not an actual superset, or does JSON also disallow tabs? Or does the spec allow tabs in this case, but the implementation is not there yet?)

Thanks.


Tabs ARE allowed in YAML, but only where indentation does not apply.

According to YAML 1.2 Section 5.5:

YAML recognizes two white space characters: space and tab.

The following examples will use · to denote spaces and → to denote tabs. All examples can be validated using the official YAML Reference Parser.
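If you want to render your own files in this notation, a small helper along the following lines may be useful. It is only an illustrative sketch: the name show_indentation is made up here and is not part of any YAML tool, and it marks leading whitespace only.

```python
def show_indentation(text: str) -> str:
    """Show leading spaces as '·' and leading tabs as '→' on every line."""
    rendered = []
    for line in text.split("\n"):
        body = line.lstrip(" \t")
        indent = line[: len(line) - len(body)]
        # Only the indentation is rewritten; the rest of the line is untouched.
        rendered.append(indent.replace(" ", "·").replace("\t", "→") + body)
    return "\n".join(rendered)


print(show_indentation("{\n\troot: {\n  \tkey: value\n\t}\n}"))
```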

YAML has a block style and flow style. In block style, indentation determines the structure of a document. The following document uses block style.

root:
··key: value

In flow style, special characters indicate the structure of the document. The following equivalent document uses flow style.

{
→ root: {
→ → key: value
→ }
}

You can even mix indentation in flow style.

{
→ root: {
··→ key: value
····}
}

If you're mixing block and flow style, the entire flow style part must respect the block style indentation.

root:
··{
····key: value
··}

But you can still mix your indentation within the flow style part.

root:
··{
··→ key: value
··}

If you have a single value document, you can surround the value with all manner of whitespace.

→ ··value··→ 

The point is, every JSON document that is parsed as YAML will put the document into flow style (because of the initial { or [ character) which supports tabs, unless it is a single value JSON document, in which case YAML still allows padding with whitespace.

If a YAML parser throws because of tabs in a JSON document, then it is not a conforming YAML parser.

That being said, your example is failing because, in block style, any part of a mapping value that is not on the same line as its mapping key must be indented.

root: {
··key: value
}

is not valid, however

root:
··{
····key: value
··}

is valid, and

root: { key: value }

is also valid.
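It may also be worth checking whether the snippet is a JSON document in the first place. The sketch below uses Python's standard json module on the text from the question, assuming the indentation there was a single tab; without the enclosing braces, a JSON parser reads the string "root" and then rejects everything after it.

```python
import json

# The snippet from the question, with a tab as indentation and no outer braces.
snippet = '"root": {\n\t"key": "value"\n}'

try:
    json.loads(snippet)
    is_valid_json = True
except json.JSONDecodeError as err:
    is_valid_json = False
    print(f"not a JSON document: {err}")

# Wrapping the same text in braces makes it a JSON object.
wrapped = json.loads("{" + snippet + "}")
print(wrapped)
```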


I know why YAML historically disallows tabs, but how can I interpret this in the context of JSON-superset?

Taking the rest of the specifications into account, we can only conclude that the "superset" comment is inaccurate. The YAML specification is fundamentally inconsistent in the Relation to JSON section:

YAML can therefore be viewed as a natural superset of JSON, offering improved human readability and a more complete information model. This is also the case in practice; every JSON file is also a valid YAML file. This makes it easy to migrate from JSON to YAML if/when the additional features are required.

JSON's RFC4627 requires that mappings keys merely “SHOULD” be unique, while YAML insists they “MUST” be. Technically, YAML therefore complies with the JSON spec, choosing to treat duplicates as an error. In practice, since JSON is silent on the semantics of such duplicates, the only portable JSON files are those with unique keys, which are therefore valid YAML files.

Despite asserting that YAML is a "natural superset of JSON" and stating that "every JSON file is also a valid YAML file", the spec immediately notes a difference regarding key uniqueness. Arguably, it should note the difference around using tabs for indentation here too.

Speaking of which, as the validator implied, YAML explicitly prohibits tabs as indentation characters:

To maintain portability, tab characters must not be used in indentation, since different systems treat tabs differently. Note that most modern editors may be configured so that pressing the tab key results in the insertion of an appropriate number of spaces.

This is, of course, stricter than the JSON specification, which simply states:

Whitespace can be inserted between any pair of tokens.
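On the JSON side this is easy to confirm. The following is a minimal sketch using Python's standard json module, with the question's structure wrapped in braces and indented with tab characters only; the variable names are just for illustration.

```python
import json

# A JSON object whose only indentation characters are tabs.
tabbed_json = '{\n\t"root": {\n\t\t"key": "value"\n\t}\n}'

# Tabs between tokens are ordinary insignificant whitespace for a JSON parser.
parsed = json.loads(tabbed_json)
print(parsed)
```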

So, to directly answer your questions...

(E.g., is YAML not an actual superset, or does JSON also disallow tabs? Or does the spec allow tabs in this case, but the implementation is not there yet?)

...YAML is not actually a superset: JSON does not disallow tabs, whereas the YAML specification explicitly disallows tabs for indentation.
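If you have tab-indented JSON and a YAML parser that rejects it, one possible workaround is to re-serialize the JSON with space-only indentation before handing it over. The sketch below does that with Python's standard json module; normalize_json_whitespace is a hypothetical helper name, and the round trip discards the original formatting (and collapses duplicate keys), so treat it as a starting point rather than a general fix.

```python
import json


def normalize_json_whitespace(text: str, indent: int = 2) -> str:
    """Parse JSON and write it back out indented with spaces only."""
    # json.dumps indents with spaces when given an integer indent, and it
    # escapes any tab inside a string value as the two characters \t.
    return json.dumps(json.loads(text), indent=indent)


tabbed_json = '{\n\t"root": {\n\t\t"key": "value"\n\t}\n}'
normalized = normalize_json_whitespace(tabbed_json)
print(normalized)
```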


According to the specification, tabs were never allowed as indentation. So when JSON is used inside YAML, tabs are not allowed there either.

The problem arises when we think of JSON as a pure subset of YAML. It is not: according to the Relation to JSON section of the specification, a few small differences keep JSON from being a pure subset of YAML.

To address those differences, we would need something like YSON, which is also mentioned in the spec.

Fortunately, some YAML engines do support tabs as indentation. SnakeYAML is one example.
