Internet-Draft JSON Schema extension to NTV data February 2024
THOMY Expires 4 August 2024
Workgroup:
Internet Engineering Task Force
Internet-Draft:
draft-thomy-ntv-schema-00
Published:
Intended Status:
Informational
Expires:
Author:
P. THOMY
Loco-labs

JSON Schema extension to NTV data

Abstract

The NTV format is an extension of the JSON format that adds a semantic dimension through the notion of type. This format remains compatible with the current JSON format, but its compatibility with data schemas, and its impact on them, need to be examined. This document provides some answers to this question and presents some possible developments, based mainly on the example of JSON Schema and additionally on the example of OpenAPI.

Note to Readers

This document is a working document and not a specification document. The developments and principles presented have been validated by a Python implementation based on the jsonschema module (https://nbviewer.org/github/loco-philippe/NTV/blob/main/RFC/example_schema.ipynb).

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 4 August 2024.

1. Introduction - Conclusion

The primitives of the JSON format (string, number, boolean, null) carry a low semantic level.

To represent information with a high semantic level two mechanisms are used:

JSON data schemas are a particular form of representation of encoding/decoding.

The analysis and validation by prototyping of the developments presented in this document show us that:

[NTV_SCHEMA] contains the Python implementation of the examples presented in this document (examples 1 to 6) as well as more complete examples (examples 7, 8 and 9).

2. Conventions Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

This document also uses the following terms:

JsonText, JsonValue, JsonObject, JsonMember, JsonElement, JsonArray, JsonNumber, JsonString, JsonFalse, JsonNull, JsonTrue:
These terms are defined in [JSON_NTV].
NTV, NTVlist, NTVsingle, NTVname, NTVtype, NTVvalue, JsonNTVname, ntv-pointer:
These terms are defined in [JSON_NTV].
json-pointer:
This term is defined in [RFC6901].

3. Presentation

3.1. NTV

The NTV structure [JSON_NTV] consists of representing data with three attributes: a name (NTVname: json-string), a value (NTVvalue: json-value) and a type (NTVtype: json-string).

The JSON representation is obtained by grouping the NTVname and the NTVtype into a JsonNTVname and associating it with the NTVvalue in a key/value form, for example:

{"firstname:string": "peter"}
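
The grouping described above can be illustrated by a minimal sketch that splits a JsonNTVname back into its two parts. The function name split_json_ntv_name is hypothetical, and the sketch assumes that the first ":" is the separator; other separator forms defined in [JSON_NTV] and names that themselves contain ":" are not handled.

```python
def split_json_ntv_name(json_ntv_name):
    """Split a JsonNTVname into (NTVname, NTVtype); a missing part is None.

    Assumption: the first ":" separates the NTVname from the NTVtype.
    """
    name, _, ntv_type = json_ntv_name.partition(":")
    return (name or None, ntv_type or None)


name, ntv_type = split_json_ntv_name("firstname:string")
print(name, ntv_type)
```

A key without separator, such as "family", yields an NTVname and no NTVtype.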

NTV distinguishes between two types of entity:

The NTVtype can be simple (e.g. 'int') or structured from nested namespaces (e.g. 'org.Person.givenName' for a type defined by Schema.org).

An NTV structure is therefore a tree composed of inner-nodes (NTVlist) and leaf-nodes (NTVsingle). Each node of this structure can be uniquely identified by a name (JsonNTVname or NTVname if it is unique) or by its index in the parent node. This identification can be carried out by an ntv-pointer identical to the json-pointer in the majority of cases.

For example ([NTV_SCHEMA] - Example 1),

{"family": "doe", "childrens age": [15, 24, 12] }

is a JSON representation of the NTV structure defined in Table 1.

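Table 1: NTV structure

+-----------+---------------+---------+-----------------+------------------+
| node      | NTVname       | NTVtype | NTVvalue        | ntv-pointer      |
+-----------+---------------+---------+-----------------+------------------+
| NTVlist   | None          | None    | list of 2 nodes | empty string     |
| NTVsingle | family        | json    | doe             | /family          |
| NTVlist   | childrens age | None    | list of 3 nodes | /childrens age   |
| NTVsingle | None          | json    | 15              | /childrens age/0 |
| NTVsingle | None          | json    | 24              | /childrens age/1 |
| NTVsingle | None          | json    | 12              | /childrens age/2 |
+-----------+---------------+---------+-----------------+------------------+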
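
The rows of Table 1 can be reproduced by a short tree walk. This is only a sketch under simplifying assumptions: the function name ntv_rows is hypothetical, every JsonObject or JsonArray is treated as an NTVlist, untyped leaves receive the type "json" as in Table 1, and neither the NTVtype part of a JsonNTVname nor json-pointer escaping is handled.

```python
def ntv_rows(value, name=None, pointer=""):
    """Yield (node, NTVname, NTVtype, NTVvalue, ntv-pointer) rows, depth first."""
    if isinstance(value, dict):
        children = list(value.items())
    elif isinstance(value, list):
        children = [(None, item) for item in value]
    else:
        # Leaf node: an NTVsingle with the default type used in Table 1.
        yield ("NTVsingle", name, "json", value, pointer)
        return
    # Inner node: an NTVlist described by its number of children.
    yield ("NTVlist", name, None, f"list of {len(children)} nodes", pointer)
    for index, (child_name, child) in enumerate(children):
        # A child is addressed by its name when it has one, else by its index.
        token = child_name if child_name is not None else str(index)
        yield from ntv_rows(child, child_name, f"{pointer}/{token}")


example = {"family": "doe", "childrens age": [15, 24, 12]}
for row in ntv_rows(example):
    print(row)
```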

Note:

3.2. JSON Schema

JSON Schema [JSON_SCHEMA] is one of the best-known tools for structuring JSON data. It proposes:

A schema is a JsonObject (or a boolean). The properties of this JsonObject, which are applied to the JSON data to be checked (the instance), are called keywords.

A schema can be represented by a simple association between an instance and a set of constraints to respect:

3.3. OpenAPI

An OpenAPI document is a self-contained or composite resource which defines or describes an API or elements of an API.

OpenAPI specification [OAS] defines a set of objects identified by a keyword.

An object is made up of fields, each of which can be an OpenAPI object or a JsonValue. A JSON Schema is one of the OpenAPI objects (keyword: "schema").

Unlike JSON Schema, keywords are interdependent and can only be used in the context of another keyword.

4. Applicability of schemas to NTV

4.1. Equivalence between JSON and NTV structures

Since any JSON structure is an NTV structure, a first answer is to consider that if the JSON representation of an NTV structure is valid against a schema, then the NTV structure is valid against this schema.

However, this argument does not hold, because there is no one-to-one correspondence between an NTV structure and a JSON structure.

For example, the following instances have different JSON structures and therefore cannot, in general, satisfy the same JSON Schema:

{"number1": [1, 2]}
{"number2": {"val1": 1, "val2": 2}}

However, the NTV structure is identical for these two entities (an NTVlist entity composed of two NTVsingle entities).

4.2. Application of JSON Schema to NTV structure

This chapter presents how to transpose the principles of JSON Schema to an NTV structure.

4.2.1. Instance-schema correspondence

The identification of the instance to be validated is carried out by the json-pointer relating to the JsonObject or JsonArray which contains it (name for "properties" or index for "prefixItems"). This principle can be transposed into access to entities via ntv-pointer (JsonNTVname or index).

It is interesting to note that the json-pointer and the ntv-pointer are identical except for two cases:

  • case 1: named root structure
{"root": { "val1": 21, "pointed": "target"}}

    json-pointer: "/root/pointed"
    ntv-pointer:   "root/pointed"
  • case 2: JsonObject with a single member included in a JsonArray
[10, 20, {"pointed": 30}, 40]

    json-pointer: "/2/pointed"
    ntv-pointer:  "/2" or "/pointed"
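
The difference in case 2 can be made concrete with a minimal sketch. The first function follows the json-pointer evaluation rules of [RFC6901]. The second function, resolve_ntv_step, is hypothetical: it shows one possible way to resolve a single ntv-pointer token inside a JsonArray, assuming that a single-member JsonObject is a named entity reachable by its index or by its name. Case 1 (named root) is not covered.

```python
def resolve_json_pointer(doc, pointer):
    """Resolve an RFC 6901 json-pointer against a parsed JSON value."""
    if pointer == "":
        return doc
    if not pointer.startswith("/"):
        raise ValueError("a non-empty json-pointer must start with '/'")
    node = doc
    for raw in pointer[1:].split("/"):
        # RFC 6901 unescaping: "~1" first, then "~0".
        token = raw.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def resolve_ntv_step(node, token):
    """Hypothetical resolution of one ntv-pointer token (case 2 only)."""
    if not isinstance(node, list):
        return node[token]
    if token.isdigit():
        child = node[int(token)]
    else:
        # Look for a single-member JsonObject whose member name is the token.
        named = [c for c in node if isinstance(c, dict) and len(c) == 1 and token in c]
        child = named[0]
    if isinstance(child, dict) and len(child) == 1:
        # Assumption: the single member is the entity itself, so return its value.
        return next(iter(child.values()))
    return child


data = [10, 20, {"pointed": 30}, 40]
print(resolve_json_pointer(data, "/2/pointed"))
print(resolve_ntv_step(data, "2"), resolve_ntv_step(data, "pointed"))
```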

4.2.2. Scope of keywords

For a JSON instance, a validation keyword applies to the value (JsonMember or JsonElement) or sometimes to the key of a JsonMember (e.g. "propertyNames").

This principle can be transposed to an NTV entity and the constraint expressed in the schema can apply to the NTVvalue or to the NTVname. The keywords which will not apply to the NTVvalue concern entities of type NTVlist and apply to the elements in Table 2.

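Table 2: specific keywords

+------------------------------------+-------------+
| keyword                            | element     |
+------------------------------------+-------------+
| patternProperties                  | ntv-pointer |
| required                           | ntv-pointer |
| propertyNames                      | NTVname     |
| additionalProperties               | NTVentity   |
| unevaluatedProperties              | NTVentity   |
| minProperties, maxProperties       | NTVentity   |
| items                              | NTVentity   |
| minItems, maxItems                 | NTVentity   |
| unevaluatedItems                   | NTVentity   |
| uniqueItems                        | NTVentity   |
| contains, minContains, maxContains | NTVentity   |
+------------------------------------+-------------+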

4.2.3. Application of control keywords

For keywords applying to ntv-pointer, NTVname or NTVvalue elements, the application is identical to that defined for JSON Schema.

For keywords applying to NTVentities, the transposition is direct.

Appendix A presents the transposition of keywords.

4.2.4. Implementation

Applying these principles to an NTV structure makes it possible to apply a JSON Schema in an equivalent manner to JSON instance or to the corresponding NTV instance.

The prototyping carried out confirms this point.

On the other hand, this implementation does not allow the following instances to be validated with the same schema:

{"number1": [1, 2]}
{"number2": {"val1": 1, "val2": 2}}

4.3. Adaptation of a JSON Schema to an NTV instance

In an NTV structure, an NTVlist entity can be represented by a JsonArray or by a JsonObject. For a schema to be fully applicable to an NTV structure, we must be able to apply the keywords "properties" and "items" equally to any type of NTVlist.

With the principles identified in the previous chapter, this usage is valid (an entity is identified by its pointer which can be name or index). Thus, the following NTV instances ([NTV_SCHEMA] - Example 2):

{"number1": [10, 20]}
{"number2": {"val1": 10, "val2": 20}}
{"number3": [ 10, {"val2": 20}]}

are valid for the following schema:

{"properties": {
    "1": {"minimum": 15}},
 "items": {
    "maximum": 30},
 "prefixItems": [
    {"maximum": 15}]}

However, this schema cannot be applied to a JSON instance because a JsonObject is not an ordered structure.
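
The validity of the three instances of Example 2 can be checked with a minimal sketch. It is not the [NTV_SCHEMA] implementation: the names ntv_children, in_range and is_valid are hypothetical, only the "minimum" and "maximum" constraints are supported, and the schema is assumed to apply to the NTVlist carried by the single named root entity. The index of a JsonObject member relies on the insertion order kept by Python dictionaries.

```python
def ntv_children(value):
    """Return (name, value) pairs for the children of an NTVlist."""
    if isinstance(value, dict):
        return list(value.items())
    pairs = []
    for item in value:
        if isinstance(item, dict) and len(item) == 1:
            # A single-member JsonObject in a JsonArray is a named child.
            pairs.append(next(iter(item.items())))
        else:
            pairs.append((None, item))
    return pairs


def in_range(value, schema):
    """Check the optional "minimum" and "maximum" constraints."""
    return schema.get("minimum", value) <= value <= schema.get("maximum", value)


def is_valid(instance, schema):
    children = ntv_children(next(iter(instance.values())))
    # "properties": a key matches a child by name or by index.
    for key, sub in schema.get("properties", {}).items():
        for index, (name, value) in enumerate(children):
            if key in (name, str(index)) and not in_range(value, sub):
                return False
    # "prefixItems": positional schemas for the first children.
    prefix = schema.get("prefixItems", [])
    for (_, value), sub in zip(children, prefix):
        if not in_range(value, sub):
            return False
    # "items": schema for the children not covered by "prefixItems".
    if "items" in schema:
        for _, value in children[len(prefix):]:
            if not in_range(value, schema["items"]):
                return False
    return True


schema = {"properties": {"1": {"minimum": 15}},
          "items": {"maximum": 30},
          "prefixItems": [{"maximum": 15}]}
for instance in ({"number1": [10, 20]},
                 {"number2": {"val1": 10, "val2": 20}},
                 {"number3": [10, {"val2": 20}]}):
    print(is_valid(instance, schema))
```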

4.4. Extension to NTVtype and NTVname

Data typing is partially addressed in a JSON Schema (keywords "type" and "format").

Data naming is also partially addressed for JsonObjects (keyword "propertyNames").

For NTV entities, naming and typing are explicit (NTVname and NTVtype) and could be accessible in a schema with two new keywords ("typeNTV" and "nameNTV"). These keywords place a constraint on the NTVname or on the NTVtype rather than on the NTVvalue.

For example, the following instances ([NTV_SCHEMA] - Example 3)

{"location": "paris", "dating:date": "2023-10-01"}
{"location": "paris", "dating:year": 2023}

are valid for the following schema:

{"properties": {
    "dating": {
        "typeNTV": {"enum": ["year", "date", "datetime"]}}},
 "items": {
    "nameNTV": {"maxLength": 10}}}
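
How the two proposed keywords could be evaluated on Example 3 is sketched below. This is an illustration only, not the [NTV_SCHEMA] implementation: the function names are hypothetical, only the "enum" and "maxLength" constraints are supported, "items" is applied to every child, and a JsonNTVname is assumed to be split at its first ":".

```python
def split_name(json_ntv_name):
    """Split a JsonNTVname into (NTVname, NTVtype); a missing part is None."""
    name, _, ntv_type = json_ntv_name.partition(":")
    return name or None, ntv_type or None


def check_text(text, schema):
    """Apply "enum" and "maxLength" to an NTVname or an NTVtype."""
    if "enum" in schema and text not in schema["enum"]:
        return False
    if "maxLength" in schema and text is not None and len(text) > schema["maxLength"]:
        return False
    return True


def is_valid(instance, schema):
    entities = [split_name(key) for key in instance]
    # "typeNTV" constrains the NTVtype of the entity selected by "properties".
    for wanted, sub in schema.get("properties", {}).items():
        for name, ntv_type in entities:
            if name == wanted and "typeNTV" in sub:
                if not check_text(ntv_type, sub["typeNTV"]):
                    return False
    # "nameNTV" constrains the NTVname of every child selected by "items".
    item_schema = schema.get("items", {})
    if "nameNTV" in item_schema:
        for name, _ in entities:
            if not check_text(name, item_schema["nameNTV"]):
                return False
    return True


schema = {"properties": {"dating": {"typeNTV": {"enum": ["year", "date", "datetime"]}}},
          "items": {"nameNTV": {"maxLength": 10}}}
print(is_valid({"location": "paris", "dating:date": "2023-10-01"}, schema))
print(is_valid({"location": "paris", "dating:year": 2023}, schema))
```

An instance such as {"location": "paris", "dating:month": 10} would be rejected by this sketch because "month" is not in the enumeration.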

4.5. Use of "type" and "format" keywords

The "type" and "format" keywords address the same notion as the "typeNTV" keyword but cover two different uses:

It should be noted that in the previous example, the "dating" data can have several formats (the "year" format is not defined in JSON Schema and the "format" is not authorized for numeric data) but also several types ("integer" or "string").

4.6. Separation of keywords and pointers

To distinguish keywords from other names, we can consider that keywords are NTVtypes belonging to an NTV namespace dedicated to schemas (denoted "sch.").

With this option, all keywords are NTVtype and are preceded in JSON representation by the separator ":". This distinction simplifies the schema by making the use of the "properties" and "prefixItems" keywords optional.

For example, the schema of Example 3 then becomes ([NTV_SCHEMA] - Example 4):

{"dating": {
    ":typeNTV": {":enum": ["year", "date", "datetime"]}},
 ":items": {
    ":nameNTV": {":maxLength": 10}}}

Another option is to keep the naming of the keywords but to replace the names of the instance to be controlled by the associated pointer (including the separator).

For example, the schema of Example 3 then becomes ([NTV_SCHEMA] - Example 5):

{"/dating": {
    "typeNTV": {"enum": ["year", "date", "datetime"]}},
 "items": {
    "nameNTV": {"maxLength": 10}}}

This solution seems preferable because it does not call into question the current solution and makes the distinction between name (instance) and pointer (schema) more explicit.

It also allows numeric pointers (e.g. "/0") to be used, and it is applicable to both NTV instances and JSON instances.
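
The second option amounts to a mechanical rewriting of an existing schema, which can be sketched as follows. The function name to_pointer_schema is hypothetical, and the sketch assumes that only the "properties", "prefixItems" and "items" keywords carry sub-schemas; pointer escaping is not handled.

```python
def to_pointer_schema(schema):
    """Replace "properties" and "prefixItems" entries by pointer-named entries."""
    if not isinstance(schema, dict):
        return schema
    result = {}
    for key, value in schema.items():
        if key == "properties":
            # A property name becomes the pointer "/name".
            for name, sub in value.items():
                result["/" + name] = to_pointer_schema(sub)
        elif key == "prefixItems":
            # A position becomes the numeric pointer "/index".
            for index, sub in enumerate(value):
                result["/" + str(index)] = to_pointer_schema(sub)
        elif key == "items":
            result[key] = to_pointer_schema(value)
        else:
            result[key] = value
    return result


example_3 = {"properties": {"dating": {"typeNTV": {"enum": ["year", "date", "datetime"]}}},
             "items": {"nameNTV": {"maxLength": 10}}}
print(to_pointer_schema(example_3))
```

Applied to the schema of Example 3, this rewriting gives the schema of Example 5.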

4.7. Using nested keywords

In the OpenAPI data schema, keywords identify an object. The objects are defined in a tree structure.

For example, the OpenAPI Object (root object) defines a "servers" field which is a set of "server" type objects.

The "server" object defines a "variables" field which is a set of "variable" type objects.

The "variable" object defines a "default" field which is a JsonString.

This tree of objects could be represented in the form of NTVtype:

The following example shown in the OpenAPI documentation ([NTV_SCHEMA] - Example 6):

{"example openAPI":{
   "servers": [
      {
        "url":"https://{username}.gigantic-server.com:{port}",
        "description": "The production API server",
        "variables": {
          "username": {
            "default": "demo",
            "description": "assigned by provider"},
          "port": {"enum": ["8443", "443"], "default": "8443"},
          "basePath": {"default": "v2"}}}]}}

is then translated by the following NTV formulation:

{"example:$openAPI.":{
    "servers.": [
      {
        ":url": "https://{username}.gigantic-server.com:{port}",
        ":description": "The production API server",
        "variables.": {
          "username": {
            ":default": "demo",
            ":description": "assigned by provider"},
          "port": {":enum": ["8443", "443"], ":default": "8443"},
          "basePath": {":default": "v2"}}}]}}

In this example, the NTVtype of the entity with NTVvalue "demo" is:

and its ntv-pointer is:

This organization separates the "fixed fields" from the other fields and is a first implicit validation of the OpenAPI specification.

It also allows you to use the NTVname to include comments.

For example:

{"example:$openAPI.":{
    "servers.": {
      "server1":{
        ":url": "https://{username}.gigantic-server.com:{port}",
        "prod:description": "The production API server",
        "variables.": {
          "username":{
            "user:default": "demo",
            ":description": "assigned by provider"},
          "port": {":enum": ["8443", "443"], ":default": "8443"},
          "basePath": {":default": "v2"}}}}}}

Note: This extension is compatible with the other extensions presented previously.
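
The translation of Example 6 from the OpenAPI form to the NTV formulation can be sketched as a key rewriting. The sets FIXED_FIELDS and NAMESPACES and the function to_ntv are hypothetical and cover only the fields used in the example: a fixed field holding a JSON value becomes an NTVtype (":field"), a field holding objects becomes a namespace ("field."), and any other key is kept as an NTVname. A user-defined name that collides with a fixed field name is not handled, and the naming of the root entity is left out.

```python
FIXED_FIELDS = {"url", "description", "default", "enum"}  # hypothetical subset
NAMESPACES = {"servers", "variables"}                     # hypothetical subset


def to_ntv(value):
    """Rewrite the keys of an OpenAPI fragment into the NTV formulation."""
    if isinstance(value, list):
        return [to_ntv(item) for item in value]
    if not isinstance(value, dict):
        return value
    result = {}
    for key, sub in value.items():
        if key in FIXED_FIELDS:
            new_key = ":" + key   # fixed field: NTVtype without NTVname
        elif key in NAMESPACES:
            new_key = key + "."   # field holding objects: namespace
        else:
            new_key = key         # free name: NTVname
        result[new_key] = to_ntv(sub)
    return result


fragment = {"servers": [{
    "url": "https://{username}.gigantic-server.com:{port}",
    "variables": {"username": {"default": "demo"},
                  "basePath": {"default": "v2"}}}]}
print(to_ntv(fragment))
```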

5. References

5.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC6901]
Bryan, P., Ed., Zyp, K., and M. Nottingham, Ed., "JavaScript Object Notation (JSON) Pointer", RFC 6901, DOI 10.17487/RFC6901, April 2013, <https://www.rfc-editor.org/info/rfc6901>.

5.2. Informative References

[NTV_SCHEMA]
Thomy, P., "Implementation NTV Schema", <https://nbviewer.org/github/loco-philippe/NTV/blob/main/RFC/example_schema.ipynb>.
[JSON_NTV]
Thomy, P., "JSON semantic format (JSON-NTV)", <https://datatracker.ietf.org/doc/draft-thomy-json-ntv/>.
[JSON_SCHEMA]
OpenJS Foundation, "JSON Schema specification", 17 December 2015, <https://json-schema.org/specification>.
[OAS]
OpenAPI Initiative, "OpenAPI Specification", 15 February 2021, <https://github.com/OAI/OpenAPI-Specification/>.

Appendix A. JSON Schema keyword and NTV data

This section describes the application of control keywords to NTV instances.

type: schema applicable to NTVvalue

{ "type": ["number", "string"] }

check if the NTVvalue is a number or a string

length, format, regex: schema applicable to NTVvalue

{ "type": "string", "minLength": 2, "maxLength": 3 }

check the length of the NTVvalue

multiples, range: schema applicable to NTVvalue

{ "type": "number", "multipleOf" : 10 }

check if the NTVvalue is a multiple of 10

properties: schema applicable to the NTVvalue of the NTV entity defined by its relative ntv-pointer

{ "properties": { "street_name": value_schema } }

check the value_schema for the NTVvalue of the NTV entity
defined by "street_name" (relative ntv-pointer)

patternProperties: schema applicable to the NTVvalue of the NTV entity whose relative ntv-pointer matches a pattern.

{ "patternProperties": { "^S_": pat_schema } }

check the pat_schema for the NTVvalue of the NTV entity
whose relative ntv-pointer matches "^S_"

additionalProperties: schema applicable to the NTV entity not listed in the properties or patternProperties.

{ "additionalProperties": add_schema }

check the add_schema for the NTVvalue of the NTV entity
not listed in the properties or patternProperties.

propertyNames: schema applicable to the NTVname of the NTV entities

{ "propertyNames": names_schema }

check the names_schema for the NTVnames of the NTV entity

minProperties, maxProperties: check the number of NTV entities included in the NTV entity

{ "minProperties": 2, "maxProperties": 3 }

check the number of NTV entities

required: check that the listed relative ntv-pointers are present in the NTV entity

unevaluatedProperties: same as additionalProperties

items: schema applicable to the NTV entities included in the NTV entity

{ "items": items_schema }

check the items_schema for the NTVvalue
of the NTV entities included in the NTV entity.

uniqueItems: schema applicable to the NTVvalue of the NTV entities included in the NTV entity

{ "uniqueItems": true }

check the uniqueness of the NTVvalue of the NTV entities.

prefixItems: schemas applicable to the NTVvalue of the NTV entities included in the NTV entity

{ "prefixItems": [ item_schema ] }

check the item_schema for the NTVvalue of the corresponding
NTV entity included in the NTV entity

contains, maxContains, minContains: schema applicable to the NTVvalues of the NTV entities included in the NTV entity

{ "contains": cont_schema, "minContains": 2, "maxContains": 3 }

check that the cont_schema is valid for 2 or 3 of the NTVvalues
of the NTV entities included in the NTV entity

unevaluatedItems: applies to any NTVvalues not evaluated by an items, prefixItems, or contains keyword

minItems, maxItems: equivalent to minProperties, maxProperties
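
The equivalence between the counting keywords can be illustrated by a minimal sketch in which an NTVlist is counted in the same way whether it is represented by a JsonArray or by a JsonObject. The function names are hypothetical, the cont_schema is replaced by a plain Python predicate, and the single-member JsonObject rule is not handled.

```python
def ntv_children(value):
    """Return the NTVvalues of the children of an NTVlist."""
    if isinstance(value, dict):
        return list(value.values())
    if isinstance(value, list):
        return list(value)
    raise TypeError("an NTVlist is represented by a JsonArray or a JsonObject")


def check_size(value, schema):
    """Treat minItems/maxItems and minProperties/maxProperties alike."""
    count = len(ntv_children(value))
    low = schema.get("minItems", schema.get("minProperties", 0))
    high = schema.get("maxItems", schema.get("maxProperties"))
    return count >= low and (high is None or count <= high)


def check_contains(value, schema, matches):
    """Count the children accepted by the predicate standing for cont_schema."""
    hits = sum(1 for child in ntv_children(value) if matches(child))
    low = schema.get("minContains", 1)
    high = schema.get("maxContains")
    return hits >= low and (high is None or hits <= high)


size_schema = {"minProperties": 2, "maxProperties": 3}
print(check_size([1, 2], size_schema), check_size({"a": 1, "b": 2}, size_schema))
```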

Author's Address

Philippe THOMY
Loco-labs
476 chemin du gaf de Famian
84 500 BOLLENE
France