
jschon's People

Contributors

handrews, ikonst, marksparkza


jschon's Issues

Making oneOf/anyOf schema evaluation easier by discriminator value

I have a use case with a large number of schemas inside a oneOf/anyOf schema. In such a case it would be easier if I could use something like the OpenAPI discriminator to hint which schema to choose. Is there any way I can customise the validation so that whenever a discriminator is present, the normal oneOf/anyOf validation does not happen and validation happens based on the discriminator mapping?
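One way to approximate the OpenAPI discriminator is to dispatch to a single subschema before (or instead of) the full oneOf/anyOf evaluation. A minimal stdlib sketch of the dispatch step, where the petType field, the mapping, and the validate() stub are all hypothetical placeholders (a real implementation would call the validator's evaluate method instead):

```python
# Hypothetical discriminator mapping: discriminator value -> subschema.
# In a real setup these would be the schemas from the oneOf/anyOf list.
DISCRIMINATOR_MAP = {
    "cat": {"required": ["meow"]},
    "dog": {"required": ["bark"]},
}

def validate(schema, instance):
    # Stand-in for a real validator call; only checks "required" here.
    return all(key in instance for key in schema.get("required", []))

def dispatch_validate(instance):
    # Select exactly one subschema by discriminator instead of trying all.
    subschema = DISCRIMINATOR_MAP.get(instance.get("petType"))
    if subschema is None:
        raise ValueError("unknown or missing discriminator value")
    return validate(subschema, instance)

print(dispatch_validate({"petType": "cat", "meow": True}))  # True
```

In jschon specifically, this pre-dispatch could presumably live in a custom keyword or simply in application code that picks the subschema before evaluation.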

While running tests: ImportError while loading conftest

This happens when I run via tox or inside a tox devenv.

OS: NixOS
tox version: 3.19.0

ImportError while loading conftest '/home/rjmill/Development/open-source/jschon/tests/conftest.py'.
tests/__init__.py:4: in <module>
    jsonschema_2019_09.initialize()
jschon/catalogue/jsonschema_2019_09.py:98: in initialize
    Catalogue.create_metaschema(
jschon/catalogue/__init__.py:65: in create_metaschema
    metaschema_doc = cls.load_json(metaschema_uri)
jschon/catalogue/__init__.py:43: in load_json
    return load_json(filepath.with_suffix('.json'))
jschon/utils.py:26: in load_json
    with open(filepath) as f:
E   FileNotFoundError: [Errno 2] No such file or directory: '/home/rjmill/Development/open-source/jschon/jschon/catalogue/json-schema-spec-2019-09/schema.json'

Failing $dynamicRef cases on main

Running pytest, 2 failures occur:

FAILED tests/test_suite.py::test_validate[2020-12 -> dynamicRef.json -> A $dynamicRef should resolve to the first $dynamicAnchor that is encountered when the schema is evaluated -> An array containing non-strings is invalid]
FAILED tests/test_suite.py::test_validate[2020-12 -> dynamicRef.json -> A $dynamicRef with intermediate scopes that don't include a matching $dynamicAnchor should not affect dynamic scope resolution -> An array containing non-strings is invalid]

Catalogue error when loading schema

Whenever I try to load a schema, I get the error below.

jschon.exceptions.CatalogueError: File not found for 'https://json-schema.org/draft/2019-09/schema'

I have tried creating a default catalogue with Catalogue.create_default_catalogue('2019-09') but then I get the following:

AttributeError: type object 'Catalogue' has no attribute 'create_default_catalogue'

Not sure if this is a bug or if I'm doing something wrong.

Python 3.7 support?

I'd love to use this in a python 3.7 project. (Unfortunately, I can't upgrade the Python version for that project.)

I'm happy to open a PR for it if you're okay with supporting 3.7.

Support for Python 3.11?

I've run the tests with Python 3.11.2:

  • the main tests all pass
  • --testsuite-optionals has the same pass/fail numbers as with 3.10
  • --testsuite-formats has the same pass/fail numbers as with 3.10

I did not look into the optionals/formats errors in detail, but it seems the behavior is the same on 3.10 and 3.11.

Would there be anything more to supporting 3.11 than updating the version list in setup.py and adding py311 to tox.ini? Is that worth doing?

Validating Yaml (with JSON schema) like Visual Studio Code

Hi @marksparkza,
I am trying to generate a list of errors on a yaml file (with a JSON schema) to check for issues before yaml writers check in their file. They are already using the JSON schema on vscode, but of course they can ignore the errors :)
Can I use jschon to read in a yaml file and generate a list of issues like those from vscode?
Thanks in advance!

Tests for malformed [Relative] JSON Pointers

I might be missing something because I'm not familiar with hypothesis and how you're using it to generate test cases, but it looks like malformed JP/RJP aren't being covered. I looked at the creation tests and those all seem to test valid forms.

I messed up the regex changes for index adjustment (#47) in an astonishing number of ways before sorting it out, so I wanted to cover those problems. I wrote up some test cases for both regular and relative JSON Pointers in a gist.

These could be handled in a similar way to the evaluation tests. They could even be folded into those by adding a <malformed> outcome and wrapping the construction in a try, but that gets a bit awkward, as the starting point and the document to evaluate against are irrelevant. So it's probably better to have a separate data file and test function.

If this seems reasonable, I will submit a PR for it.

Need support for error message while forming a `Node` class object in `Jsonpatch`

Hi, I was working on removing property values from a JSON document at a given JsonPointer, using jsonpatch.remove from the main branch / jsonpatch.apply_remove from v9.0

example_json = {
    "foo": variable
}
json_pointer = JsonPointer("/foo/{}".format(pointer))

Scenarios:

  1. While forming a Node class object, why do we have an assert statement? In production (with Python optimizations enabled), all the assert statements are removed.
  2. When the variable has a value of None or an integer and the assertion is removed, we get AttributeError: Node object has no attribute type at line 256.
  3. Are we considering a string as a list of characters (a Sequence)? When the variable holds a string and the pointer token is an integer less than the length of the string, we get TypeError: str object doesn't support item deletion at line 260.
  4. When the variable holds a string and the pointer token is an integer greater than or equal to the length of the string, we get jschon.exceptions.JSONPatchError: Invalid array index {pointer_value} at line 219.

Note:
In all the scenarios listed above, we are passing an invalid JsonPointer for removing the value from the JSON. Could we instead get an error response that says something like "Invalid JsonPointer was passed: {str(jsonpointer)}"?

Doubt:

  1. At line 210, why don't we check whether the given key can be converted to an integer, e.g. using isdigit()?
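For reference, RFC 6901 constrains array-index tokens to "0", digits with no leading zero, or "-" for the element after the last one; a check along the lines the reporter suggests (a standalone sketch, not jschon's actual implementation):

```python
def is_array_index(token):
    # RFC 6901: "0", or a non-zero digit followed by digits; "-" addresses
    # the (nonexistent) element after the last one.
    if token == "-":
        return True
    # isascii() guards against Unicode digits that isdigit() would accept.
    return (token.isascii() and token.isdigit()
            and (token == "0" or not token.startswith("0")))

print(is_array_index("0"))    # True
print(is_array_index("10"))   # True
print(is_array_index("01"))   # False
print(is_array_index("foo"))  # False
```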

Walrus operator

Please mention that your project requires at least Python 3.8, because you use the walrus operator.
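For context, the walrus operator (PEP 572, Python 3.8+) assigns a name inside an expression; on Python 3.7 code like this fails to parse at all:

```python
# := binds a name within an expression; this line is a SyntaxError on 3.7.
data = [1, 2, 3, 4]
if (n := len(data)) > 3:
    print(f"list is long ({n} elements)")
```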

additionalProperties "catches" all invalid schema

I've tested 0.6.0 and 0.7.1, and this issue occurs with both.

In my understanding (based on the text surrounding this example), the additionalProperties keyword is used to validate the values of properties whose names aren't matched by properties or patternProperties. I have a schema like this:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/greeting",
  "type": "object",
  "properties": {
    "greeting": {
      "type": "object",
      "properties": {
        "foo": {
          "type": "string"
        }
      },
      "required": [
        "foo"
      ],
      "additionalProperties": false
    }
  }
}

And an instance like this:

{
  "greeting": {
                "foo": 0
  }
}

The resulting detailed output is:

{'valid': False,
 'instanceLocation': '',
 'keywordLocation': '',
 'absoluteKeywordLocation': 'https://example.com/greeting#',
 'errors': [{'instanceLocation': '/greeting',
             'keywordLocation': '/properties/greeting',
             'absoluteKeywordLocation': 'https://example.com/greeting#/properties/greeting',
             'errors': [{'instanceLocation': '/greeting/foo',
                         'keywordLocation': '/properties/greeting/properties/foo/type',
                         'absoluteKeywordLocation': 'https://example.com/greeting#/properties/greeting/properties/foo/type',
                         'error': 'The instance must be of type "string"'},
                        {'instanceLocation': '/greeting',
                         'keywordLocation': '/properties/greeting/additionalProperties',
                         'absoluteKeywordLocation': 'https://example.com/greeting#/properties/greeting/additionalProperties',
                         'error': 'The instance is disallowed by a boolean '
                                  'false schema'}]}]}

You can see that an error occurs because the type of foo's value is not as expected (an integer instead of a string), but there's also an error due to the additionalProperties keyword. I don't think the latter should be present. Other experiments I've done suggest that additionalProperties errors are triggered by all schemas that fail to validate, rather than just by the presence of an unexpected property.
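The reporter's understanding matches the spec: additionalProperties applies only to names matched by neither properties nor any patternProperties regex, regardless of whether those subschemas pass or fail. A standalone sketch of that name selection (illustrative, not jschon's code):

```python
import re

def additional_property_names(instance, properties=(), pattern_properties=()):
    """Names that 'additionalProperties' should apply to: those matched by
    neither 'properties' nor any 'patternProperties' regex."""
    return [
        name for name in instance
        if name not in properties
        and not any(re.search(p, name) for p in pattern_properties)
    ]

# "foo" is listed under properties, so additionalProperties should not
# see it -- even when its subschema fails (e.g. wrong value type).
print(additional_property_names({"foo": 0}, properties={"foo": {}}))
print(additional_property_names({"foo": 0, "bar": 1}, properties={"foo": {}}))
```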

Reconsider conversion of floats to decimals

Currently, floats are parsed as decimal.Decimal during JSON deserialization and JSON class construction. This approach was taken to ensure reliable functioning of the multipleOf JSON Schema keyword. However, decimals are not natively JSON-serializable, and this can impact applications that wish to serialize output produced by this library. Consider the following example:

import json
from jschon import JSON, JSONSchema, create_catalog

create_catalog('2020-12')

schema = JSONSchema({
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/schema",
    "default": 1.0
})

output = schema.evaluate(JSON(True)).output('basic')
print(output)

The output contains a Decimal object:

{'valid': True, 'annotations': [{'instanceLocation': '', 'keywordLocation': '/default', 'absoluteKeywordLocation': 'https://example.com/schema#/default', 'annotation': Decimal('1.0')}]}

If we try to serialize this using the standard json library:

print(json.dumps(output))

we get an exception:

TypeError: Object of type Decimal is not JSON serializable

jschon does provide a utility function that is used internally for stringifying JSON objects, which handles serialization of decimals, so we could say:

from jschon.utils import json_dumps
print(json_dumps(output))

However, there may be situations in which an application does not have direct control over output serialization. An example of this is assigning the output of an evaluation directly to a PostgreSQL JSONB column using SQLAlchemy - in this case, an exception will occur if the output contains any decimals.

A better approach might be just to convert floats to decimals internally as needed - such as during multipleOf evaluation - and otherwise remove support for decimal.Decimal from the jschon API.
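In the meantime, applications that do control serialization can hand the standard json module a converter; a generic stdlib workaround (not jschon's own json_dumps utility):

```python
import json
from decimal import Decimal

def decimal_to_float(obj):
    # json.dumps calls this only for objects it cannot serialize itself.
    if isinstance(obj, Decimal):
        return float(obj)
    raise TypeError(f"not JSON serializable: {type(obj).__name__}")

output = {"annotation": Decimal("1.0")}
print(json.dumps(output, default=decimal_to_float))  # {"annotation": 1.0}
```

Converting to float can of course lose precision for values that Decimal was chosen to preserve, which is exactly the trade-off described above.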

Use Local Schemas only

Not sure if this is an issue or what, but I am trying to load a directory of schemas (these schemas don't have any $id tags and are referenced only via local file pointers in $ref tags, so I am not sure how to set the URI property accurately).

I feel like it should be able to auto load relative schemas via file paths (or at least attempt to given file permissions, etc, etc) and I shouldn't have to load each individually referenced schema by hand.

I saw in version 0.8.0, there is a LocalSource, but that still requires a URI, which are invalid based on my current file paths.

Any tips appreciated, thanks.

Object properties are only evaluated against core vocabulary when `$dynamicAnchor` is in place

If you have a schema where you have the following

{
  "$dynamicAnchor": "some_anchor",
  "type": "object",
  "properties": {
    "first": {
      "type": "not a valid type"
    }
  }
}

This schema will pass validation against its metaschema. Diving into the debugger, it seems that the individual properties (i.e. first) are only evaluated against the core vocabulary. Since type is not defined in the core vocabulary, the content of the type field is never evaluated.

I have only tested this with $dynamicAnchor so not sure if regular $anchor also produces the same result. I have also only tested this with 2020-12. I can add more details later if necessary.

Roadmap

It would be really helpful if you can publish a roadmap to v1.0 along with the current production readiness of the package.

Unable to install jschon via pip

Hi,

If I try to install jschon I get this error:

$ pip3 install jschon
ERROR: Could not find a version that satisfies the requirement jschon (from versions: none)
ERROR: No matching distribution found for jschon

And if I specify a version too:

$ pip3 install jschon==0.2.0
ERROR: Could not find a version that satisfies the requirement jschon==0.2.0 (from versions: none)
ERROR: No matching distribution found for jschon==0.2.0

I have tried both pip and pip3 but I get the same result.

Detect embedded "$id"/"$anchor"/"$dynamicAnchor" as valid "$ref"/"$dynamicRef" targets

This scenario (which is apparently not tested in the JSON Schema test suite) is not currently supported:

{
  "$id": "https://example.com/schemas/root"
  "$defs": {
    "foo": {
      "$id": "foo",
      "type": "string"
    }
  },
  "$ref": "foo"
}

The distinction here is that evaluation never passes through "#/$defs/foo" before it is referenced. The JSON Schema test suite only tests embedded "$id"s that are evaluated before they are referenced, apparently.

There are essentially two ways to do this: Walk all applicator and schema location ("$defs") keywords at load time and index the "$id"s, or trigger a scan of loaded schemas upon finding a "$ref" that cannot otherwise be resolved.

I am personally inclined towards scanning up front, which would also make some obscure cases involving embedded "$schema" (which I'll file separately) easier to handle, if jschon wants to support them. But there are pros and cons either way.

Add common prefix check to JSONPointer

It would be really nice to be able to check via a single method call if one pointer points to the container/parent of another.

Perhaps even via operator overloading, though I'm not quite sure about the implications of that, or whether I'm overlooking some edge cases. For example:

ab = JSONPointer("/a/b")
abc = JSONPointer("/a/b/c")

assert ab < abc  # good?
assert ab in abc  # better?
assert abc - ab == JSONPointer("c")  # allow even this with relative pointers?
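A containment check along these lines can be written over the pointer's reference tokens; a standalone sketch (not jschon's JSONPointer API), handling RFC 6901 escaping:

```python
def tokens(pointer):
    """Split an RFC 6901 pointer string into unescaped reference tokens."""
    if pointer == "":
        return []
    # Unescape ~1 before ~0, per RFC 6901 section 4.
    return [t.replace("~1", "/").replace("~0", "~")
            for t in pointer.split("/")[1:]]

def is_prefix(a, b):
    """True if pointer a addresses an ancestor (or the same node) of b."""
    ta, tb = tokens(a), tokens(b)
    return len(ta) <= len(tb) and tb[:len(ta)] == ta

print(is_prefix("/a/b", "/a/b/c"))  # True
print(is_prefix("/a/b", "/a/bc"))   # False
```

Note that a plain string prefix test gets "/a/b" vs "/a/bc" wrong, which is why the comparison has to be token-wise.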

Document how to pre-populate the catalog more clearly

[EDIT: This whole bug is a somewhat hilarious account of me not understanding what I'm doing. There's no actual problem here, just jump to the last comment for the explanation.]

I've written the following in three different projects now and was wondering if something like it could be added alongside LocalSource and RemoteSource:

class InMemorySource(jschon.catalog.Source):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self._registry = {}

    def __call__(self, relative_path):
        return self._registry[relative_path]

    def register(self, relative_path, schema_doc):
        self._registry[relative_path] = schema_doc

There are several use cases:

  • Sometimes I'm building/modifying schemas as Python data structures and they never live anywhere else
  • If you use URI schemes like tag, urn, or about for your "$id"s, the URI and filesystem (or network URL path) structures are less likely to correlate; it's easier to just manage them separately
  • In higher-security environments, doing any sort of on-demand filesystem or (especially) network I/O during a process like schema validation will not pass audit. All I/O and parsing would be done up front with specific checks, which might not be easy to integrate into a Source even if doing it on demand were considered OK
  • Performance is more consistent when no I/O is done during schema evaluation

File not found for GeoJSON schema

Hello!

I'm trying to validate a file against a schema I made myself, and I'm referencing other schemas in my $defs:

"$defs": {
   "GeoJSON_LineString" : {
     "$id": "linestring",
     "$ref": "https://geojson.org/schema/LineString.json"
   },
   "GeoJSON_Point" : {
     "$id": "point",
     "$ref": "https://geojson.org/schema/Point.json"
   },
   "GeoJSON_MultiLineString" : {
     "$id": "multilinestring",
     "$ref": "https://geojson.org/schema/MultiLineString.json"
   }

I get an error: jschon.exceptions.CatalogueError: File not found for 'https://geojson.org/schema/LineString.json'
Do you have an idea to overcome this?

Embedded "$schema" with different values in draft 2020-12

2019-09 and 2020-12 both allow "$schema" alongside an embedded "$id". 2019-09 states that all such embedded "$schema" keywords SHOULD have the same value as the "$schema" in the document root, but 2020-12 allows changing it, which requires some finagling when it comes to validating schemas against their meta-schemas.

Changing the value of "$schema" in an embedded resource is an unusual use case and, AFAICT, not understood by most people much less supported - it was meant to support complex bundling use cases, and one can always un-bundle the schemas to work around lack of support. Raising a NotImplementedError with a clear message would probably not be unreasonable. The JSON Schema test suite does not test this functionality at this time.

Format validation

As the JSON-Schema spec states, "format" is just annotation by default. Implementations may offer to treat it as a validation keyword. This luckily is also possible in jschon via Catalog.add_format_validators.

Two things:

  1. Question: In the spec a list of "built-in" formats is defined, referencing ISO, RFC and other definitions. Any chance or plans jschon will bring its own FormatValidator implementations for all predefined formats? Or is this out of scope?

  2. Docs bug (?): In the jschon docs example for "ipv4", the Python built-in ipaddress is used as a FormatValidator. While easy, that seems incorrect. The Python built-in ipaddress.IPv4Address does not quite accept everything that is valid according to the referenced RFC. For example "127.000.000.001" will be rejected by Python. Similar could be true in the examples for "ipv6" and "hostname", but I did not look into that.

Scope information for additional properties

Hi Mark, thanks for creating this library. Recently I've made a little utility called jschon-sort which leverages jschon's very useful feature of having the evaluation's result map evaluated values back to the schema that validated them.

One edge case I've noticed is that additionalProperties, when present as "additionalProperties": true (rather than defaulting to true), doesn't map to the properties that it validated. (TBH I haven't tested whether the same holds for pattern properties, etc.)

"properties" in "allOf" and "additionalProperties"

This is probably me not understanding how to use this combination of keywords properly. I assumed the following schema would be valid against the JSON. This is a silly example to demonstrate the issue. My real use of this was to add an "additionalProperties": false after this "allOf": https://github.com/cancerDHC/data-model-harmonization/blob/3e4639517586f8b260e8052417e6f32e66603c00/json-examples/home/shahim/schema-testing/schemas/testing-schema.json#L5816

Should the use of "properties" inside an "allOf" not annotate the property names used in the "allOf", so that a later "additionalProperties": false only fails for other properties not handled in the "allOf"?

The demo schema:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "http://ccdh/json/schema/testing",
  "allOf": [
    {
      "properties": {
        "foo": {}
      }
    },
    {
      "properties": {
        "bar": {}
      }
    }
  ],
  "additionalProperties": false
}

The JSON:

{
    "foo": "foo",
    "bar": "bar"
}

jschon.dev output:

The instance is invalid.
{
    "valid": false,
    "instanceLocation": "",
    "keywordLocation": "",
    "absoluteKeywordLocation": "http://ccdh/json/schema/testing#",
    "errors": [
        {
            "instanceLocation": "",
            "keywordLocation": "/additionalProperties",
            "absoluteKeywordLocation": "http://ccdh/json/schema/testing#/additionalProperties",
            "error": "The instance is disallowed by a boolean false schema"
        }
    ]
}

Loading a metaschema from disk doesn't load as a metaschema but instead as a JSONSchema

I've defined a collection of vocab and a metaschema on disk using json files.

I'm trying to load these into the jschon catalog, and then I also want to load the schema for my data source.

When I load the schema, it raises the following error:

jschon.exceptions.JSONSchemaError: The schema referenced by https://example.com/metaschema is not a metaschema

Clarification on multi-file usage

Hi there. I'm just evaluating ways of validating some json documents, and have a bunch of schemas in a directory. For example we have an org schema, which references a person schema, which references an address schema. We have these in different schema-foo.json files.

How would I load these into jschon in a way that I can then later feed in some data.json and validate its correctness? At the moment, if I iterate over the files and try to add the schemas to a catalog using add_schema, I still get problems with unfound schemas.

I suspect I'm just getting confused with what to put for base_uri and schema to tell jschon how to find my schemas.

Do you have any examples anywhere of loading in a bunch of schemas and then using those to validate some data?

Test issue with hypothesis>=6.0.4 and python>=3.10

While testing a planned PR submission I noticed that while everything runs fine on Python 3.8 and 3.9, on 3.10 the tests complete successfully but hang without proceeding to the coverage report. Running pytest alone from the same virtualenv also hangs.

I eventually tracked this down to the use of hs.recursive() in jschon.tests.strategies, and determined that hypothesis 6.0.3 and earlier are fine, but 6.0.4 or later introduces the hang. The changelog for 6.0.4 shows that its only change attempted to fix a race condition in recursive() (issue HypothesisWorks/hypothesis#2717, PR HypothesisWorks/hypothesis#2783).

The fix has to do with multiple threads, and if there are multiple threads involved here then I'm a bit out of my depth; I haven't used threads in Python in over a decade, if ever (I honestly can't recall). If threads are not involved, it's not clear to me what the problem is. Google has not been helpful so far.

I'm filing this here to record the investigation so far and in hopes that maybe @marksparkza might know where to go next. I have not tried to figure out if there are calls to recursive() that work, I just know that if you comment the calls out of jschon.tests.strategies then things don't hang. Otherwise, importing the module causes a hang whether the test uses anything from it or not.

The problem also seems to happen on Python 3.11 although I have not investigated that in detail.

Restricting hypothesis<6.0.4 in tox.ini lets the tests pass on all releases of Python (including 3.11).

Add support for "list" output

There are two new output formats proposed for the next version of JSON Schema, with new names so they can be added alongside the existing ones, which will be deprecated (except for "flag"). There is no technical reason not to support them for 2019-09 and 2020-12 as well.

This issue tracks the "list" format, for which I am likely to submit a PR assuming this is deemed desirable to do now.

Detailed "error" output can (should?) contain the target $ref URI when the target is boolean false

The following output is not very helpful, because the "error" doesn't mention which $ref is the "boolean false schema". The output is correct, but eventually the "allOf" will contain many similar entries, and it's helpful to know the URI of the $ref instead of having to count positions in the "allOf" in the schema to look it up.

    "errors": [
        {
            "instanceLocation": "/something",
            "keywordLocation": "/$ref/allOf/1/additionalProperties/then/$ref/allOf/0/then/then/$ref",
            "absoluteKeywordLocation": "http://ccdh/json/schema/testing#/$defs/test-run.impl/allOf/0/then/then/$ref",
            "error": "The instance is disallowed by a boolean false schema"
        }
    ]

This error is coming from: https://github.com/cancerDHC/data-model-harmonization/blob/3e4639517586f8b260e8052417e6f32e66603c00/json-examples/home/shahim/schema-testing/schemas/testing-schema.json#L5818

Due to this false boolean schema: https://github.com/cancerDHC/data-model-harmonization/blob/3e4639517586f8b260e8052417e6f32e66603c00/json-examples/home/shahim/schema-testing/schemas/testing-schema.json#L151

I'll have many stub "false" schemas like this until someone defines the validation logic that should be coded in each schema. In the meantime, I just want any use of those schemas to fail, but it would help if the error output included the schema URI for debugging purposes.

Maybe the spec requires the output to be as it is, but I thought I'd ask.

ImportError: cannot import name 'LocalSource' from 'jschon'

@marksparkza Thank you for this wonderful package.

I am using jschon-0.7.3

I am trying to use this example as a template - https://github.com/marksparkza/jschon/blob/main/examples/load_from_files_2.py but I am seeing import errors.

I can see the LocalSource class in https://github.com/marksparkza/jschon/blob/main/jschon/catalog/__init__.py but the class doesn't exist in locally installed package on my machine.

I guess the pip package isn't updated to the latest code. Are there any plans to push the latest code to PyPI?

Errors are being lost for all except the last item in an array

Child scopes for a given keyword (e.g. "type" in the example below) under the "items" scope are being successively replaced for each array item, at

self.children[key] = (child := Scope(

This does not affect overall instance validation, as the "items" scope itself is collecting (generic) errors for each failed item, but the originating errors from keywords within the "items" subschema are missing from the output.

Here is a minimal working example of the problem:

import pprint

from jschon import Catalogue, Evaluator, JSON, JSONSchema, OutputFormat, URI

Catalogue.create_default_catalogue('2020-12')

schema = JSONSchema({
    "items": {"type": "integer"}
}, metaschema_uri=URI("https://json-schema.org/draft/2020-12/schema"))

invalid_json_1 = JSON([1, 'foo'])
invalid_json_2 = JSON(['bar', 2])

evaluator = Evaluator(schema)
result_1 = evaluator.evaluate_instance(invalid_json_1, OutputFormat.BASIC)
result_2 = evaluator.evaluate_instance(invalid_json_2, OutputFormat.BASIC)

print('---Result 1---')
pprint.pp(result_1)

print('---Result 2---')
pprint.pp(result_2)

This produces the following output:

---Result 1---
{'valid': False,
 'errors': [{'instanceLocation': '',
             'keywordLocation': '',
             'absoluteKeywordLocation': 'urn:uuid:3c42e4b1-9682-4fc0-94e5-8c9575428f05#',
             'error': 'The instance failed validation against the schema'},
            {'instanceLocation': '/1',
             'keywordLocation': '/items',
             'absoluteKeywordLocation': 'urn:uuid:3c42e4b1-9682-4fc0-94e5-8c9575428f05#/items',
             'error': 'The instance failed validation against the schema'},
            {'instanceLocation': '/1',
             'keywordLocation': '/items/type',
             'absoluteKeywordLocation': 'urn:uuid:3c42e4b1-9682-4fc0-94e5-8c9575428f05#/items/type',
             'error': 'The instance must be of type "integer"'}]}
---Result 2---
{'valid': False,
 'errors': [{'instanceLocation': '',
             'keywordLocation': '',
             'absoluteKeywordLocation': 'urn:uuid:3c42e4b1-9682-4fc0-94e5-8c9575428f05#',
             'error': 'The instance failed validation against the schema'},
            {'instanceLocation': '/0',
             'keywordLocation': '/items',
             'absoluteKeywordLocation': 'urn:uuid:3c42e4b1-9682-4fc0-94e5-8c9575428f05#/items',
             'error': 'The instance failed validation against the schema'}]}

In case 1, the failing item is at the end of the array, and its originating error appears in the output. In case 2, the failing item comes first, and its error is lost.

Modify error message for enum

Hi,

Would it be possible to modify the error message for the enum keyword?
In the case of a value not in the list of allowed possibilities, I would like to embed suggestions in the error message using the Levenshtein distance, like "Did you mean ...?"
Can you please suggest a way of doing this, if possible?
Many thanks.
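For what it's worth, once the rejected value and the allowed list are in hand, the stdlib's difflib can produce the suggestion; how to hook this into jschon's enum error is the open question, but the message itself might look like this (standalone sketch, not jschon's API):

```python
import difflib

def enum_error(value, allowed):
    # difflib's similarity matching is a reasonable stand-in for
    # Levenshtein distance here.
    message = f"{value!r} is not one of {allowed}"
    suggestions = difflib.get_close_matches(str(value), allowed, n=1)
    if suggestions:
        message += f". Did you mean {suggestions[0]!r}?"
    return message

print(enum_error("celcius", ["celsius", "fahrenheit", "kelvin"]))
```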

Inspecting the state of the catalog before calling `create_metaschema` corrupts the catalog.

When following the example provided in the docs (https://jschon.readthedocs.io/en/latest/examples/extending_json_schema.html), I was trying to understand how a metaschema differs from a schema.

I wanted to know what a schema looks like before and after the call to create_metaschema.

Well as it turns out if you call

schema = catalog.get_schema(URI("https://example.com/enumRef/enumRef-metaschema"))

before calling

catalog.create_metaschema(
    URI("https://example.com/enumRef/enumRef-metaschema"),
    URI("https://json-schema.org/draft/2020-12/vocab/core"),
    URI("https://json-schema.org/draft/2020-12/vocab/applicator"),
    URI("https://json-schema.org/draft/2020-12/vocab/unevaluated"),
    URI("https://json-schema.org/draft/2020-12/vocab/validation"),
    URI("https://json-schema.org/draft/2020-12/vocab/format-annotation"),
    URI("https://json-schema.org/draft/2020-12/vocab/meta-data"),
    URI("https://json-schema.org/draft/2020-12/vocab/content"),
    URI("https://example.com/enumRef"),
)

it ends up breaking the example.

  1. the enumRef-metaschema never gets registered as a metaschema.
  2. The validations all fail (both the valid json example as well as the invalid json example).

This seems wrong... I don't know what's happening inside the catalog, but inspecting the catalog state should not break it!

This is related to: #39

Getting plain data from JSON object

The JSON class offers the property value, to get the underlying value as a Python data type.

My problem is: the contents of dicts and lists are JSON instances again. Do I need to call .value recursively myself, or am I missing something?

Otherwise, this would be a nice feature to have.
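If no such feature exists, a recursive unwrap is short to write; the Node class below is a hypothetical stand-in that only mimics the .value behavior described above, not jschon's actual JSON class:

```python
class Node:
    # Hypothetical stand-in for a JSON wrapper whose .value leaves child
    # dict/list members wrapped, as described above.
    def __init__(self, data):
        if isinstance(data, dict):
            self.value = {k: Node(v) for k, v in data.items()}
        elif isinstance(data, list):
            self.value = [Node(v) for v in data]
        else:
            self.value = data

def unwrap(node):
    """Recursively convert a wrapped node back to plain Python data."""
    v = node.value
    if isinstance(v, dict):
        return {k: unwrap(child) for k, child in v.items()}
    if isinstance(v, list):
        return [unwrap(child) for child in v]
    return v

print(unwrap(Node({"a": [1, {"b": 2}]})))  # {'a': [1, {'b': 2}]}
```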

Cannot reproduce doc example from fresh pip install

% python3 -m pip install --user jschon

% python3
Python 3.9.2 (default, Feb 28 2021, 17:03:44) 
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from jschon import JSONSchema
>>> int_schema = JSONSchema({"type": "integer","$schema": "https://json-schema.org/draft/2020-12/schema"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/julien/.local/lib/python3.9/site-packages/jschon/jsonschema.py", line 73, in __init__
    raise JSONSchemaError("catalogue not given and default catalogue not found")
jschon.exceptions.JSONSchemaError: catalogue not given and default catalogue not found
>>> 

Validate Schema against meta Schema

I'd like to validate a JSON document that is itself a JSON schema against the meta-schema, and get the same error report.

I tried this:

def schema_validate_schema(schema):
    catalog = create_catalog('2020-12')
    try:
        validator = JSONSchema(schema, catalog=catalog)
    except Exception as e:
        print(f'Error {e}')

    # Validate the schema against the meta-schema
    try:
        schema_validity = validator.validate()
        print(f'Schema validity check: {schema_validity.valid}')

    except Exception as e:
        print(f'Error {e}')

and that sort of works, but I really want this:


from jschon import JSON, URI, create_catalog

def schema_validate_data(schema, data):
    catalog = create_catalog('2020-12')

    meta_schema_uri = URI("https://json-schema.org/draft/2020-12/schema")

    try:
        meta_validator = catalog.get_schema(meta_schema_uri)
    except Exception as e:
        print(f'Error {e}')

    # Validate the schema first
    try:
        result = meta_validator.evaluate(JSON(schema)).output('basic')
    except Exception as e:
        print(e)
        return None

I just tried this using the jsonschema package

from jsonschema import Draft202012Validator
from jsonschema.exceptions import SchemaError

def schema_validate_schema(schema):
    try:
        res = Draft202012Validator.check_schema(schema)
    except SchemaError as e:
        return e.message

    return res

I passed it:

{
    "name": "person",
    "schema": {
        "$id": "https://example.com/person.schema.json",
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "title": "Person",
        "type": "WRONG",
        "properties": {
            "firstName": {
                "type": "string",
                "description": "The person's first name."
            },
            "lastName": {
                "type": "string",
                "description": "The person's last name."
            },
            "age": {
                "description": "Age in years which must be equal to or greater than zero.",
                "type": "integer",
                "minimum": 0
            }
        }
    }
}

it returns

"'WRONG' is not valid under any of the given schemas"

which is not what I need. I'd like the better error messages that jschon produces.

##################
UPDATE
##################
This function now produces what I need:

from fastapi import HTTPException, Request

# MushroomValidationError is an application-defined exception.

async def create(request: Request):
    data = await request.json()

    try:
        objdef = await request.app.state.mushroom.create_objdef(data)
    except MushroomValidationError as validation_error:
        error = {
            "status": False,
            "method": request.method,
            "path": request.url.path,
            "code": validation_error.code,
            "message": validation_error.message
        }
        raise HTTPException(status_code=404, detail=error)

    return objdef

POST http://localhost:8000/mushroom/objdef/
content-type: application/json

{
    "name": "person",
    "schema": {
        "$id": "https://example.com/person.schema.json",
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "title": "Person",
        "type": "WRONG",
        "properties": {
            "firstName": {
                "type": "WRONG",
                "description": "The person's first name."
            },
            "lastName": {
                "type": "string",
                "description": "The person's last name."
            },
            "age": {
                "description": "Age in years which must be equal to or greater than zero.",
                "type": "integer",
                "minimum": 0
            }
        }
    }
}

[
  {
    "index": 0,
    "path": "",
    "message": "'WRONG' is not one of ['array', 'boolean', 'integer', 'null', 'number', 'object', 'string']",
    "instance": "WRONG",
    "json_path": "$.properties.firstName.type"
  },
  {
    "index": 1,
    "path": "",
    "message": "'WRONG' is not of type 'array'",
    "instance": "WRONG",
    "json_path": "$.properties.firstName.type"
  }
]

Configure annotations

Since a schema may hold annotations useful to different tools, it would be really helpful if we could specify the annotations that need to be included or excluded for processing. That way, all the irrelevant annotations could be filtered out when initialising the schema.
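As a workaround, annotations can be filtered from the output after evaluation rather than at schema initialisation. A minimal sketch, assuming the draft 2020-12 'basic' output format where each unit's keyword is the last segment of its keywordLocation (the `output` dict below is a hand-written stand-in for real evaluator output):

```python
# Keep only the annotation units whose keyword name is in `keep`.

def filter_annotations(output, keep):
    units = [
        unit for unit in output.get("annotations", [])
        if unit["keywordLocation"].rsplit("/", 1)[-1] in keep
    ]
    return {**output, "annotations": units}

output = {
    "valid": True,
    "annotations": [
        {"keywordLocation": "/title", "annotation": "Person"},
        {"keywordLocation": "/deprecated", "annotation": False},
    ],
}
filtered = filter_annotations(output, keep={"title"})
print([u["annotation"] for u in filtered["annotations"]])  # → ['Person']
```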

Cannot create catalog

Hi there,

After installing jschon, trying a few of the given examples, and running create_catalog('2020-12'), I get the following error:

get_catalog
raise CatalogError(f'Catalog name "{name}" not found.')
jschon.exceptions.CatalogError: Catalog name "catalog" not found.

Thank you in advance.

Regards,
Marcel Hofman

Ability to integrate large enums through custom code

I'm evaluating the feasibility of using JSON Schema 2020-12 and jschon for data validation in a project that will include enumerations containing thousands of codes from medical terminologies. I have not yet looked through jschon's API or source code to understand how I might integrate custom code to validate these specific enumerations, so that the codes don't have to be listed in the schemas themselves. Also, the error output can't reasonably list the possible values; instead, it would be enough to mention the enum $id, and users will understand what that means.

I'm also still learning JSON Schema 2020-12, and so far I've only used jschon through jschon.dev. I'm assuming that if I always $ref such enum schemas with their $id, I might be able to extend jschon to recognize these $refs and apply custom logic for those specific $refs/$ids. The values for these enums will be obtained and cached from independent and remote terminology services.

Does jschon have an extension point to help with such integrations? If not, what are my best options? Any other JSON Schema implementation of 2020-12 that I should be considering that might better accommodate such a use case?

Thank you in advance for any suggestions.
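jschon's custom-keyword mechanism (a Keyword subclass registered through a vocabulary, as in the enumRef example mentioned in an earlier issue) is the natural hook for this. Independent of the jschon wiring, the lookup side can be sketched as below; `fetch_codes` is a hypothetical stand-in for a real terminology-service client, and the $id used is invented for illustration:

```python
from functools import lru_cache

# Hypothetical stand-in for a remote terminology-service client.
# A real implementation would call the service identified by enum_id.
def fetch_codes(enum_id):
    fake_service = {
        "https://example.com/enums/blood-type": {"A", "B", "AB", "O"},
    }
    return fake_service[enum_id]

@lru_cache(maxsize=None)
def codes_for(enum_id):
    """Fetch and cache the value set for a given enum $id."""
    return frozenset(fetch_codes(enum_id))

def validate_code(enum_id, value):
    """Return None if valid, else an error that names only the enum $id."""
    if value in codes_for(enum_id):
        return None
    return f"{value!r} is not a member of the value set {enum_id}"

print(validate_code("https://example.com/enums/blood-type", "AB"))  # → None
print(validate_code("https://example.com/enums/blood-type", "XY"))
```

The lru_cache keeps repeated validations from re-fetching the value set, which matters when a schema references the same enum $id thousands of times.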
