jmespath-community / jmespath.test

JMESPath compliance test suite

jmespath.test's Issues

`raw-string-escape` compliance test is incorrect

Migrated from #14.

The following compliance test is incorrect:

jmespath.test/tests/literal.json

Lines 193 to 197 in aa6fb5f

    {
      "comment": "Backslash not followed by single quote is treated as any other character",
      "expression": "'\\\\'",
      "result": "\\\\"
    }

Instead, this should be

| Expression | Expressed as JSON | Result | Comment |
| --- | --- | --- | --- |
| `'\\'` | `'\\\\'` | `"\\"` | should follow `raw-string-escape` grammar rule |

Suggested change:

         {
-          "comment": "Backslash not followed by single quote is treated as any other character",
+          "comment": "Can escape backslash",
           "expression": "'\\\\'",
-          "result": "\\\\"
+          "result": "\\"
         }
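To illustrate the suggested change, here is a minimal Python sketch of decoding a raw string body under the `raw-string-escape` rule. The function name `decode_raw_string` is hypothetical and not taken from any JMESPath implementation. The sketch assumes that only `\'` and `\\` are escape sequences and that any other backslash sequence is preserved verbatim.

```python
import re

BACKSLASH = "\\"  # a single U+005C REVERSE SOLIDUS


def decode_raw_string(expression: str) -> str:
    """Decode a raw string literal such as 'foo' (hypothetical helper).

    Assumes the expression is already known to be a well-formed raw string.
    """
    body = expression[1:-1]  # strip the surrounding single quotes
    # raw-string-escape: a backslash followed by a quote or a backslash
    # stands for the second character alone; everything else is kept as is.
    return re.sub(r"\\(['\\])", lambda match: match.group(1), body)


# The expression '\\' (two backslashes between the quotes) decodes to one backslash.
expression = "'" + BACKSLASH * 2 + "'"
result = decode_raw_string(expression)
```

With this reading, `result` is a single backslash, which JSON serializes as `"\\"`, matching the proposed test result.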

Test coverage needed for `find_first` and `find_last`

...especially with a variety of $start and $end argument values.

cf. jmespath-community/jmespath.spec#119 and jmespath-community/jmespath.spec#124

Test coverage needed for `pad_right`

Missing test coverage for:

- checking the length of the padding string
- empty / undefined subject and padding strings

Pipe Expressions do not "correctly" handle `null` values on their left-hand-side

Sub Expression

The specification for sub-expression outlines how it should behave in pseudocode:

left-evaluation = search(left-expression, original-json-document)
result = search(right-expression, left-evaluation)

However, this is incorrect, as many compliance tests expect the result to be null when the left-hand side evaluates to null.
So, the real pseudocode should in fact be:

left-evaluation = search(left-expression, original-json-document)
if left-evaluation is `null` then result = `null`
else result = search(right-expression, left-evaluation)

Pipe Expression

However, it seems intuitive for pipe-expression to behave as specified by the original pseudocode:

left-evaluation = search(left-expression, original-json-document)
result = search(right-expression, left-evaluation)

This means that the right-hand side should still be evaluated when the left-hand side is null.

Summary

Please consider standardizing pipe-expression to behave as follows:
search ( `null` | [@], {} ) -> [ null ]
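
The difference between the two behaviours can be sketched in a few lines of Python. This is not code from any JMESPath library: expressions are modelled as plain callables, and the names `sub_expression`, `pipe_expression`, `null_literal`, and `multi_select_current` are hypothetical.

```python
from typing import Any, Callable

Expr = Callable[[Any], Any]  # an expression maps a JSON value to a JSON value


def sub_expression(left: Expr, right: Expr) -> Expr:
    """left.right: short-circuits to null when the left side is null."""
    def evaluate(document: Any) -> Any:
        left_evaluation = left(document)
        if left_evaluation is None:
            return None
        return right(left_evaluation)
    return evaluate


def pipe_expression(left: Expr, right: Expr) -> Expr:
    """left | right: always evaluates the right side, as proposed above."""
    def evaluate(document: Any) -> Any:
        return right(left(document))
    return evaluate


null_literal: Expr = lambda _document: None     # models `null`
multi_select_current: Expr = lambda node: [node]  # models [@]

sub_result = sub_expression(null_literal, multi_select_current)({})
pipe_result = pipe_expression(null_literal, multi_select_current)({})
```

Here `sub_result` is `None` (null), while `pipe_result` is `[None]`, i.e. `[ null ]`, which is the behaviour requested in this issue.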

Unicode coverage is incomplete

unicode.json does not contain any supplementary plane code points (i.e., U+10000 through U+10FFFF). This is a rather large gap for a specification in which a string is a sequence of code points. As a result, there are unsurprising bugs in e.g. the jmespath.js implementation in use at jmespath.site, which incorrectly treats each supplementary plane code point as if it were a sequence of two surrogate code points (i.e., a code point from U+D800 through U+DBFF followed by a code point from U+DC00 through U+DFFF):

| expression | actual result | expected result |
| --- | --- | --- |
| `length('𝌆')` | `2` | `1` |
| `reverse('a𝌆b')` | `"b\udf06\ud834a"` | `"b𝌆a"` |
| `sort(['𝌆','北'])` | `[ "𝌆", "北" ]` | `[ "北", "𝌆" ]` |

Lexical scoping issue with function `let()`

See jmespath-community/python-jmespath#15.

Test coverage needed for whitespace inside a backtick literal

cf. jmespath-community/jmespath.spec#130

Example: `1 `

Root scope should not have the "$" name as it clashes with a potentially existing property

jmespath-community/go-jmespath#44

New compliance test needed
search ( @."$".foo , {"foo": "bar" } ) -> null

The expression clearly tries to access the `$` field under the current node, but because that field does not exist, the evaluation falls back to the root node.

Test coverage needed for integer-constrained parameters

cf. jmespath-community/jmespath.spec#124

`raw-string-char` grammar rule does not allow control characters

Migrated from #14.

The following compliance tests use control characters, such as line feed, in raw-string literals.

jmespath.test/tests/literal.json

Lines 154 to 161 in aa6fb5f

    {
      "expression": "'newline\n'",
      "result": "newline\n"
    },
    {
      "expression": "'\n'",
      "result": "\n"
    },

| Text | Expression | ABNF | Expressed as JSON | Result |
| --- | --- | --- | --- | --- |
| `newline␊` | `'newline\n'` | `"'newline" %x0A "'"` | `'newline\n'` | `"newline\n"` |
| `␊` | `'\n'` | `"'" %x0A "'"` | `'\n'` | `"\n"` |

However, those expressions are not valid as per the raw-string-char grammar rule:

raw-string        = "'" *raw-string-char "'" 
raw-string-char   = (%x20-26 / %x28-5B / %x5D-10FFFF) / preserved-escape / raw-string-escape 
preserved-escape  = escape (%x20-26 / %x28-5B / %x5D-10FFFF) 
raw-string-escape = escape ("'" / escape)
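
As a rough check, the grammar above can be transcribed into a Python regular expression. This is only a sketch: it assumes `escape` is the backslash (`%x5C`), which the excerpt does not define, and the names `UNESCAPED`, `RAW_STRING`, and `is_valid_raw_string` are hypothetical.

```python
import re

# %x20-26 / %x28-5B / %x5D-10FFFF: everything from space upward
# except the single quote (%x27) and the backslash (%x5C).
UNESCAPED = r"[\x20-\x26\x28-\x5B\x5D-\U0010FFFF]"

RAW_STRING = re.compile(
    r"'(?:"
    + UNESCAPED             # plain character
    + r"|\\" + UNESCAPED    # preserved-escape
    + r"|\\['\\]"           # raw-string-escape
    + r")*'"
)


def is_valid_raw_string(expression: str) -> bool:
    """True when the whole expression matches the raw-string rule."""
    return RAW_STRING.fullmatch(expression) is not None


plain_ok = is_valid_raw_string("'foo'")
newline_ok = is_valid_raw_string("'newline\n'")  # contains U+000A
lone_newline_ok = is_valid_raw_string("'\n'")
```

Under these assumptions `plain_ok` is true, while both expressions containing a line feed are rejected, because U+000A lies below `%x20`.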

Please consider refactoring the compliance tests to forbid the use of C0 control characters in JMESPath expressions, as it seems that allowing them was never the intent when raw strings were introduced in JEP-12.

Note: this will break non-conforming implementations.

'foo'[::-1].length(@)

In most implementations, the rhs of a sub-expression is evaluated differently when the lhs is a projection.
Due to the shape of the AST, the rhs is actually evaluated on each element of the lhs projection.

However, when slicing a string, the rhs should be treated as a regular sub-expression.

As a result, in most implementations, the following expression does not evaluate correctly:

'foo'[::-1].length(@) should evaluate to 3.

In some implementations, the result is `"oof"` or `null`.

Please consider adding a compliance test.
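
The intended distinction can be sketched in Python. This is not how any particular implementation is structured. The function name `slice_then_apply` is hypothetical, and the sketch assumes that slicing an array creates a projection (with null results dropped) while slicing a string yields a plain string.

```python
from typing import Any, Callable


def slice_then_apply(value: Any, rhs: Callable[[Any], Any]) -> Any:
    """Model value[::-1].rhs for arrays and strings (hypothetical helper)."""
    sliced = value[::-1]
    if isinstance(sliced, list):
        # Array slice: project the rhs over each element, dropping nulls.
        projected = (rhs(item) for item in sliced)
        return [item for item in projected if item is not None]
    # String slice: the rhs is a regular sub-expression on the whole string.
    return rhs(sliced)


string_result = slice_then_apply("foo", len)        # models 'foo'[::-1].length(@)
array_result = slice_then_apply(["a", "bb"], len)   # models a projection over an array
```

In this sketch `string_result` is 3, as the issue expects, and `array_result` is `[2, 1]`.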

Pipe evaluation with multi-select-[list / hash]

There seem to be some inconsistent implementations, even within JMESPath Community.

search( `null` | [@] , null ) -> [ null ] ✔️
search( `null` | { foo: @ } ) -> null ❌

We think this MUST return {"foo": null} instead for consistency.

raw string literal testing disagrees with documentation

jmespath.site Grammar defines the grammar for raw string literals like

raw-string        = "'" *raw-string-char "'" 
raw-string-char   = (%x20-26 / %x28-5B / %x5D-10FFFF) / preserved-escape / raw-string-escape 
preserved-escape  = escape (%x20-26 / %x28-5B / %x5D-10FFFF) 
raw-string-escape = escape ("'" / escape)

and Raw String Literals includes a search('\\', "") -> "\\" example implying that \\ is an escape sequence representing a single U+005C REVERSE SOLIDUS just like \' is an escape sequence representing a single U+0027 APOSTROPHE.

However, tests/literal.json in this repository includes test cases contradicting the above, such as { "expression": "'\n'", "result": "\n" } (a raw string literal including a U+000A LINE FEED, which is not covered by raw-string-char) and { "comment": "Backslash not followed by single quote is treated as any other character", "expression": "'\\\\'", "result": "\\\\" } (added in 2016 by c0f7923).

Both disagreements ultimately affect whether or not raw string literals are capable of representing all possible sequences of code points (respectively those including C0 control characters and those including U+005C REVERSE SOLIDUS), and the latter is especially odd—I've never heard of any other escaping approach in which the escape prefix itself is unrepresentable.

[
  {
    "stack": "",
    "branch": [
      "one/",
      "two/"
    ]
  },
  { "branch": ["three/", "four/"]}
]

The following expression yields an incorrect result:

[?stack==''].branch[?starts_with(@, 'one')] -> invalid-type

In contrast, the following similar expression returns the expected result:

[].branch[?starts_with(@, 'one')] -> [ [ "one/" ], [] ]

Please consider adding a compliance test and fixing the libraries.
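
For reference, the two expressions can be mimicked with plain Python list comprehensions over the document above. The helper name `branches_starting_with` is hypothetical. The value computed for the filtered case is this sketch's reading of the projection rules, not a result stated in the report.

```python
data = [
    {"stack": "", "branch": ["one/", "two/"]},
    {"branch": ["three/", "four/"]},
]


def branches_starting_with(elements, prefix):
    """Model .branch[?starts_with(@, prefix)] projected over elements."""
    return [
        [branch for branch in element["branch"] if branch.startswith(prefix)]
        for element in elements
    ]


# Models [].branch[?starts_with(@, 'one')]
all_elements = branches_starting_with(data, "one")

# Models [?stack==''].branch[?starts_with(@, 'one')]
only_empty_stack = [element for element in data if element.get("stack") == ""]
filtered_elements = branches_starting_with(only_empty_stack, "one")
```

`all_elements` reproduces the reported `[ [ "one/" ], [] ]`. Under the same logic, `filtered_elements` would be `[ [ "one/" ] ]` rather than an invalid-type error.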
