breus / json-masker
High-performance JSON masker library in Java with no runtime dependencies
License: MIT License
As suggested on Reddit, we should make sure that json-masker can parse (and possibly mask) all inputs from the JSONTestSuite (https://github.com/nst/JSONTestSuite).
Given the following configuration:
maskKeys: $.a
maskJsonPaths: $.b, $.c
and the following JSON:
{
"a": "do not mask",
"b": "mask",
"c": "mask"
}
The result will be:
{
"a": "***",
"b": "***",
"c": "***"
}
The expected result should be:
{
"a": "do not mask",
"b": "***",
"c": "***"
}
The bug is caused by not separating keys and JsonPath keys in the trie. This can be solved by either using a separate trie instance for JsonPath keys or ascribing a type to the endOfWord field in trie nodes (endOfJsonPathKeyWord and endOfKeyWord).
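The second option (typed end-of-word markers) could look roughly like the sketch below. This is not the library's actual trie: the class name TypedKeyTrie and its methods are hypothetical, and only the two flag names come from the issue text.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical trie that records which kind of key ends at a node. */
final class TypedKeyTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean endOfKeyWord;         // a plain mask key ends at this node
        boolean endOfJsonPathKeyWord; // a JsonPath mask key ends at this node
    }

    private final Node root = new Node();

    void insertKey(String key) {
        walkOrCreate(key).endOfKeyWord = true;
    }

    void insertJsonPath(String jsonPath) {
        walkOrCreate(jsonPath).endOfJsonPathKeyWord = true;
    }

    /** True only if the candidate was registered as a plain key. */
    boolean isKey(String candidate) {
        Node node = find(candidate);
        return node != null && node.endOfKeyWord;
    }

    /** True only if the candidate was registered as a JsonPath key. */
    boolean isJsonPath(String candidate) {
        Node node = find(candidate);
        return node != null && node.endOfJsonPathKeyWord;
    }

    private Node walkOrCreate(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        return node;
    }

    private Node find(String word) {
        Node node = root;
        for (int i = 0; i < word.length() && node != null; i++) {
            node = node.children.get(word.charAt(i));
        }
        return node;
    }
}
```

With the configuration from this issue, the literal key $.a is only flagged as a plain key, so a lookup of the current JsonPath $.a no longer reports a match, while $.b and $.c still do.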
Currently, calling JsonMasker.mask(String or byte[]) can give two results:
The second case can have several causes (e.g. an IllegalStateException or an ArrayIndexOutOfBoundsException).
This should be unified such that the API can give two results:
an InvalidJsonException runtime exception is thrown
Basically: if no JsonPath is given as input, there is no need to track the current JsonPath at all. This would improve performance when no JsonPaths need to be masked.
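For the exception unification proposed for JsonMasker.mask above, one way to picture the contract is a thin wrapper that converts any unexpected runtime failure into a single exception type. This is a sketch only: the constructor of InvalidJsonException, the UnifiedMasking class, and the use of a UnaryOperator as a stand-in for the masker are all assumptions, not the library's API.

```java
import java.util.function.UnaryOperator;

/** Hypothetical unchecked exception for input that cannot be processed as JSON. */
final class InvalidJsonException extends RuntimeException {
    InvalidJsonException(String message, Throwable cause) {
        super(message, cause);
    }
}

final class UnifiedMasking {
    /**
     * Applies the given masking function (a stand-in for the real masker) and
     * guarantees that the only exception escaping is InvalidJsonException.
     */
    static String maskOrThrow(UnaryOperator<String> masker, String json) {
        try {
            return masker.apply(json);
        } catch (InvalidJsonException e) {
            throw e; // already the unified type, pass through unchanged
        } catch (RuntimeException e) {
            // e.g. IllegalStateException or ArrayIndexOutOfBoundsException
            throw new InvalidJsonException("Input could not be masked as JSON", e);
        }
    }
}
```

Keeping the original exception as the cause preserves the stack trace for debugging while callers only need to handle one type.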
We want to support wildcards for array and object matching:
$.a.*.b
which should mask
{
"a": [
{
"b": "masked",
"c": "allowed"
},
"allowed"
]
}
and
{
"a": {
"d": {
"b": "masked",
"c": "allowed"
},
"e": "allowed"
}
}
Support for recursive selectors ($.a..b) is not planned, as it would require matching from the tail and a loopback.
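The wildcard semantics described above can be sketched as a forward-only, segment-by-segment comparison. The class WildcardPathMatcher is hypothetical, and representing array elements by their index as a path segment (for example "0") is an assumption made for illustration; the library itself may track paths differently.

```java
import java.util.List;

final class WildcardPathMatcher {
    /**
     * Returns true when the path has the same depth as the pattern and every
     * pattern segment is either the wildcard "*" or equal to the path segment.
     * Matching only moves forward; no backtracking is needed.
     */
    static boolean matches(List<String> pattern, List<String> path) {
        if (pattern.size() != path.size()) {
            return false;
        }
        for (int i = 0; i < pattern.size(); i++) {
            String expected = pattern.get(i);
            if (!expected.equals("*") && !expected.equals(path.get(i))) {
                return false;
            }
        }
        return true;
    }
}
```

For the pattern $.a.*.b, the segments (a, *, b) match both (a, 0, b) in the array example and (a, d, b) in the object example, but not (a, 0, c) or (a, e).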
Currently, the RandomJsonWhiteSpaceInjector is used so that the fuzzing tests also cover JSON with random white space injected in different places.
However, Jackson cannot deal with certain white spaces injected that way. For example, an unescaped carriage return cannot be used inside a JSON key in Jackson.
Right now, the fuzzing test checks whether the randomly generated, white-space-injected JSON can be parsed by Jackson. If it cannot, it is discarded and a new random JSON is generated for fuzzing.
This, however, is a quick fix that wastes computing resources, which in turn decreases the number of fuzzing tests that can be run in a predetermined time frame (in our pipelines).
What we can do instead is implement a com.fasterxml.jackson.core.PrettyPrinter instance that also randomly injects white spaces, and use that instead of our current RandomJsonWhiteSpaceInjector implementation.
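The goal behind this proposal is to inject white space only where the JSON grammar allows it. The sketch below illustrates that idea without any dependencies; it is deliberately not a Jackson PrettyPrinter implementation, and the class TokenBoundaryWhitespaceInjector is hypothetical. It adds random white space around structural characters and never touches the inside of string literals.

```java
import java.util.Random;

final class TokenBoundaryWhitespaceInjector {
    // The four white space characters permitted between JSON tokens.
    private static final char[] WHITESPACE = {' ', '\t', '\n', '\r'};

    /** Returns the JSON with random white space added around structural characters only. */
    static String inject(String json, Random random) {
        StringBuilder out = new StringBuilder();
        boolean inString = false;
        boolean escaped = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                out.append(c); // string contents are copied verbatim
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
                out.append(c);
                continue;
            }
            boolean structural = "{}[],:".indexOf(c) >= 0;
            if (structural) {
                appendRandomWhitespace(out, random);
            }
            out.append(c);
            if (structural) {
                appendRandomWhitespace(out, random);
            }
        }
        return out.toString();
    }

    private static void appendRandomWhitespace(StringBuilder out, Random random) {
        int count = random.nextInt(3); // zero, one, or two characters
        for (int i = 0; i < count; i++) {
            out.append(WHITESPACE[random.nextInt(WHITESPACE.length)]);
        }
    }
}
```

Because keys and string values are copied unchanged, every output stays parseable by a strict parser, so no generated input has to be thrown away.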
With the introduction of wildcards, we want to disallow ambiguous JsonPath combinations. The main reason is to avoid loopbacks during matching that could occur for such combinations:
$.a.*.c
$.a.b.*
applied to the following JSON:
{
"a": {
"b": {
"d": "masked"
}
}
}
The current implementation only matches forward, meaning that node "a.b" would already be matched against "a.*", but the next node "d" would fail to match the first JsonPath while satisfying the second JsonPath, which would require a loopback.
Until someone asks for this particular use case and provides a good reason, we should not allow these.
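A validation step for such combinations could be sketched as follows. The rule used here is an assumption chosen to cover the example above, not necessarily the rule the library adopts: two JsonPaths are treated as ambiguous if, at the first segment where they differ, one of them has a wildcard and the other a literal key. The class JsonPathAmbiguityCheck is hypothetical, and splitting on "." assumes keys contain no dots.

```java
import java.util.Arrays;
import java.util.List;

final class JsonPathAmbiguityCheck {
    /**
     * Returns true if, at the first segment where the two paths diverge,
     * one path uses the wildcard "*" and the other a literal key.
     */
    static boolean isAmbiguous(String first, String second) {
        List<String> a = segments(first);
        List<String> b = segments(second);
        int shared = Math.min(a.size(), b.size());
        for (int i = 0; i < shared; i++) {
            String x = a.get(i);
            String y = b.get(i);
            if (x.equals(y)) {
                continue; // still on the common prefix
            }
            // First divergence: a wildcard on either side overlaps the literal.
            return x.equals("*") || y.equals("*");
        }
        return false; // identical, or one path is a prefix of the other
    }

    /** Splits "$.a.*.c" into the segments a, *, c (simplified, no escaping). */
    private static List<String> segments(String jsonPath) {
        String withoutRoot = jsonPath.startsWith("$.") ? jsonPath.substring(2) : jsonPath;
        return Arrays.asList(withoutRoot.split("\\."));
    }
}
```

Under this rule, $.a.*.c and $.a.b.* are rejected because they diverge at the second segment (* versus b), while $.a.b and $.a.c, or $.a.*.c and $.a.*.d, are accepted.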
We currently use the 2mb JSON in the readme, which is a bit excessive; it would be better to update it to 1kb.
It also makes sense to update the GitHub action for benchmarking to show that for every PR.