manoj-pillay-10gen / simple-search-index-validator
Validate your full text search index definitions before submitting them.
License: MIT License
As JSON is hierarchical in nature, we should take advantage of this and organize the schema files hierarchically using sub-directories.
For example,
This is currently not possible because the following library function, which loads all schemas, only reads files at the top level of the schema directory and does not descend into nested directories.
simple-search-index-validator/lib/schema.tsx
Lines 6 to 24 in c55024b
Tasks are to:
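A minimal sketch of what nested loading could look like, assuming the schemas are plain *.json files read with Node's fs module. The function name loadSchemasRecursively and the choice to key each schema by its relative path are assumptions made for illustration; this is not the code in lib/schema.tsx.

```typescript
import * as fs from "fs";
import * as path from "path";

// Hypothetical helper: walks `dir` and every sub-directory, parses each
// *.json file, and returns the schemas keyed by their path relative to `root`
// (always with forward slashes, e.g. "analyzers/custom.json").
function loadSchemasRecursively(
  dir: string,
  root: string = dir
): Record<string, unknown> {
  const schemas: Record<string, unknown> = {};
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      // Recurse and merge the nested results into this level's map.
      const nested = loadSchemasRecursively(fullPath, root);
      for (const key of Object.keys(nested)) {
        schemas[key] = nested[key];
      }
    } else if (entry.isFile() && path.extname(entry.name) === ".json") {
      const key = path.relative(root, fullPath).split(path.sep).join("/");
      schemas[key] = JSON.parse(fs.readFileSync(fullPath, "utf8"));
    }
  }
  return schemas;
}
```

Keying by relative path keeps two files with the same name in different sub-directories from overwriting each other.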
A very common kind of business logic for property value validation is the dependency of one property on another. For example, here is the definition of an edgeGram tokenizer:
"tokenizer": {
"type": "edgeGram",
"minGram": 5,
"maxGram": 10
}
An expectation is that minGram <= maxGram. However, this rule cannot currently be expressed directly in the schema, because the JSON Schema specification has no way to compare the value of one property against another.
We will need to install this rule during the validation phase (no squiggly red lines) of the index before it gets submitted to mongot unless we resort to dynamic injection (doesn't sound worthwhile for this).
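A minimal sketch of such a validation-phase rule, assuming the index definition has already been parsed into a plain object. The RuleViolation type and the checkGramBounds name are hypothetical and only serve the illustration.

```typescript
// Hypothetical shape of a finding; not a type taken from the repository.
interface RuleViolation {
  path: string;
  message: string;
}

// Walks any parsed JSON value and reports every object in which minGram and
// maxGram are both numbers and minGram is greater than maxGram.
function checkGramBounds(node: unknown, path: string = "$"): RuleViolation[] {
  const violations: RuleViolation[] = [];
  if (Array.isArray(node)) {
    for (let i = 0; i < node.length; i++) {
      violations.push(...checkGramBounds(node[i], path + "[" + i + "]"));
    }
    return violations;
  }
  if (node === null || typeof node !== "object") {
    return violations;
  }
  const obj = node as Record<string, unknown>;
  const minGram = obj["minGram"];
  const maxGram = obj["maxGram"];
  if (typeof minGram === "number" && typeof maxGram === "number" && minGram > maxGram) {
    violations.push({
      path: path,
      message: "minGram (" + minGram + ") must be <= maxGram (" + maxGram + ")",
    });
  }
  for (const key of Object.keys(obj)) {
    violations.push(...checkGramBounds(obj[key], path + "." + key));
  }
  return violations;
}
```

Walking the whole document, rather than looking at one fixed location, means the same rule applies wherever a minGram/maxGram pair appears.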
We want to make sure that known search index definitions which can be found on docs.mongodb.com can be validated against our schema. Changes that break validation of these should be rejected using pre-commit hooks.
A challenging requirement but the summary is to let people define a custom analyzer in the editor, and then have that show up as a dropdown option when autocompleting for fields such as:
analyzer
searchAnalyzer
synonyms[i].analyzer
etc.
By resolving #8, we have solved the problem of making the name of a custom analyzer defined at run time legal as the value for analyzer or the other fields listed above. However, since the JSON Schema specification does not allow references to the values of other properties, we are unable to make a custom analyzer name defined at run time show up as a dropdown option.
This will need to be dynamically injected in TypeScript as the schema is being defined.
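One way the injection could be sketched, assuming custom analyzers live in a top-level analyzers array whose entries carry a name, and that the analyzer field's schema is a simple string enum. EnumSchema, customAnalyzerNames, and withCustomAnalyzers are hypothetical names, and the real schema types may differ.

```typescript
// Minimal schema shape assumed for this sketch.
interface EnumSchema {
  type: "string";
  enum: string[];
}

// Reads custom analyzer names from the text currently in the editor.
// Returns an empty list while the text is not parseable JSON.
function customAnalyzerNames(editorText: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(editorText);
  } catch (err) {
    return [];
  }
  if (parsed === null || typeof parsed !== "object") return [];
  const analyzers = (parsed as Record<string, unknown>)["analyzers"];
  if (!Array.isArray(analyzers)) return [];
  const names: string[] = [];
  for (const analyzer of analyzers) {
    if (analyzer === null || typeof analyzer !== "object") continue;
    const name = (analyzer as Record<string, unknown>)["name"];
    if (typeof name === "string" && names.indexOf(name) === -1) {
      names.push(name);
    }
  }
  return names;
}

// Returns a copy of `base` whose enum also lists the custom analyzer names.
function withCustomAnalyzers(base: EnumSchema, editorText: string): EnumSchema {
  const merged = base.enum.slice();
  for (const name of customAnalyzerNames(editorText)) {
    if (merged.indexOf(name) === -1) merged.push(name);
  }
  return { type: "string", enum: merged };
}
```

The rebuilt schema would then have to be handed back to the editor whenever the content changes, for example through Monaco's JSON diagnostics options. How often to do that, and whether to debounce it, is left open here.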
Suggestions currently appear in alphabetical order. Find a way to control the ordering of suggestions so that it guides the path the user takes to complete their index definition.
mongot is the source of truth for index definitions. They have a curated list of index definitions that we could use as tests on our schema. To start with, import them and subject them to validation as part of the existing pre-commit hook.
Currently, we have to duplicate code between index.json and fullIndex.json. Instead, there may be a way to extend siblings of a schema to eliminate the duplication and the drift caused by maintaining redundant copies.
fullIndex.json is more than ornamental, because the API that needs the auto-generated YAML spec requires the additional components available in fullIndex.json.
This makes for good reference. Organize by file structure.
Guesswork has not been helpful. Certain parts of the light theme are very difficult to see.
The goal is to ensure that JSON that is not legal cannot be added as a schema file in the /schema directory. Should catch this as part of commit.
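A minimal sketch of the check a commit hook could run, assuming the hook has already read the staged schema files into memory. The function name findIllegalJson and the file-name-to-contents map are assumptions for illustration; how the hook collects the files is not shown.

```typescript
// Hypothetical check: takes a map of file name to file contents and returns
// one message per file whose contents are not legal JSON.
function findIllegalJson(files: Record<string, string>): string[] {
  const failures: string[] = [];
  for (const name of Object.keys(files)) {
    try {
      JSON.parse(files[name]);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(name + ": " + reason);
    }
  }
  return failures;
}
```

A hook built on this would reject the commit whenever the returned list is non-empty.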
It will soon be painful to maintain the consistency between schema files defined in MMS and simple-search-index-validator. A single source of truth is essential in order to make development easy. One approach may be to define the schema directory in MMS as a sub module of the simple search validator.
Caveat: Certain parts of the schema definition are diverging already. For example, index.json does not contain database or collection names in MMS but has those fields in simple search validator. We will need to find a way to have a divergent section and a convergent section for each submodule, i.e., there should be a way to define an index.json for simple-search-validator that overrides the base copy available in MMS, which will be the original source of truth.
There are many *.invalid files added such as in https://github.com/manoj-pillay-10gen/simple-search-index-validator/tree/4ca4669f4a15a40c63855aaa533ac1c198122e53/data/sampleData/JSONEditorSamples/autocomplete . These are files that are valid according to mongot syntax definitions but invalid according to our schema definition.
Two paths for resolution:
for f in $(find . -maxdepth 1 -type f -name "*.invalid"); do mv "$f" "$(basename "$f" .invalid)"; done
in each of the directories with invalid files
Currently, we are not permitting empty arrays for either tokenFilters or charFilters. Fix that to allow an empty array, based on feedback on open questions in the scope review.
This is a known monaco json language service bug: microsoft/vscode-json-languageservice#86
There is a workaround, originally added in 8f1bb4f and inspired by microsoft/vscode-json-languageservice#86 (comment)
Overridden analyzers are supported in index definitions, although deprecated. See example use case. Confirm syntax with @navenoxin
Stored source is more than a boolean, as can be seen in https://www.mongodb.com/docs/atlas/atlas-search/stored-source-definition/
Currently, mappings.fields is a free-form object with no restrictions. Bring in Atlas Search rules to add auto-complete and validation to this section of an index definition.
Since we are using TS, we have had to consume monaco-editor.editor to get TS types for IMarker and editor etc. Switch to fully using Monaco Editor for React and not depending on libraries written for JS apps. This task would be considered complete when https://github.com/manoj-pillay-10gen/simple-search-index-validator/blob/main/package.json#L18 can be removed.
While monaco is working fine, ajv-validator and json2ts are not resolving references accurately. We will need to follow $id and $uri definition conventions more closely to ensure that refs can be resolved unambiguously by any standard validator while using non-strict mode.
We figured out that "examples": [{example1, example2...}] will lead to templates. Start using templates at appropriate places.
There are a few fields in the index definition for which there could be both
Examples are analyzers and searchAnalyzer. While validation of custom values defined at run time is not possible through monaco, it should be possible not to complain when someone uses a custom value. However, in doing so, the autocomplete options for pre-defined values should not go away.
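One schema shape that might achieve this is an anyOf that pairs the list of pre-defined values with an unconstrained string. The sketch below builds such a fragment. The helper name openEnum is hypothetical, the three analyzer names are only a sample, and whether monaco keeps offering the enum branch as suggestions in this arrangement is an assumption that would need to be verified in the editor.

```typescript
// Hypothetical helper: builds a schema fragment that lists `known` values
// in one anyOf branch and accepts any other string in a second branch.
function openEnum(known: string[]): { anyOf: object[] } {
  return {
    anyOf: [
      { enum: known.slice() }, // pre-defined values
      { type: "string" },      // any custom value defined at run time
    ],
  };
}

// Sample usage with a few pre-defined analyzer names.
const analyzerSchema = openEnum(["lucene.standard", "lucene.simple", "lucene.keyword"]);
```

Because any string satisfies the second branch, a custom name is not reported as an error, while the first branch still carries the pre-defined values.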
{
mappings: {
dynamic: false
}
}
This should be an invalid index definition but we currently let it pass.
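A minimal sketch of a check that would reject the definition above, assuming the intended rule is that mappings must define at least one field when dynamic is explicitly false. The function name mappingsError is hypothetical, and treating an empty fields object as invalid is an assumption as well.

```typescript
// Hypothetical check: returns an error message when mappings.dynamic is
// false and mappings.fields is missing or empty; returns null otherwise.
function mappingsError(indexDefinition: unknown): string | null {
  if (indexDefinition === null || typeof indexDefinition !== "object") return null;
  const mappings = (indexDefinition as Record<string, unknown>)["mappings"];
  if (mappings === null || typeof mappings !== "object" || Array.isArray(mappings)) {
    return null;
  }
  const dynamic = (mappings as Record<string, unknown>)["dynamic"];
  const fields = (mappings as Record<string, unknown>)["fields"];
  const hasFields =
    fields !== null &&
    typeof fields === "object" &&
    !Array.isArray(fields) &&
    Object.keys(fields as object).length > 0;
  if (dynamic === false && !hasFields) {
    return "mappings.fields must define at least one field when mappings.dynamic is false";
  }
  return null;
}
```

This sketch leaves definitions that omit dynamic altogether untouched; whether those should be flagged too is a separate decision.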