simple-search-index-validator's People

Contributors

manoj-pillay-10gen

simple-search-index-validator's Issues

Reorganize schema files hierarchically

As JSON is hierarchical in nature, we should take advantage of this and organize the schema files hierarchically using sub-directories.

For example,

  • index
    • mappings
    • synonyms
    • tokenFilters
      • asciiFolding
      • edgeGram
      • ...
      • ...

This is currently not possible because the following library functions, which load the schemas, only read files at the top level of the schema directory and do not descend into nested directories.

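      // Lists schema ids from the top level of schemaDir (defined elsewhere) only;
      // sub-directories are not traversed.
      export function getAllSchemaIds() {
        const filenames = fs.readdirSync(schemaDir);
        return filenames.map((filename) => {
          return {
            params: {
              id: filename.replace(/\.json$/, ""),
            },
          };
        });
      }

      // Reads the raw JSON text of one schema; the id maps to a file directly under schemaDir.
      export async function getSchemaData(id: string) {
        const fullPath = path.join(schemaDir, `${id}.json`);
        const contentJson = fs.readFileSync(fullPath, "utf-8");
        return {
          id,
          contentJson,
        };
      }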

Tasks are to:

  • rewrite schema.tsx
  • reorganize schema files into subdirectories
  • modify schema to reflect change in schema paths due to reorganization.
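
As a starting point for the schema.tsx rewrite, here is a minimal sketch of a recursive listing. The helper name `listSchemaIds` and its parameters are hypothetical and not part of the repository; it only assumes that schema files end in `.json` and that a nested id can be written as a relative path such as `tokenFilters/edgeGram`.

```typescript
import * as fs from "fs";
import * as path from "path";

// Hypothetical helper (not in the repo): returns schema ids relative to rootDir,
// descending into sub-directories. Ids use "/" as separator and drop ".json".
export function listSchemaIds(rootDir: string, subDir: string = ""): string[] {
  const ids: string[] = [];
  const entries = fs.readdirSync(path.join(rootDir, subDir), { withFileTypes: true });
  for (const entry of entries) {
    const relPath = path.posix.join(subDir, entry.name);
    if (entry.isDirectory()) {
      // Recurse into nested directories such as tokenFilters/.
      ids.push(...listSchemaIds(rootDir, relPath));
    } else if (path.extname(entry.name) === ".json") {
      ids.push(relPath.replace(/\.json$/, ""));
    }
  }
  return ids.sort();
}
```

With ids of this form, reading a schema could stay close to the existing `getSchemaData`, since `path.join(schemaDir, id + ".json")` also resolves nested ids.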

Validation for interdependent properties

A very common kind of business logic for validating property values is a dependency of one property on another. For example, here is the definition of an edgeGram tokenizer:

      "tokenizer": {
        "type": "edgeGram",
        "minGram": 5,
        "maxGram": 10
      }

The expectation is that minGram <= maxGram. However, this rule cannot currently be expressed directly in the schema, because the JSON Schema specification has no keyword for comparing the values of two properties.

We will need to enforce this rule during the validation phase of the index, before it is submitted to mongot. This means the editor will not show red squiggly lines for it. The alternative is dynamic injection, which does not sound worthwhile for this.
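
A minimal sketch of such a validation-phase check follows. The function name `checkGramRange` and the error-message format are hypothetical; the sketch assumes the tokenizer object has already passed schema validation, so both fields are numbers.

```typescript
// Hypothetical shape: only the two fields that the rule depends on.
interface GramRange {
  minGram: number;
  maxGram: number;
}

// Returns a list of error messages; an empty list means the rule holds.
export function checkGramRange(tokenizer: GramRange): string[] {
  const errors: string[] = [];
  if (tokenizer.minGram > tokenizer.maxGram) {
    errors.push(
      `minGram (${tokenizer.minGram}) must be <= maxGram (${tokenizer.maxGram})`
    );
  }
  return errors;
}
```

Returning messages instead of throwing would let several cross-property rules be collected and reported together before submission.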

Dynamically inject values defined at run-time into dropdown options for `analyzers` and `searchAnalyzer`

This is a challenging requirement. In summary, people should be able to define a custom analyzer in the editor and then have it show up as a dropdown option when autocompleting fields such as:

  • analyzer
  • searchAnalyzer
  • synonyms[i].analyzer etc.

By resolving #8, we made the name of a custom analyzer defined at run time legal as the value of analyzer and the other fields listed above. However, since the JSON Schema specification does not let a schema refer to the values of other properties, we cannot make a custom analyzer name defined at run time show up as a dropdown option.

These values will need to be dynamically injected in TypeScript as the schema is being defined.
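
One way the injection could start is by deriving the option list from the current editor text. This is a sketch under assumptions: the function name `analyzerOptions` is hypothetical, and it assumes custom analyzers appear in a top-level `analyzers` array whose entries have a `name` string.

```typescript
// Hypothetical view of an index definition: only the part needed here.
interface IndexDefinitionLike {
  analyzers?: { name: string }[];
}

// Merges predefined analyzer names with custom names found in the editor text.
// If the text is not (yet) valid JSON, only the predefined names are returned.
export function analyzerOptions(predefined: string[], editorText: string): string[] {
  let custom: string[] = [];
  try {
    const parsed: IndexDefinitionLike = JSON.parse(editorText);
    custom = (parsed.analyzers || [])
      .map((a) => a.name)
      .filter((name) => typeof name === "string");
  } catch {
    // Half-typed or malformed JSON: fall back to the predefined names.
  }
  const all = predefined.concat(custom);
  // Drop duplicates while keeping first-seen order.
  return all.filter((name, i) => all.indexOf(name) === i);
}
```

The resulting list could then be written into the `enum` of the relevant schema properties each time the editor content changes, before the schema is handed back to the editor.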

Extract and add known index definitions from mongot

mongot is the source of truth for index definitions. They have a curated list of index definitions that we could use as tests for our schema. To start with, import them and validate them as part of the existing pre-commit hook.

Reuse index.json in fullIndex.json

Currently, we have to duplicate code between index.json and fullIndex.json. Instead, there may be a way to extend siblings of a schema to eliminate the duplication and the drift caused by maintaining redundant copies.

fullIndex.json is more than ornamental, because the API that needs the auto-generated YAML spec requires the additional components available in fullIndex.json.

Define git submodule to make schema files in MMS the single source of truth

It will soon be painful to keep the schema files defined in MMS and in simple-search-index-validator consistent. A single source of truth is essential to make development easy. One approach may be to define the schema directory in MMS as a submodule of simple-search-index-validator.

Caveat: Certain parts of the schema definition are diverging already. For example, index.json does not contain database or collection names in MMS but has those fields in simple-search-index-validator. We will need a way to have a divergent section and a convergent section for each submodule, i.e. there should be a way to define an index.json for simple-search-index-validator that overrides the base copy in MMS, which will be the original source of truth.

Eliminate *.invalid files

Many *.invalid files have been added, for example in https://github.com/manoj-pillay-10gen/simple-search-index-validator/tree/4ca4669f4a15a40c63855aaa533ac1c198122e53/data/sampleData/JSONEditorSamples/autocomplete. These files are valid according to mongot syntax definitions but invalid according to our schema definition.

Two paths for resolution:

Path 1

  • Modify schema to meet mongot syntax
  • In each of the directories with invalid files, run: for f in $(find . -maxdepth 1 -type f -name "*.invalid"); do mv "$f" "$(basename "$f" .invalid)"; done

Path 2

  • Keep schema intact after confirming with query team.
  • Modify/delete *.invalid files to pass validation.

Supporting mappings.fields in schema

Currently, mappings.fields is a free-form object with no restrictions. Bring in Atlas Search rules to support autocomplete and validation for this section of an index definition.

Resolve refs correctly.

While monaco is working fine, ajv-validator and json2ts are not resolving references accurately. We will need to follow $id and $uri definition conventions more closely to ensure that refs can be resolved unambiguously by any standard validator in non-strict mode.

Allow custom values for analyzer, searchAnalyzer etc.

There are a few fields in the index definition for which there could be both

  • predefined values
  • custom values

Examples are analyzer and searchAnalyzer. While validation of custom values defined at run time is not possible through monaco, it should be possible not to complain when someone uses a custom value. However, in doing so, the autocomplete options for the predefined values should not go away.
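
One common JSON Schema pattern for this is to combine an `enum` of the predefined values with an unconstrained string branch under `anyOf`. The sketch below builds such a fragment; the function name `allowCustomValues` is hypothetical, and whether a given editor still offers the `enum` values as completions for this shape would need to be verified.

```typescript
// Schema fragment that accepts any string but still lists the predefined values.
interface PredefinedOrCustom {
  anyOf: [{ enum: string[] }, { type: "string" }];
}

// Hypothetical helper: wraps predefined values so that custom strings also validate.
export function allowCustomValues(predefined: string[]): PredefinedOrCustom {
  return {
    anyOf: [
      { enum: predefined.slice() }, // source of autocomplete suggestions
      { type: "string" }, // any other string is accepted without complaint
    ],
  };
}

// Example with a few built-in Atlas Search analyzer names.
export const analyzerSchema = allowCustomValues([
  "lucene.standard",
  "lucene.simple",
  "lucene.whitespace",
  "lucene.keyword",
]);
```

Because `anyOf` passes when at least one branch matches, a predefined name satisfies both branches and a custom name satisfies only the string branch, so neither is reported as an error.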
