json5format's Introduction

json5format

json5format is a general purpose Rust library that formats JSON5 (a.k.a., "JSON for Humans"), preserving contextual line and block comments.

json5format Rust library

The json5format library includes APIs to customize the document format. Style options can be set globally (affecting the entire document) or tailored to specific subsets of a target JSON5 schema. (See the Rust package documentation for more details and examples.) As of version 0.2.0, the public APIs also offer limited support for accessing the information inside a parsed document, and for injecting or modifying comments.

formatjson5 command line tool

The json5format package also bundles an example command line tool, formatjson5, that formats JSON5 documents using a basic style with some customizations available through command line options:

$ cargo build --example formatjson5
$ ./target/debug/examples/formatjson5 --help

formatjson5 [FLAGS] [OPTIONS] [files]...

FLAGS:
-h, --help                  Prints help information
-n, --no_trailing_commas    Suppress trailing commas (otherwise added by default)
-o, --one_element_lines     Objects or arrays with a single child should collapse to a
                            single line; no trailing comma
-r, --replace               Replace (overwrite) the input file with the formatted result
-s, --sort_arrays           Sort arrays of primitive values (string, number, boolean, or
                            null) lexicographically
-V, --version               Prints version information

OPTIONS:
-i, --indent <indent>    Indent by the given number of spaces [default: 4]

ARGS:
<files>...    Files to format (use "-" for stdin)

NOTE: This is not an officially supported Google product.

json5format's Issues

Support multi-line end-of-line comment formatting

For example:

        someprop: "value", // This is a long end of line comment that might be broken into
                           // more than one line. json5format should make no assumptions about:
                           //   * where the line break is or
                           //   * how many spaces there are after the initial slashes.

I believe this is currently reformatted as:

        someprop: "value", // This is a long end of line comment that might be broken into

        // more than one line. json5format should make no assumptions about:
        //   * where the line break is or
        //   * how many spaces there are after the initial slashes.

But ideally, if the subsequent line comments were indented to match the slash position of the end-of-line comment on the first line of this example, the result would likely look identical to the original input (which is preferred).
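The alignment described above could be sketched roughly as follows. This is only an illustration, not json5format's implementation: the function name `align_continuation_comments` is hypothetical, and the sketch naively assumes the first `//` on the first line starts the comment (so it would be confused by `//` inside a string value).

```rust
/// Re-indents continuation comment lines so their slashes line up with the
/// end-of-line comment on `first_line`. Everything after the leading
/// whitespace of each continuation line (including spacing after the
/// slashes) is preserved as-is.
///
/// Hypothetical helper for illustration; not part of the json5format API.
fn align_continuation_comments(first_line: &str, continuations: &[&str]) -> Vec<String> {
    let mut out = vec![first_line.to_string()];
    // Naive assumption: the first "//" begins the end-of-line comment.
    let column = match first_line.find("//") {
        Some(byte_idx) => first_line[..byte_idx].chars().count(),
        None => {
            // No end-of-line comment found: leave the other lines untouched.
            out.extend(continuations.iter().map(|line| line.to_string()));
            return out;
        }
    };
    for line in continuations {
        out.push(format!("{}{}", " ".repeat(column), line.trim_start()));
    }
    out
}
```

Because only the leading whitespace is replaced, the bullet-style spacing after the slashes in the example survives unchanged.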

Implement a streaming API, and read input lazily

Depending on the formatter configuration, some nested levels will not be streamable, but for any layer that does not reorder its elements, that layer can be streamed.

In particular, the top level ParsedDocument is represented as an outermost Array, most commonly consisting of Object-typed elements, which are typically not reordered.

In a fairly common scenario, very large documents are large because they contain hundreds or thousands of top-level objects, while any individual object is likely of a much more manageable size. With streaming, we should be able to format each object as it is parsed, which will be much better for large documents than the current implementation, which requires reading the entire document into memory before formatting.

Only read from the input to the parser as needed (lazily) as the formatter completes formatting the previously streamed content. (This is sometimes referred to as "backpressure" provided by the formatter, to limit the flow of input from the parser.)
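As a rough illustration of the pull-based flow described here, Rust iterators are already lazy, so a formatter that asks for one item at a time naturally limits how much the parser reads. The sketch below uses stand-in types and functions (`TopLevelItem`, `parse_lazily`, `format_next` are all hypothetical, and the "parsing" is just trimming a pre-split chunk); it is not how json5format works today.

```rust
use std::cell::Cell;
use std::io::{self, Write};

/// Hypothetical stand-in for one parsed top-level object.
struct TopLevelItem(String);

/// Stand-in "parser": yields one item per input chunk, and only does the
/// work (counted in `reads`) when the consumer asks for the next item.
fn parse_lazily<'a>(
    chunks: &'a [&'a str],
    reads: &'a Cell<usize>,
) -> impl Iterator<Item = TopLevelItem> + 'a {
    chunks.iter().map(move |chunk| {
        reads.set(reads.get() + 1);
        TopLevelItem(chunk.trim().to_string())
    })
}

/// Stand-in "formatter": pulls exactly one item, writes it, and reports
/// whether there was anything left to format.
fn format_next<W: Write>(
    items: &mut impl Iterator<Item = TopLevelItem>,
    out: &mut W,
) -> io::Result<bool> {
    match items.next() {
        Some(item) => {
            writeln!(out, "{},", item.0)?;
            Ok(true)
        }
        None => Ok(false),
    }
}
```

Calling `format_next` once advances the `reads` counter by exactly one, which is the backpressure effect: input is consumed only as fast as formatted output is produced.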

Panic on numeric scientific notation

Minimal example:

fn main() {
    let json_string = r#"{"hello":3.14e-8}"#;
    json5format::format(json_string, None, None).unwrap();
}

Output:

$ cargo run --release
    Finished release [optimized] target(s) in 0.02s
     Running `target/release/json5-test`
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Parse(Some(1:12), "Object values require property names:\n{\"hello\":3.14e-8}\n           ^~~~~")', src/main.rs:3:50
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Consider using nom or another parser-support library?

The current regex parsing strategy looks like it works quite well and is very well tested, but I have concerns about the efficiency and maintainability of regex-based parsers. This (really awesome!) tool seems likely to be with us for the long haul, so I'd encourage the authors to look into using a crate like nom or pest for handling the actual parsing. nom has a fairly concise example of parsing JSON, which I imagine could be adapted to support JSON5 and the additional parsing done there.

Consider integration with serde

The Rust serde (serialize/deserialize) API includes a JSON5 parser that converts the data into a Rust in-memory representation. This parser currently ignores comments. The data can be converted back into JSON, but the comments and formatting will be lost.

Consider integrating the json5format capabilities into serde by merging with and extending the current serde JSON5 parser.

Support option to normalize string quotes

Simple quoted strings with inconsistent quotes might look like:

   [
      'string1',
      "string2",
   ]

The current version (0.1.0) does not change the quote style from the original input, in case the selected quote style was intentional. For example:

   {
      nickname: 'Tracy "The OG" Morgan',
      agent: "Doug O'Dell",
      personal_trainer: "Stupid Judy of EPCOT",
   }

A possible setting would specify the desired quote style. If, for the first example, double-quote is preferred, the result would be:

   [
      "string1",
      "string2",
   ]

Perhaps the desired quote should only apply if the string has no embedded quotes of the same style as the selected quote style. Otherwise, just use the original quote style for that given value. For the second example, preferring double-quotes would result in the same mixed quote style as the original input, but preferring single-quotes would change only the last property:

   {
      nickname: 'Tracy "The OG" Morgan',
      agent: "Doug O'Dell",
      personal_trainer: 'Stupid Judy of EPCOT',
   }
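The rule proposed above (switch to the preferred quote only when the string contains no embedded quote of that style) might look something like this. The function name `normalize_quotes` is hypothetical and not part of json5format; the sketch works on the raw literal text, including its surrounding quotes.

```rust
/// Re-quotes a JSON5 string literal with `preferred` (either '"' or '\''),
/// unless the string body already contains that quote character, in which
/// case the literal is returned unchanged.
///
/// Hypothetical helper for illustration; not part of the json5format API.
fn normalize_quotes(literal: &str, preferred: char) -> String {
    let mut chars = literal.chars();
    let (first, last) = (chars.next(), chars.next_back());
    // Only touch text that is wrapped in a matching pair of quotes.
    let is_quoted = matches!(first, Some('"') | Some('\'')) && first == last;
    if !is_quoted {
        return literal.to_string();
    }
    // What remains in the iterator is the body between the two quotes.
    let body = chars.as_str();
    if body.contains(preferred) {
        literal.to_string()
    } else {
        format!("{}{}{}", preferred, body, preferred)
    }
}
```

Applied to the second example with single quotes preferred, only `"Stupid Judy of EPCOT"` changes; the `nickname` and `agent` values keep their original quotes.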

Implement a JSON5 with Comments representation in JSON

I have a design document that describes how one might generate a pure JSON document that includes a lossless representation of the JSON5 data and its comments, including all metadata currently implemented in the json5format API to reconstitute the comments with intended relationships and formatting constraints.

This could be used to feed a JSON5 document to a JSON tool, manipulate the content, and reconstitute the JSON5 without losing any context, associations, or dependencies.

Ideally, JSON5-specific extensions for primitive value formats would be handled by converting the JSON5 string representation of the value to a JSON-native representation, and adding metadata to save the original format.
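To make the idea concrete for one case, a JSON5 hexadecimal integer (which JSON does not allow) could be converted to a decimal JSON number while keeping the original spelling as metadata. The sketch below is only an assumption about how this might look: the `NativeNumber` type and `hex_to_json_native` function are hypothetical, the design document may specify something different, and values outside the `u64` range are simply rejected.

```rust
/// A JSON-native number together with the original JSON5 spelling, so the
/// original format can be restored later.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
struct NativeNumber {
    json: String,
    original: String,
}

/// Converts a JSON5 hexadecimal integer such as `0x1F`, `-0Xff`, or `+0x10`
/// to a decimal JSON number. Returns `None` for anything that is not a hex
/// literal or does not fit in a `u64`.
fn hex_to_json_native(json5: &str) -> Option<NativeNumber> {
    let (sign, unsigned) = match json5.strip_prefix('-') {
        Some(rest) => ("-", rest),
        // JSON5 allows an explicit leading plus sign; JSON does not.
        None => ("", json5.strip_prefix('+').unwrap_or(json5)),
    };
    let digits = unsigned
        .strip_prefix("0x")
        .or_else(|| unsigned.strip_prefix("0X"))?;
    if digits.is_empty() || !digits.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    let magnitude = u64::from_str_radix(digits, 16).ok()?;
    Some(NativeNumber {
        json: format!("{}{}", sign, magnitude),
        original: json5.to_string(),
    })
}
```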

Implement more fuzzers

Some initial fuzzing is now performed in CI by oss-fuzz. It would be good to implement more fuzzers.

Improve handling errors by intelligently trimming extremely long lines, when showing the location of the error

An issue revealed by oss-fuzz involved a generated document that had thousands of open braces on a single line.

Once the parser was able to catch the error without crashing, it generated an error message that filled my terminal scroll buffer, hiding useful debugging information.

We should be able to trim the line and still show where the syntax error was caught.
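One way the trimming could work is to show a fixed-width window of the line around the error column, with ellipses marking the omitted parts and a caret line adjusted to match. This is a minimal sketch under assumptions: `trim_error_line` is a hypothetical helper (not json5format's error code), `col` is a 0-based character index, and `max_width` counts characters shown excluding the ellipses.

```rust
/// Returns `(shown_line, caret_line)`, where `shown_line` contains at most
/// `max_width` characters of `line` around the 0-based character column
/// `col` (plus "..." markers where text was cut), and `caret_line` places a
/// `^` under the error position within `shown_line`.
///
/// Hypothetical helper for illustration only.
fn trim_error_line(line: &str, col: usize, max_width: usize) -> (String, String) {
    let chars: Vec<char> = line.chars().collect();
    if chars.len() <= max_width {
        // Short enough already: show the whole line.
        return (line.to_string(), format!("{}^", " ".repeat(col)));
    }
    // Center the window on the error, clamped so it stays inside the line.
    let start = col.saturating_sub(max_width / 2).min(chars.len() - max_width);
    let end = start + max_width;
    let mut shown: String = chars[start..end].iter().collect();
    let mut caret_pos = col - start;
    if start > 0 {
        shown.insert_str(0, "...");
        caret_pos += 3; // account for the leading ellipsis
    }
    if end < chars.len() {
        shown.push_str("...");
    }
    (shown, format!("{}^", " ".repeat(caret_pos)))
}
```

For a line of 1000 open braces with the error at column 500 and a width of 20, this shows 20 braces between two ellipses, with the caret under the brace at the error position.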
