google / json5format

JSON5 (a.k.a., "JSON for Humans") formatter that preserves contextual comments

Home Page: https://crates.io/crates/json5format

License: BSD 3-Clause "New" or "Revised" License

Rust 100.00%

json5 rust rust-library

json5format's Introduction

json5format

json5format is a general purpose Rust library that formats JSON5 (a.k.a., "JSON for Humans"), preserving contextual line and block comments.

`json5format` Rust library

The json5format library includes APIs to customize the document format, with style options configurable both globally (affecting the entire document) and for specific subsets of a target JSON5 schema. (See the Rust package documentation for more details and examples.) As of version 0.2.0, public APIs provide limited support for accessing the information inside a parsed document, and for injecting or modifying comments.

`formatjson5` command line tool

The json5format package also bundles an example command line tool, formatjson5, that formats JSON5 documents using a basic style with some customizations available through command line options:

$ cargo build --example formatjson5
$ ./target/debug/examples/formatjson5 --help

formatjson5 [FLAGS] [OPTIONS] [files]...

FLAGS:
-h, --help                  Prints help information
-n, --no_trailing_commas    Suppress trailing commas (otherwise added by default)
-o, --one_element_lines     Objects or arrays with a single child should collapse to a
                            single line; no trailing comma
-r, --replace               Replace (overwrite) the input file with the formatted result
-s, --sort_arrays           Sort arrays of primitive values (string, number, boolean, or
                            null) lexicographically
-V, --version               Prints version information

OPTIONS:
-i, --indent <indent>    Indent by the given number of spaces [default: 4]

ARGS:
<files>...    Files to format (use "-" for stdin)

NOTE: This is not an officially supported Google product.

json5format's Issues

Panic on numeric scientific notation

Minimal example:

fn main() {
    let json_string = r#"{"hello":3.14e-8}"#;
    json5format::format(json_string, None, None).unwrap();
}

Output:

$ cargo run --release
    Finished release [optimized] target(s) in 0.02s
     Running `target/release/json5-test`
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Parse(Some(1:12), "Object values require property names:\n{\"hello\":3.14e-8}\n           ^~~~~")', src/main.rs:3:50
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
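For reference, JSON5 numeric literals allow a signed exponent such as `e-8`, which is the part the error message above points at. The following is a minimal, permissive sketch of recognizing such literals using only the standard library. The helper name `is_json5_number` is hypothetical and this is not the crate's parser; it also accepts leading zeros, which a strict JSON5 parser would reject.

```rust
/// Hypothetical helper (not part of json5format): returns true if `s` looks
/// like a JSON5 numeric literal, including scientific notation.
fn is_json5_number(s: &str) -> bool {
    // Optional leading sign.
    let body = s
        .strip_prefix('+')
        .or_else(|| s.strip_prefix('-'))
        .unwrap_or(s);
    if body == "Infinity" || body == "NaN" {
        return true;
    }
    // Hexadecimal integers: 0x / 0X followed by at least one hex digit.
    if let Some(hex) = body.strip_prefix("0x").or_else(|| body.strip_prefix("0X")) {
        return !hex.is_empty() && hex.bytes().all(|b| b.is_ascii_hexdigit());
    }
    // Split the mantissa from an optional exponent.
    let (mantissa, exponent) = match body.find(|c: char| c == 'e' || c == 'E') {
        Some(i) => (&body[..i], Some(&body[i + 1..])),
        None => (body, None),
    };
    // The mantissa may have a leading or trailing decimal point,
    // but needs at least one digit overall.
    let (int_part, frac_part) = match mantissa.find('.') {
        Some(i) => (&mantissa[..i], &mantissa[i + 1..]),
        None => (mantissa, ""),
    };
    let all_digits = |t: &str| t.bytes().all(|b| b.is_ascii_digit());
    if int_part.is_empty() && frac_part.is_empty() {
        return false;
    }
    if !all_digits(int_part) || !all_digits(frac_part) {
        return false;
    }
    // The exponent, if present, is an optionally signed run of digits.
    match exponent {
        None => true,
        Some(e) => {
            let digits = e
                .strip_prefix('+')
                .or_else(|| e.strip_prefix('-'))
                .unwrap_or(e);
            !digits.is_empty() && all_digits(digits)
        }
    }
}
```

With this sketch, `is_json5_number("3.14e-8")` is accepted, while a dangling exponent such as `3.14e` is rejected.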

Make the example program the official binary/program of the crate

Basically, enable cargo install json5format to build an executable that can be used in the CLI. There seems to be a pretty well done implementation as the example, so perhaps this isn't too big of a change?

OSS-Fuzz issue 63220

OSS-Fuzz has found a bug in this project. Please see https://oss-fuzz.com/testcase?key=5798178911551488 for details and reproducers.

This issue is mirrored from https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63220 and will auto-close if the status changes there.

If you have trouble accessing this report, please file an issue at https://github.com/google/oss-fuzz/issues/new.

OSS-Fuzz issue 45360

OSS-Fuzz has found a bug in this project. Please see https://oss-fuzz.com/testcase?key=6378075993014272 for details and reproducers.

This issue is mirrored from https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=45360 and will auto-close if the status changes there.

If you have trouble accessing this report, please file an issue at https://github.com/google/oss-fuzz/issues/new.

OSS-Fuzz issue 49666

OSS-Fuzz has found a bug in this project. Please see https://oss-fuzz.com/testcase?key=6258443249385472 for details and reproducers.

This issue is mirrored from https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=49666 and will auto-close if the status changes there.

If you have trouble accessing this report, please file an issue at https://github.com/google/oss-fuzz/issues/new.

If a document ends with an unclosed block comment, the parser will crash

While fixing another problem revealed by an oss-fuzz test, the given sample also revealed a problem when parsing a document that has an open block comment that is never closed. Instead of crashing, this should be a parser error.
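A minimal sketch of the intended behavior, assuming a hypothetical helper `block_comment_body` that is not part of the json5format API: an unterminated `/*` is reported as an `Err` value rather than causing a panic. Blank lines inside the comment body need no special handling in this approach.

```rust
/// Hypothetical helper: given input that starts with `/*`, returns the
/// comment body, or an error if the comment is never closed.
fn block_comment_body(input: &str) -> Result<&str, String> {
    let rest = input
        .strip_prefix("/*")
        .ok_or_else(|| "expected `/*`".to_string())?;
    match rest.find("*/") {
        // Everything up to the first `*/` is the comment body.
        Some(end) => Ok(&rest[..end]),
        // End of input reached first: report it instead of crashing.
        None => Err("block comment opened with `/*` is never closed".to_string()),
    }
}
```

A real parser would attach the line and column of the opening `/*` to the error; that detail is omitted here.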

Address assertion failed: self.scope_stack.len() > 0, revealed by oss-fuzz testing

Some malformed documents can cause the parser to fail an assertion. This should be caught and produce a parser error, showing the syntax error in the JSON5 document. See for example:

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=42395
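One way to turn this kind of assertion into a parser error is to treat a pop from an empty scope stack as a syntax error. The sketch below is an illustration only: `check_scopes` is a hypothetical function, and it ignores strings and comments, which a real JSON5 parser must skip over.

```rust
/// Hypothetical check: verifies that braces and brackets are balanced,
/// returning an error message instead of asserting on an empty stack.
fn check_scopes(input: &str) -> Result<(), String> {
    let mut scope_stack: Vec<char> = Vec::new();
    for (offset, ch) in input.char_indices() {
        match ch {
            '{' | '[' => scope_stack.push(ch),
            '}' | ']' => {
                let expected_open = if ch == '}' { '{' } else { '[' };
                match scope_stack.pop() {
                    Some(open) if open == expected_open => {}
                    Some(open) => {
                        return Err(format!(
                            "byte {}: `{}` does not close `{}`",
                            offset, ch, open
                        ))
                    }
                    // This is the case that would otherwise trip the assertion.
                    None => {
                        return Err(format!(
                            "byte {}: `{}` has no matching opener",
                            offset, ch
                        ))
                    }
                }
            }
            _ => {}
        }
    }
    match scope_stack.last() {
        Some(open) => Err(format!("unclosed `{}` at end of document", open)),
        None => Ok(()),
    }
}
```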

Address out-of-memory issue revealed in some oss-fuzz tests

At least one oss-fuzz test resulted in an out-of-memory crash. See for example:

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32013

Address frequent oss-fuzz error: byte index <n> is not a char boundary; it is inside

json5format is intended to support JSON5 documents with UTF-8 encoding, but oss-fuzz tests indicate there is a problem with Unicode characters in some situations. See for example:

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=42835
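This class of panic typically comes from slicing a `&str` at a byte offset that falls inside a multi-byte UTF-8 character. A minimal sketch of a defensive slice, assuming a hypothetical helper `safe_prefix` (not taken from the crate), backs up to the nearest character boundary using the standard `str::is_char_boundary` method:

```rust
/// Hypothetical helper: returns at most `max_bytes` bytes of `s`,
/// never splitting a multi-byte UTF-8 character.
fn safe_prefix(s: &str, max_bytes: usize) -> &str {
    if max_bytes >= s.len() {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

Alternatively, `s.get(range)` returns `None` rather than panicking when either end of the range is not on a character boundary.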

Implement more fuzzers

Some initial fuzzing is now performed in CI by oss-fuzz. It would be good to implement more fuzzers.

Consider integration with serde

The Rust serde (serialize/deserialize) API includes a JSON5 parser that converts the data into a Rust in-memory representation. This parser currently ignores comments. The data can be converted back into JSON, but the comments and formatting will be lost.

Consider integrating the json5format capabilities into serde by merging with and extending the current serde JSON5 parser.

Implement a streaming API, and read input lazily

Depending on the formatter configuration, some nested levels will not be streamable, but for any layer that does not reorder its elements, that layer can be streamed.

In particular, the top level ParsedDocument is represented as an outermost Array, most commonly consisting of Object-typed elements, which are typically not reordered.

In a fairly common scenario, some very large documents are large because they contain hundreds or thousands of top-level objects, but any individual object is more than likely of a much more manageable size. With streaming, we should be able to format each object as it's parsed, which will be much better for large documents than the current implementation that requires reading the entire document into memory before formatting.

Only read from the input to the parser as needed (lazily) as the formatter completes formatting the previously streamed content. (This is sometimes referred to as "backpressure" provided by the formatter, to limit the flow of input from the parser.)
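The lazy, pull-based idea can be sketched with a plain Rust iterator. Everything here is hypothetical and simplified: `TopLevelObjects` is not a json5format type, and it only counts braces, ignoring strings, comments, and arrays. The point is that input is consumed only when the consumer asks for the next object.

```rust
/// Hypothetical iterator that yields one top-level `{...}` object at a
/// time, reading from `input` only as far as needed for that object.
struct TopLevelObjects<I: Iterator<Item = char>> {
    input: I,
}

impl<I: Iterator<Item = char>> Iterator for TopLevelObjects<I> {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        let mut depth = 0usize;
        let mut current = String::new();
        // Pull characters only until one top-level object is complete.
        for ch in self.input.by_ref() {
            match ch {
                '{' => {
                    depth += 1;
                    current.push(ch);
                }
                '}' if depth > 0 => {
                    depth -= 1;
                    current.push(ch);
                    if depth == 0 {
                        return Some(current);
                    }
                }
                // Inside an object: keep the character.
                _ if depth > 0 => current.push(ch),
                // Between top-level objects: skip separators.
                _ => {}
            }
        }
        // Input exhausted; an unfinished trailing object is dropped here.
        None
    }
}
```

A formatter built on such an iterator could format and emit each object before requesting the next one, so only one top-level object needs to be held in memory at a time.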

Support multi-line end-of-line comment formatting

For example:

        someprop: "value", // This is a long end of line comment that might be broken into
                           // more than one line. json5format should make no assumptions about:
                           //   * where the line break is or
                           //   * how many spaces there are after the initial slashes.

I believe this is currently reformatted as:

        someprop: "value", // This is a long end of line comment that might be broken into

        // more than one line. json5format should make no assumptions about:
        //   * where the line break is or
        //   * how many spaces there are after the initial slashes.

But ideally, if the subsequent line comments were indented to match the slash position of the end-of-line comment on the first line of this example, the result would likely look identical to the original input (which is preferred).
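A rough sketch of that alignment step, assuming the continuation comment lines have already been identified as belonging to the end-of-line comment. The function `align_continuations` is hypothetical, and it locates the comment with a naive search for `//` that would be fooled by slashes inside a string value.

```rust
/// Hypothetical helper: re-indents continuation comment lines so their
/// slashes line up with the `//` on the first line. Text after the
/// slashes is left untouched.
fn align_continuations(first_line: &str, continuations: &[&str]) -> Vec<String> {
    // Column (in characters) of the end-of-line comment on the first line.
    let column = match first_line.find("//") {
        Some(byte_idx) => first_line[..byte_idx].chars().count(),
        None => 0,
    };
    let mut out = vec![first_line.to_string()];
    for line in continuations {
        out.push(format!("{}{}", " ".repeat(column), line.trim_start()));
    }
    out
}
```

Because only leading whitespace is replaced, the spacing after the initial slashes (such as the indented bullet list in the example) is preserved.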

Consider using nom or another parser-support library?

The current regex parsing strategy looks like it works quite well and is very well tested, but I have concerns about the efficiency and maintainability of regex-based parsers. This (really awesome!) tool seems likely to be with us for the long haul, so I'd encourage the authors to look into using a crate like nom or pest for handling the actual parsing. nom has a fairly concise example of using it to parse JSON, which I imagine could be adapted to support JSON5 and the additional parsing done there.

OSS-Fuzz issue 43787

OSS-Fuzz has found a bug in this project. Please see https://oss-fuzz.com/testcase?key=6612345919504384 for details and reproducers.

This issue is mirrored from https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=43787 and will auto-close if the status changes there.

If you have trouble accessing this report, please file an issue at https://github.com/google/oss-fuzz/issues/new.

parser crashes if a block comment includes a blank line

This should not make the parser crash:

        /*
          what happens

          with this 5
        */
        /*

          what happens

          with this 6
        */

Bug: formatjson5 CLI tool: removes quotation marks

Input:

{
    "folders": [
        {
            "path": "..",
        },
    ],
}

Output:

{
    folders: [
        {
            path: "..",
        },
    ],
}

IMHO the output is not valid JSON5.
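For context, the JSON5 specification permits object keys to be written without quotes when the key is an ECMAScript 5.1 IdentifierName. The sketch below approximates that rule for ASCII-only keys. The name `can_be_unquoted` is hypothetical, and the real rule also admits Unicode letters and escape sequences, which this check conservatively rejects.

```rust
/// Hypothetical check (ASCII-only approximation): can this object key be
/// written without quotes in JSON5?
fn can_be_unquoted(key: &str) -> bool {
    let mut chars = key.chars();
    match chars.next() {
        // First character: a letter, `_`, or `$`.
        Some(first) if first.is_ascii_alphabetic() || first == '_' || first == '$' => {
            // Remaining characters may also be digits.
            chars.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '$')
        }
        _ => false,
    }
}
```

Under this check, `folders` and `path` from the example qualify, while a key such as `my-key` would need to keep its quotes.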

Improve handling errors by intelligently trimming extremely long lines, when showing the location of the error

While fixing an issue revealed by oss-fuzz, the error was related to a generated document that had thousands of open braces on a single line.

Once the parser was able to catch the error without crashing, it generated an error message that filled my terminal scroll buffer, hiding useful debugging information.

We should be able to trim the line and still show where the syntax error was caught.
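One possible approach, sketched under assumptions: show a fixed-width window of the line centered on the error column, mark the elided parts with `...`, and adjust the column so a caret can still be drawn under the right character. The function name `trim_line` and the choice of `...` as the marker are illustrative only.

```rust
/// Hypothetical helper: returns a window of `line` at most `max_width`
/// characters wide (plus `...` markers) around `error_col`, together with
/// the error column relative to the returned text. Columns are counted in
/// characters, starting at 0.
fn trim_line(line: &str, error_col: usize, max_width: usize) -> (String, usize) {
    let chars: Vec<char> = line.chars().collect();
    if chars.len() <= max_width {
        return (line.to_string(), error_col);
    }
    // Center the window on the error, clamped to the end of the line.
    let half = max_width / 2;
    let start = error_col
        .saturating_sub(half)
        .min(chars.len() - max_width);
    let end = start + max_width;

    let mut shown = String::new();
    if start > 0 {
        shown.push_str("...");
    }
    shown.extend(&chars[start..end]);
    if end < chars.len() {
        shown.push_str("...");
    }
    // Account for the leading marker when repositioning the caret.
    let prefix = if start > 0 { 3 } else { 0 };
    (shown, error_col - start + prefix)
}
```

Working on a `Vec<char>` rather than byte offsets avoids slicing inside a multi-byte character, at the cost of one allocation per reported error.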

Packaged and available in Arch Linux's AUR

As a heads up, I've created a package for formatjson5, and it is now available in the AUR.

https://aur.archlinux.org/packages/formatjson5

(Related to #40 as that would make packaging this even simpler, not that it's particularly complicated.)

Address stack overflow error revealed by oss-fuzz if there are too many unclosed braces

In the unlikely event that a document has thousands of open braces without closing braces, the program can crash with a stack overflow. See for example:

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34603
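A common mitigation for this kind of stack overflow is to cap the nesting depth and report an error once the cap is exceeded. The sketch below shows the idea as an iterative pre-check. The function `deepest_nesting` and the notion of a configurable `max_depth` are assumptions for illustration, not existing json5format options, and strings and comments are ignored for brevity.

```rust
/// Hypothetical pre-check: returns the deepest nesting level found, or an
/// error as soon as nesting exceeds `max_depth`. Runs in a loop, so the
/// check itself cannot overflow the stack.
fn deepest_nesting(input: &str, max_depth: usize) -> Result<usize, String> {
    let mut depth = 0usize;
    let mut deepest = 0usize;
    for (offset, ch) in input.char_indices() {
        match ch {
            '{' | '[' => {
                depth += 1;
                if depth > max_depth {
                    return Err(format!(
                        "byte {}: nesting exceeds the limit of {}",
                        offset, max_depth
                    ));
                }
                deepest = deepest.max(depth);
            }
            // Unbalanced closers are not this check's concern.
            '}' | ']' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }
    Ok(deepest)
}
```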

Support option to normalize string quotes

Simple quoted strings with inconsistent quotes might look like:

   [
      'string1',
      "string2",
   ]

The current version (0.1.0) does not change the quote style from the original input, in case the selected quote style was intentional. For example:

   {
      nickname: 'Tracy "The OG" Morgan',
      agent: "Doug O'Dell",
      personal_trainer: "Stupid Judy of EPCOT",
   }

A possible setting would specify the desired quote style. If, for the first example, double-quote is preferred, the result would be:

   [
      "string1",
      "string2",
   ]

Perhaps the desired quote should only apply if the string has no embedded quotes of the same style as the selected quote style. Otherwise, just use the original quote style for that given value. For the second example, preferring double-quotes would result in the same mixed quote style as the original input, but preferring single-quotes would change only the last property:

   {
      nickname: 'Tracy "The OG" Morgan',
      agent: "Doug O'Dell",
      personal_trainer: 'Stupid Judy of EPCOT',
   }
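The rule proposed above can be sketched as a small function over a single string literal. The name `normalize_quotes` is hypothetical, and the sketch assumes `preferred` is either `'` or `"`. A literal is re-quoted only when its body contains no occurrence of the preferred quote character, escaped or not, which keeps the rewrite trivially safe.

```rust
/// Hypothetical helper: rewrites a quoted JSON5 string literal to use the
/// `preferred` quote character, unless the body already contains that
/// character. Anything that is not a quoted literal is returned unchanged.
fn normalize_quotes(literal: &str, preferred: char) -> String {
    let mut chars = literal.chars();
    let (first, last) = (chars.next(), chars.next_back());
    let is_quoted = first == last && matches!(first, Some('\'') | Some('"'));
    if !is_quoted {
        return literal.to_string();
    }
    // Both quote characters are one byte, so these offsets are valid.
    let body = &literal[1..literal.len() - 1];
    if first == Some(preferred) || body.contains(preferred) {
        // Already in the preferred style, or re-quoting would require
        // escaping: keep the original quote style.
        return literal.to_string();
    }
    format!("{0}{1}{0}", preferred, body)
}
```

Applied to the second example with single quotes preferred, only `"Stupid Judy of EPCOT"` changes, since `"Doug O'Dell"` contains a single quote in its body.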

Implement a JSON5 with Comments representation in JSON

I have a design document that describes how one might generate a pure JSON document that includes a lossless representation of the JSON5 data and its comments, including all metadata currently implemented in the json5format API to reconstitute the comments with intended relationships and formatting constraints.

This could be used to feed a JSON5 document to a JSON tool, manipulate the content, and reconstitute the JSON5 without losing any context, associations, or dependencies.

Ideally, JSON5-specific extensions for primitive value formats would be handled by converting the JSON5 string representation of the value to a JSON-native representation, and adding metadata to save the original format.

Move anyhow to dev-dependencies?

~~For the library crate at least it would be cool if it exported a type that implements std::error::Error without type erasure (thiserror is a nice way to do this). Context:~~

~~https://github.com/dtolnay/anyhow#comparison-to-thiserror~~

EDIT: It looks like anyhow is only used in doctests and examples. Maybe it could be moved in the Cargo.toml to make that clearer?

OSS-Fuzz issue 43706

OSS-Fuzz has found a bug in this project. Please see https://oss-fuzz.com/testcase?key=5734884822351872 for details and reproducers.

This issue is mirrored from https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=43706 and will auto-close if the status changes there.

If you have trouble accessing this report, please file an issue at https://github.com/google/oss-fuzz/issues/new.

Add option to format objects on a single line if they fit

A Fuchsia example in which the diff consisted largely of going from manually formatted to auto-formatted JSON5 using this tool, and previously single-line declarations were expanded: https://fuchsia-review.googlesource.com/c/fuchsia/+/376278/14..16/src/sys/bootstrap/meta/fshost.cml
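A minimal sketch of how such an option might behave, with all names hypothetical: render the object on one line first, and fall back to one property per line only if the single-line form exceeds a width limit. Values are passed in pre-formatted, and comments are not considered; an object with attached comments would likely need to stay multi-line.

```rust
/// Hypothetical formatter: emits `{ k: v, ... }` on one line when it fits
/// within `max_width` (including `indent`), otherwise one property per
/// line with trailing commas.
fn format_object(props: &[(&str, &str)], indent: usize, max_width: usize) -> String {
    let pad = " ".repeat(indent);
    if props.is_empty() {
        return format!("{}{{}}", pad);
    }
    let pairs: Vec<String> = props
        .iter()
        .map(|(key, value)| format!("{}: {}", key, value))
        .collect();
    // Try the compact form first.
    let one_line = format!("{}{{ {} }}", pad, pairs.join(", "));
    if one_line.chars().count() <= max_width {
        return one_line;
    }
    // Fall back to the expanded form.
    let mut out = format!("{}{{\n", pad);
    for pair in &pairs {
        out.push_str(&format!("{}    {},\n", pad, pair));
    }
    out.push_str(&format!("{}}}", pad));
    out
}
```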
