Comments (9)

Glavin001 commented on May 30, 2024

See #8 (comment)

from unibeautify.

prettydiff commented on May 30, 2024

There are some concerns with a universal parsing framework:

  • It will need a standard API. This will be necessary so that internal decisions are made independently of external application preferences. Essentially the parser will have to become an application in its own right, apart from beautification or formatting. I would probably do well to start with the current parsers in Pretty Diff, rip them out, and make them independent code repositories under the Unibeautify GitHub organization.
    • This decision will fundamentally change Pretty Diff as options management has slowly become the foundation of its architecture. If the parsers are to become independent applications with a separate API then Pretty Diff will have to change to accept this. This has gotten me thinking about a new architecture for Pretty Diff: prettydiff/prettydiff#462
    • Once I get all this accomplished I will do the research to add PHP support to the parsers. This won't be any kind of wild stretch since PHP uses C-based syntax just like JavaScript and must recursively allow intermixing of markup and PHP tags in a manner similar to JSX.
    • Standards demand discussion. I will still start with what I have and make it available while simultaneously working on Pretty Diff 3. It will demand input and contribution to get this right; otherwise I will be a dictator and do whatever I want just to ensure we have code that ships.
  • From what little I know of Prettier it does not have a parser of its own. It is reliant upon Flow and/or Babel, which are both Facebook projects. From what I have seen of how James Kyle describes parse trees, his idea of parsing structures seems similar to what I have in place for the Pretty Diff parsers, so it might not take much effort to also get Prettier on board with the concept of a universal parser, especially since everything here is about beautification.
  • I don't have any timeline for this work. It seems ambitious to me. Unfortunately, I am not eligible for military deployments until next year; otherwise I would have a large block of dedicated time to focus on this.

prettydiff commented on May 30, 2024

This is just an update to keep everybody informed of the current status.

Current state

  • The parse framework isn't ready for use in production but is stable enough for experimentation. Everything indicating the Pretty Diff brand has been removed, with one exception.

  • I have not created a test suite, and the code is likely incredibly buggy considering how much refactoring has gone into it. I recently started a new job and cannot push to GitHub from work, but once I get home and push the latest update this will be ready to play with.

  • I have created a browser interface and a node interface to make experimenting with the tool fast and easy. The output is color coded based upon the lexer that processed a given token, which is handy when looking at deeply nested JSX.

  • The one exception about the Pretty Diff brand is that the markup lexer had a feature to allow users to arbitrarily skip beautification on designated tags by placing the attribute data-prettydiff-ignore="false" (the value is irrelevant). For backwards compatibility I have left this in the code but changed the supported feature to look for the attribute data-parse-ignore. The latter is the only one that will show up in documentation.
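
As a rough illustration of the backwards-compatible check described in the last bullet, the sketch below tests a raw tag token for either attribute name. The function name hasIgnoreAttribute and the regular expression are assumptions made for this example and are not taken from the parse framework's source. It is deliberately naive: an attribute value that happens to contain the same text would also match.

```javascript
// Hypothetical helper, not part of the parse framework's API.
// Returns true when a tag token carries either the documented attribute
// (data-parse-ignore) or the legacy one (data-prettydiff-ignore).
// The attribute value is never inspected because it is irrelevant.
function hasIgnoreAttribute(tagToken) {
    // The lookahead ensures the attribute name ends here, so a name such as
    // data-parse-ignored does not match.
    return /\sdata-(?:parse|prettydiff)-ignore(?=[\s=>\/]|$)/.test(tagToken);
}

console.log(hasIgnoreAttribute('<div data-parse-ignore="false">'));
console.log(hasIgnoreAttribute('<div data-prettydiff-ignore>'));
console.log(hasIgnoreAttribute('<div class="a">'));
```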

What is going on under the hood

I never realized just how much cheating I did to make this work in Pretty Diff for the purpose of beautification instead of as an abstract parser. For instance, when I had to move from one lexer to another I would format the given block of code completely independently and return a fully beautified string from the latter lexer to the former. That won't work in a parsing framework isolated from all beautifiers, so I had to completely revise how I structured things and make the parser more framework-ish. That wasn't the only such mega structural fail I encountered either.

So this parser is structured differently (more uniformly) than what is used in Pretty Diff. As I revise and restructure things I am also working on stylistic changes to make the code easier to read and understand. I have gotten quite far with the markup logic, but I still need to analyze the script and style code to make it more understandable. I would like this to be more of a community project, but for that the code must be a bit less intimidating.

Please feel free to nitpick anything. I will be writing some contribution guidance soon once I get a bit further to better explain how this application behaves more like a framework.

Known megafails

  • I have retained object sorting and tag sorting functionality as the parser seems like the ideal location for rearranging syntax tokens. I suspect this sorting is not properly updating the begin and stack fields of the data as it is moved around, and there are likely other bugs.
  • Once I feel confident that the two sort operations are doing their jobs properly I will build out a test suite from the prior Pretty Diff unit tests. I suspect I will find lots of defects at this stage that may introduce more refactoring.

prettydiff commented on May 30, 2024

Update - https://github.com/Unibeautify/parse-framework/tree/tests

Good stuff

  • Framework is complete and documented
  • In refactoring the code I found what made the JavaScript parser slow. It is now tied with the latest version of Esprima for speed, which makes it tied for the fastest JavaScript parser written in JavaScript (on my Windows box).

Bad stuff

  • Still not ready. I am still occasionally finding things to fix.
  • I am putting together unit tests right now... this is excruciatingly painful. In Pretty Diff the unit tests output beautified code, which is easy to scan and read, but the parse framework outputs a parse table with additional data. This is taking too long, so I need to find a way to speed it up or provide some additional automation to validate that the parse tables generated from code samples are accurate.
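
One way the automation mentioned above could look is a comparison that reports only the first cell where a generated parse table differs from a stored expected table, so a reviewer does not have to scan the whole table. This is only a sketch under assumptions: the function name firstMismatch and the sample values are hypothetical, and the tables are assumed to be objects of parallel arrays holding primitive values.

```javascript
// Hypothetical helper, not part of the parse framework's API.
// Compares two column-oriented parse tables and returns the first
// difference found, or null when every expected cell matches.
function firstMismatch(actual, expected) {
    const columns = Object.keys(expected);
    // First check that every expected column exists with the same length.
    for (const column of columns) {
        if (!Array.isArray(actual[column])) {
            return {column: column, index: -1, reason: "missing column"};
        }
        if (actual[column].length !== expected[column].length) {
            return {column: column, index: -1, reason: "length differs"};
        }
    }
    // Then walk the table row by row so the earliest token is reported first.
    const length = columns.length === 0 ? 0 : expected[columns[0]].length;
    for (let index = 0; index < length; index += 1) {
        for (const column of columns) {
            if (actual[column][index] !== expected[column][index]) {
                return {column: column, index: index, reason: "value differs"};
            }
        }
    }
    return null;
}

// Sample values are illustrative only, not real parser output.
const expectedTable = {token: ["<a>", "</a>"], types: ["start", "end"]};
const actualTable = {token: ["<a>", "</a>"], types: ["start", "start"]};
const mismatch = firstMismatch(actualTable, expectedTable);
console.log(mismatch);
```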

Plans (in this order after unit tests are populated)

  1. Provide lexer specific documentation
  2. Convert var to let and const
  3. Convert the code to TypeScript
  4. Bring in a markdown lexer. I wrote a markdown parser for biddle (so half this work is already done)
  5. Add reporting of parse errors and strong error reporting

prettydiff commented on May 30, 2024

The parse-framework is ported to TypeScript. I can put it out in NPM or whatever now. There are still some things I want to do, but now it is ready for consumption and evaluation.

bitwiseman commented on May 30, 2024

@prettydiff
You have my respect for all the work you've done here.

That said, I'm having trouble understanding the framework. The decision to use an object with multiple "columns" of data makes absolutely no sense to me.

var data = {
    begin: [],
    lexer: [],
    lines: [],
    presv: [],
    stack: [],
    token: [],
    types: []
};

This structure means that all "rows" must have values for each of the columns (otherwise the indexing of that column would be off). Could you put a bit more in the readme as to why this is better than using an array of objects?

var data = [];

data.push({begin: 0, lexer: "markup", lines: 0, presv: false, stack: "global", token: "<a>", type: "start"});

The names are also confusing. "lines" is a flag with values from 0 to 3 that indicates whitespace and newline presence. From what I can tell, that means that the standard format does not support round tripping of parsed elements: there's no place to store what whitespace, which newline characters, or how many newline characters were present. Am I understanding that correctly?

Thanks!

prettydiff commented on May 30, 2024

Could you put a bit more in the readme as to why this is better than using an array of objects?

Absolutely! I will include some simple use cases.

there's no place to store what whitespace, which newline characters, or how many newline characters were present.

I probably wasn't very clear about this. I will clarify in the readme. Basically, this:

  • 0 - No white space precedes the current token. In the example of (y > 1), the y would have a lines value of 0.
  • 1 - Some white space precedes the current token, but that white space does not contain a newline. In the example of (y > 1), the > would have a value of 1.
  • 2 - White space containing a single newline precedes the current token, such as a token that is on the next line of code.
  • 3 - White space containing two newlines (an empty line or a line that appears empty and contains only white space) precedes the current token.
  • x - White space containing x - 1 newlines (x - 2 empty lines) precedes the current token. If there are 5 empty lines, or 5 lines containing only white space, before the current token it would receive a lines value of 7.
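
The rules in this list can be condensed into a small function. This is a minimal sketch assuming the value is derived only from the white space string that precedes a token; the name linesValue is hypothetical and is not the framework's own code.

```javascript
// Hypothetical helper illustrating the "lines" rules listed above.
// whitespace: the exact white space string that precedes a token.
function linesValue(whitespace) {
    // 0: nothing precedes the token
    if (whitespace === "") {
        return 0;
    }
    // Count line feeds; a Windows "\r\n" pair counts once.
    const newlines = (whitespace.match(/\n/g) || []).length;
    // 1: white space without a newline; otherwise newline count plus one
    return newlines === 0 ? 1 : newlines + 1;
}

console.log(linesValue(""));              // → 0
console.log(linesValue(" "));             // → 1
console.log(linesValue("\n"));            // → 2
console.log(linesValue("\n   \n"));       // → 3
console.log(linesValue("\n".repeat(6)));  // → 7 (five empty lines)
```

Because only a count is stored, the original white space characters cannot be recovered from the value, which is the round-tripping limitation raised in the question above.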

Also, I completely designed this application from the perspective of Pretty Diff. If there are missing conventions that would benefit JS Beautify at parse time (not strictly beautification related) please let me know and I will add them in as an enhancement.

prettydiff commented on May 30, 2024

Could you put a bit more in the readme as to why this is better than using an array of objects?

I will do you one better and provide a means to optionally achieve output in that format:
Unibeautify/sparser#30
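
For readers comparing the two layouts, the sketch below shows one way a column-oriented parse table could be converted into an array of objects. It assumes only the column names quoted earlier in this thread; the function name toRecords and the sample values are hypothetical and do not describe the option tracked in the linked issue.

```javascript
// Hypothetical helper, not part of the parse framework's API.
// Converts an object of parallel arrays (one array per field)
// into an array of row objects with one property per field.
function toRecords(data) {
    const columns = Object.keys(data);
    // All columns are assumed to share the length of the first one.
    const length = columns.length === 0 ? 0 : data[columns[0]].length;
    const records = [];
    for (let index = 0; index < length; index += 1) {
        const record = {};
        for (const column of columns) {
            record[column] = data[column][index];
        }
        records.push(record);
    }
    return records;
}

// Sample values are illustrative only, not real parser output.
const data = {
    begin: [0, 0],
    lexer: ["markup", "markup"],
    lines: [0, 0],
    presv: [false, false],
    stack: ["global", "a"],
    token: ["<a>", "</a>"],
    types: ["start", "end"]
};
const records = toRecords(data);
console.log(records[0].token, records[1].types);
```

The conversion loses nothing as long as every column has the same length, which is the invariant the column layout already depends on.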

Glavin001 commented on May 30, 2024

Closing this. Discussion can continue at https://github.com/Unibeautify/parse-framework
