Embedded Languages about unibeautify HOT 7 OPEN

unibeautify commented on May 30, 2024

Embedded Languages

from unibeautify.

Comments (7)

prettydiff commented on May 30, 2024

I can start working on an enhancement for the parser to include a PHP library. I have expanded the documentation around the parser and given it a website at https://sparser.io

I still need to do a lot of work on the documentation, but the organization and framework are already in place to accomplish all the stated objectives. I just need to write rules specific to PHP. Once I do, you can have any combination of HTML with PHP embedded, or the opposite.

Is it not possible (or recommended) to simply allow, for example, an HTML/JS beautifier to be specified for PHP?

By writing PHP-specific rules, everything in the document can be beautified regardless of which language is where. PHP is a big language. I am getting faster at this, but it may still take me a while (and a lot of patience and testing) to get it right.

from unibeautify.

lllopo commented on May 30, 2024

I've actually got it working. It's a sick way, but it works. What I did is:

Configure Unibeautify to use PHP CS Fixer to format PHP files
Install a separate instance of Prettier with its PHP plugin
Configure PHP CS Fixer to use Prettier as a fixer

It works quite OK. The drawbacks: first, the configuration is done through PHP CS Fixer's and Prettier's own config files, not through Unibeautify; second, it is a bit slow, because Prettier formats the PHP file by creating a secondary temp file that is then copied back over the original one. Yet it works :-)

from unibeautify.

lassik commented on May 30, 2024

Sketch of a framework for this job

To do this properly, I believe there should be exactly one "top language" to beautify every time. The top language could be given callbacks for other languages that it can delegate parts of the code to, at its own discretion.

(Reasoning: When we start parsing a file, one language must have the authority to say what any given bit of code means - there cannot be a panel of languages each giving its own interpretation of the same code. To reliably descend to other languages, the only way to do that is to ask the authoritative language when to do it. And the straightforward way to do that is to provide some kind of callback to the parser of that language.)

For example: The top language could be PHP, with delegates for HTML, CSS and JavaScript. The PHP beautifier would round up the non-PHP parts of the code, cut them out and send them to the HTML beautifier, then paste back the results. When the HTML beautifier is invoked in this way, it would in turn cut out the CSS and JavaScript parts of the HTML code it was given, and hand those off to the CSS and JavaScript beautifiers in the same way. So the process could be recursive to arbitrarily many levels. (In practice I suppose 3-4 levels, e.g. PHP -> HTML -> JavaScript, might be the maximum; beyond that it gets unwieldy for humans to manage.) Also, the beautification framework would in principle support recursing into the top language again as a delegate (I don't know if this is useful, but it falls out naturally as a property of the framework design).

I think the core data structure we'd need is simply like this:

{
    "topLanguage": "PHP",
    "languages": {
        "PHP": {
            "beautifiers": [
                {
                    "name": "PHP_CodeSniffer",
                    "options": {},
                    "callback": function (options, inputCode) -> outputCode
                }
            ]
        },
        "HTML": {
            "beautifiers": [
                {
                    "name": "HTMLTidy",
                    "options": {},
                    "callback": function (options, inputCode) -> outputCode
                }
            ]
        },
        "JavaScript": {
            "beautifiers": [
                {
                    "name": "Prettier",
                    "options": {},
                    "callback": function (options, inputCode) -> outputCode
                }
            ]
        }
    }
}
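To make the recursion concrete, here is a minimal JavaScript sketch of how a framework could drive such a structure. Everything in it is an assumption for illustration: the extra `delegate` argument passed to each callback, the `beautify` function, the depth limit, and the `toy-*` beautifiers, which use regexes and `trim()` as stand-ins for real parsers and are not the PHP_CodeSniffer, HTMLTidy or Prettier APIs. Note that the syntax knowledge lives only in the per-language callbacks; the framework function itself knows nothing about PHP or HTML.

```javascript
// Hypothetical config mirroring the structure above. The callbacks are toy
// stand-ins (regexes and trim), not real beautifier integrations.
const config = {
  topLanguage: "PHP",
  languages: {
    PHP: {
      beautifiers: [{
        name: "toy-php",
        options: {},
        // Everything outside <?php ... ?> is handed to the HTML delegate.
        callback: (options, inputCode, delegate) =>
          inputCode
            .split(/(<\?php[\s\S]*?\?>)/)
            .map((part) => (part.startsWith("<?php") ? part : delegate("HTML", part)))
            .join(""),
      }],
    },
    HTML: {
      beautifiers: [{
        name: "toy-html",
        options: {},
        // The body of each <script> element goes to the JavaScript delegate.
        callback: (options, inputCode, delegate) =>
          inputCode.replace(
            /(<script[^>]*>)([\s\S]*?)(<\/script>)/g,
            (match, open, body, close) => open + delegate("JavaScript", body) + close
          ),
      }],
    },
    JavaScript: {
      beautifiers: [{
        name: "toy-js",
        options: {},
        callback: (options, inputCode) => inputCode.trim(),
      }],
    },
  },
};

// Framework side: knows no syntax, only how to look up and chain callbacks.
function beautify(language, inputCode, depth = 0) {
  if (depth > 8) throw new Error("delegation nested too deeply");
  const entry = config.languages[language];
  if (!entry) return inputCode; // no beautifier configured: leave verbatim
  const delegate = (lang, snippet) => beautify(lang, snippet, depth + 1);
  return entry.beautifiers.reduce(
    (code, b) => b.callback(b.options, code, delegate),
    inputCode
  );
}

const result = beautify(config.topLanguage, "<?php echo 1; ?><script>  f();  </script>");
console.log(result); // <?php echo 1; ?><script>f();</script>
```

The PHP callback decides where HTML begins, and the HTML callback decides where JavaScript begins, which is the "one authoritative language at each level" idea from the reasoning above.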

To make the Unibeautify design sustainable for years and hopefully a couple of decades to come, I would strongly favor an approach like this instead of using regexp hacks or other means of hard-coding some of the syntax of particular languages such as PHP into the Unibeautify framework itself. This will be slower and harder to do initially, but the result will be more reliable as well as easier to understand and extend.

The catch here is that while the above callback mechanism is fairly obvious for beautifiers implemented in JavaScript, we will also have lots of external beautifiers called via the Unix/Windows process interface. How do we pass delegate beautifiers to them? I think the natural approach would be to pass the command line of each delegate beautifier. The command line would contain all the beautifier options, and the top beautifier would set up a pipe so that the delegate beautifier can read code from stdin and write to stdout. The delegate beautifier would write error and warning messages to stderr, and the top beautifier would have to merge those with its own error messages somehow. The method for passing the delegate command lines would necessarily have to be beautifier-specific, and we'd probably have to talk to the people making those beautifiers to agree on something, since I wouldn't expect any of them to have a mechanism like this yet. A natural approach would be to read JSON or similar settings from an agreed-upon file descriptor (such as fd 3; we would have to verify that this also works on Windows).
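The stdin/stdout part of this can be sketched in Node.js with `child_process.spawnSync`. This is only a rough illustration under assumptions: `runDelegate` and its return shape are hypothetical names, the "beautifier" in the demo is just Node itself trimming its input, and the fd 3 settings channel is not shown because no beautifier is known to support it.

```javascript
const { spawnSync } = require("node:child_process");

// Run one external delegate beautifier: code goes in on stdin, the result
// comes back on stdout, and diagnostics are collected from stderr.
// `command` and `args` stand for the delegate's full command line.
function runDelegate(command, args, inputCode) {
  const result = spawnSync(command, args, { input: inputCode, encoding: "utf8" });
  if (result.error) throw result.error; // e.g. the executable was not found
  if (result.status !== 0) {
    throw new Error(`delegate exited with status ${result.status}: ${result.stderr}`);
  }
  return { outputCode: result.stdout, diagnostics: result.stderr };
}

// Stand-in "beautifier" for demonstration only: Node itself, trimming stdin
// and emitting one warning on stderr.
const demoArgs = ["-e", `
  let data = "";
  process.stdin.setEncoding("utf8");
  process.stdin.on("data", (chunk) => (data += chunk));
  process.stdin.on("end", () => {
    process.stderr.write("warning: demo delegate\\n");
    process.stdout.write(data.trim());
  });
`];

const demo = runDelegate(process.execPath, demoArgs, "  let x = 1;  \n");
console.log(demo.outputCode);  // let x = 1;
console.log(demo.diagnostics); // warning: demo delegate
```

Keeping stderr separate from stdout, as here, at least leaves the caller free to decide how to merge delegate messages with its own.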

(EDIT: Added callbacks to data structure example.)

from unibeautify.

lassik commented on May 30, 2024

A small addition to the above: Should we run the same beautifier multiple times for the same code? Consider e.g. the following beautifier chains:

HTML:       input -> HTMLTidy -> SomeOtherHTMLBeautifier -> output
JavaScript: input -> Prettier -> JS-Beautify -> output

If the HTML contains embedded JavaScript, and the HTML is run through two beautifiers, then a naive implementation would run the JavaScript beautifier chain twice for each snippet of JavaScript. Both runs of the chain should produce identical results, so we should run the chain just once.

This problem can be remedied e.g. by keeping track of the set of all beautifiers that have already been run on any given bit of code. Before running a beautifier, we check whether it's already in the set. If it is, then we just return the code verbatim instead of beautifying it again. But are there legitimate reasons to run the same beautifier twice in the same chain? Are there non-pathological use cases for that?

And do subsequent runs of the same chain really always produce the same results? If e.g. the second HTML beautifier changes the indentation of the HTML, then that can also change the indentation of embedded JavaScript (or line breaking, in case the JavaScript has long lines of code). Maybe the naive implementation that ignores this problem is best anyway.
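For what it's worth, the bookkeeping idea could look roughly like the JavaScript sketch below. It rests on assumptions that are not settled anywhere in this thread: that each embedded snippet can be given a stable identifier (here a made-up string like `"script#0"`) that survives the outer beautifier rewriting the surrounding code, and that skipping a repeated beautifier is actually what we want. The two callbacks are stand-ins, not the real Prettier or JS-Beautify.

```javascript
// Map from a (hypothetical) snippet id to the names of beautifiers already run on it.
const applied = new Map();

function runChainOnce(snippetId, chain, code) {
  const seen = applied.get(snippetId) ?? new Set();
  applied.set(snippetId, seen);
  let current = code;
  for (const { name, callback } of chain) {
    if (seen.has(name)) continue; // already run on this snippet: pass code through verbatim
    seen.add(name);
    current = callback(current);
  }
  return current;
}

// Stand-in chain that counts how often a beautifier is actually invoked.
let calls = 0;
const jsChain = [
  { name: "Prettier", callback: (c) => { calls += 1; return c.trim(); } },
  { name: "JS-Beautify", callback: (c) => { calls += 1; return c + "\n"; } },
];

// The first HTML beautifier pass triggers the JavaScript chain...
const first = runChainOnce("script#0", jsChain, "  f();  ");
// ...and the second HTML beautifier pass hits the guard and changes nothing.
const second = runChainOnce("script#0", jsChain, first);
console.log(calls, first === second); // 2 true
```

This also makes the open question visible: if the second HTML pass re-indents the snippet, the guard returns it without re-running the JavaScript chain, so the result may no longer match what that chain would have produced.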

from unibeautify.

lassik commented on May 30, 2024

Fortuitously, the above data structure is pretty much exactly what we already have. topLanguage is just --language from the CLI, and languages is almost verbatim from .unibeautifyrc, just with the callbacks added (you'd probably use objects instead of callbacks, but the principle is the same). I didn't check the code of Unibeautify core, but you probably already have the requisite data structures and callbacks/objects more or less ready 😄

from unibeautify.

stevenzeck commented on May 30, 2024

This would be a great opportunity to use @prettydiff's parse-framework. Playground: http://prettydiff.com/parse-framework/runtimes/browsertest.xhtml. I'm seeing some issues with PHP, but take an HTML file like this and see:

<!DOCTYPE HTML>
<html>
<head>
    <style>
        p {
            font-size: 12px;
        }
    </style>
    <script type="application/javascript">
        document.getElementById("hey");
    </script>
</head>
<body>
<p id="hey">Hey!</p>
</body>
</html>

In this case, the markup would be handled by the HTML beautifier, the contents of style by the CSS beautifier, and the contents of script by the JavaScript beautifier.

from unibeautify.

spider-mane commented on May 30, 2024

Is it not possible (or recommended) to simply allow, for example, an HTML/JS beautifier to be specified for PHP? I get the impression that the PHP beautifiers won't operate on anything outside of PHP tags, and running Pretty Diff on a PHP file with integrated HTML doesn't seem to result in any changes to the PHP itself. I haven't tested anything extensively, but it seems like it could at least be safe to allow Pretty Diff to be specified as a beautifier for PHP. The only downsides I've seen are that, at least by default, there is no indentation for template-like PHP tags such as <?php foreach () : ?><?php endforeach; ?>, nor is any embedded HTML indented within its containing PHP block. These are minor inconveniences, but nothing that will break the code, and they are preferable to the otherwise complete lack of options for this in editors such as VS Code.

from unibeautify.
