ethan-leba / tree-edit

🌲 Structural editing in Emacs for any™ language!

License: GNU General Public License v3.0

Emacs Lisp 99.91% Makefile 0.09%

emacs tree-sitter evil-mode

tree-edit's Introduction

I’m looking for work! Check out my LinkedIn or email me at [email protected].

⚠ Tree-edit is very much a work-in-progress. Expect to run into bugs and breaking changes!

Every programming language has a formally defined structure, but most text editors are completely ignorant of it. As a result, editing can often devolve into a tedious exercise in character manipulation.

Tree-edit provides language-agnostic editing operations that map directly to the structure of the language, abstracting away the process of manually entering syntax. Leveraging the tree-sitter parser, tree-edit always has access to the precise state of the syntax tree, and it directly wields the grammars of the languages under edit to power its editing capabilities.

Overview

The repository contains two co-existing packages (that will eventually be split).

tree-edit: The core library for structural editing. This library is intended to be used by other elispers who would like to implement their own structural editing or refactoring commands.
evil-tree-edit: An evil state for structural editing with preconfigured bindings and visualization, as seen in the GIF.

To get an overview of tree-edit’s capabilities, check out the EmacsConf talk!

How does it work?

Tree-edit relies heavily on the tree-sitter parser. It uses the JSON intermediate representation that tree-sitter outputs to get a full understanding of what is valid for a given language, with no language-specific effort on tree-edit’s part.
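As a rough illustration of the kind of information that JSON representation carries, the sketch below reads a tiny hand-written grammar in the shape of tree-sitter's grammar.json (rule types such as SEQ, CHOICE, REPEAT, SYMBOL and STRING) and lists which symbols each rule may contain. The toy grammar and the helper name child_symbols are made up for this example. It is written in Python rather than Emacs Lisp and is not tree-edit's actual code.

```python
import json

# A tiny, hand-written grammar in the shape of tree-sitter's grammar.json.
# The rules are illustrative only and are not taken from a real grammar.
GRAMMAR_JSON = """
{
  "name": "toy",
  "rules": {
    "block": {
      "type": "SEQ",
      "members": [
        {"type": "STRING", "value": "{"},
        {"type": "REPEAT", "content": {"type": "SYMBOL", "name": "statement"}},
        {"type": "STRING", "value": "}"}
      ]
    },
    "statement": {
      "type": "CHOICE",
      "members": [
        {"type": "SYMBOL", "name": "break_statement"},
        {"type": "SYMBOL", "name": "block"}
      ]
    },
    "break_statement": {"type": "STRING", "value": "break;"}
  }
}
"""


def child_symbols(rule):
    """Collect the names of all SYMBOL nodes reachable inside one rule."""
    kind = rule["type"]
    if kind == "SYMBOL":
        return {rule["name"]}
    if kind in ("SEQ", "CHOICE"):
        found = set()
        for member in rule["members"]:
            found |= child_symbols(member)
        return found
    if kind == "REPEAT":
        return child_symbols(rule["content"])
    return set()  # STRING and other leaf rules reference no symbols


grammar = json.loads(GRAMMAR_JSON)
for name, rule in grammar["rules"].items():
    print(name, sorted(child_symbols(rule)))
```

Because the grammar is plain data, the same traversal works for any language whose grammar is available in this form.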

To learn more about how tree-edit works under the hood, see this high-level overview or check out this org doc with executable code examples demonstrating how the syntax generation works.

Supported languages

Status	Language
✅	Python (issue)
🔨	C (issue)
🔨	Java (issue)

See links for grammar repository and issue tracker respectively.

✅	Supported
🔨	Under development

Tree-edit is designed to be as language-agnostic as possible. Currently the list of supported languages is not very impressive, but in theory it should be as simple as running a script to preprocess a grammar and adding a configuration file for the language. In practice the grammars usually also need modifications in order to make the grammar ergonomic for structural modification.

See here to learn the process for adding a new language.

Custom grammars

Tree-edit uses forked versions of tree-sitter grammars to power its editing. They are intended to work as drop-in replacements for the standard grammars, but with tweaks to work better with tree-edit. See below for how to install the forked grammars.

The tree-sitter API and grammars were not designed with the structural editing use case in mind, so most grammars are structured in a way that makes navigation and editing in tree-edit awkward or impossible without complex and fragile workarounds. For more context, see this GH issue: tree-sitter/tree-sitter#1558

I hope that in the future more thought will be given to this use case in the tree-sitter API and grammar design so that the forks eventually become unnecessary, but for now they’re needed.

Installing custom grammars

The function tree-edit-install-grammars-wizard can be used interactively to install grammars.

Contributing

Contributions are very much welcome! In particular, adding language files would be a great place to help. Otherwise, the issues are a good place to propose features or find ones to implement.

In addition, reporting bugs and providing feedback on the overall design and UX of the package is much appreciated! Providing a good UX for structural editing is crucial and will become increasingly important to this package as more of the fundamental shortcomings get ironed out.

The project is fairly complex and the documentation is still in progress, so feel free to open a discussion if you’re interested in helping out but you’re not sure where to start!

Running tests

The tests can be run with make test, while cached grammars can be cleaned out with make clean.

Related projects

symex: Structural navigation and editing with backends for lisp and tree-sitter
combobulate: Structural navigation and limited structural editing
grammatical-edit: Smartparens-like using tree-sitter (?)
evil-textobj-tree-sitter: Evil mode text objects using tree-sitter queries.
lispy: Lisp structural editing package – big inspiration for tree-edit!
smartparens: Multilingual package with structural editing limited to matching delimiters.

tree-edit's Issues

Display the currently selected node's type when it has a parent or child with the same range

This issue is particularly bad in Python, where a block, the expression statement it contains, and the call inside that statement can all span the same range, making them indistinguishable with the region overlay. Tree-edit should detect this case and display the current node type.
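A minimal sketch of the detection step, assuming a simplified node with a type, a start and end offset, a parent and children. The Node class and the function name shares_range_with_relative are hypothetical stand-ins written in Python, not the tree-sitter or tree-edit API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a syntax node (not tree-sitter's API)."""
    type: str
    start: int
    end: int
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)

    def __post_init__(self):
        # Wire up parent links so a node can look upwards as well.
        for child in self.children:
            child.parent = self


def shares_range_with_relative(node: Node) -> bool:
    """True if the parent or any child covers exactly the same range."""
    def same(other: Node) -> bool:
        return (other.start, other.end) == (node.start, node.end)

    if node.parent is not None and same(node.parent):
        return True
    return any(same(child) for child in node.children)


# Rough shape of `foo()` as a Python statement inside a block.
call = Node("call", 0, 5, [Node("identifier", 0, 3), Node("argument_list", 3, 5)])
stmt = Node("expression_statement", 0, 5, [call])
block = Node("block", 0, 5, [stmt])

for node in (block, stmt, call, call.children[0]):
    label = node.type if shares_range_with_relative(node) else "(overlay is enough)"
    print(node.type, "->", label)
```

When the check returns true, the overlay alone is ambiguous and the node type is worth showing.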

Tree-edit node usability

Node keybindings should be named
#29
Nodes should allow grammar overrides

`tree-view-mode` hardcoded to work only with `evil-tree-edit`

Hi! I recently tried to run the code, and got prompted by this message:
tree-edit-view--setup: Sorry, I hardcoded this mode to only work with evil-tree-edit. Please remind me if I forgot!

I understand that this is a work in progress and I do not expect this to be fixed soon (or at all, as you owe nothing to us), but I wanted to ask if the code may be ready for use without evil-tree-edit and you just forgot about the prompt!

In any case, I thought that it may be useful to keep this github issue open until that changes.

Thank you!

Highlight any `ERROR` nodes in the buffer

It should be obvious when ERROR nodes are present in a buffer so users aren't surprised that editing operations don't work.

support golang

hey! great presentation at emacsconf - thanks for pouring time into this project :)
i'd love to try it for my daily work, but i use mainly golang. i'd like to add support but it doesn't seem as straight-forward. would you like me to collect questions here or do you prefer another channel?

Explicit error when an `ERROR` node is present during editing operations

(Elisp)-tree-sitter represents sections of the code it cannot parse as ERROR symbol tokens. Currently the parser will fail as if a transformation were invalid whenever any ERROR tokens are present. Instead, we should check whether the tokens in an operation contain ERROR and, if so, alert the user that their code is malformed.
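The proposed check could look roughly like the sketch below. It assumes a tree represented as nested (type, children) tuples. The names find_error_nodes, check_before_edit and MalformedCodeError are invented for this example, and the code is Python rather than the package's Emacs Lisp.

```python
class MalformedCodeError(Exception):
    """Hypothetical error raised when the buffer contains ERROR nodes."""


def find_error_nodes(node):
    """Return every node of type "ERROR" in a nested (type, children) tree."""
    node_type, children = node
    found = [node] if node_type == "ERROR" else []
    for child in children:
        found.extend(find_error_nodes(child))
    return found


def check_before_edit(node):
    """Fail with an explicit message instead of reporting an invalid edit."""
    errors = find_error_nodes(node)
    if errors:
        raise MalformedCodeError(
            f"{len(errors)} ERROR node(s) present; fix the syntax before editing"
        )


healthy = ("module", [("expression_statement", [("call", [])])])
broken = ("module", [("expression_statement", []), ("ERROR", [("identifier", [])])])

check_before_edit(healthy)  # passes silently
try:
    check_before_edit(broken)
except MalformedCodeError as err:
    print(err)
```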

Discussion on providing a non-evil editing mode for tree-edit

New 'layered' package model I'm considering:

tree-edit-core: Core structural editing logic, what tree-edit.el is currently. To be used by refactoring packages or those who aren't sold on modal editing.
tree-edit: Movement layer on top of tree-edit-core, wrapped interactive editing commands which operate with a buffer-global current node, and other functions which are not strictly necessary for usage as a library. This would be extracted from evil-tree-edit. To be used by those who want an opinionated set of editing commands and are bought into modalism, but not the evil ecosystem. Same dependencies as tree-edit-core
evil-tree-edit: Defines evil state, specific keybindings, etc., with dependencies on evil and avy. For the evil among us... >:)

Perform (some) tests via AST equality instead of buffer text

Unless we're testing formatting, we shouldn't be performing assertions about the text of the specific buffers, but their ASTs.
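To illustrate the idea of comparing trees rather than text, here is a small sketch using Python's built-in ast module as a stand-in. tree-edit's tests would compare tree-sitter syntax trees instead, and the helper name same_ast is made up for this example.

```python
import ast


def same_ast(source_a: str, source_b: str) -> bool:
    """Compare two sources by structure, ignoring whitespace and layout."""
    # ast.dump omits line and column attributes by default, so only the
    # shape and contents of the tree are compared.
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))


print(same_ast("foo( 1,2 )", "foo(1, 2)"))  # formatting differs only
print(same_ast("foo(1, 2)", "foo(2, 1)"))   # structure differs
```

An assertion written this way keeps passing when whitespace rules change, which a text comparison would not.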

Python support

Due to Python's whitespace-based parsing, there are often nodes with identical boundaries. For example, {foo;} in Java would simply be foo in Python, but the structure still implicitly exists. The position of the node in the hierarchy affects which editing operations are valid, so that's something to be keenly aware of when working in Python.

TODO:

Complete tree-edit-syntax-snippets for all nodes (d151e7f)
Add keybindings in tree-edit-nodes for all nodes (d151e7f)
Disallow empty blocks (ab07ec5)
Fix tree-edit--parse-fragment parsing expressions as expression statements (deaac61)

Add tree-edit error type

The codebase uses user-error, but we should probably make a tree-edit-error in case anyone wants to catch it.

Jumping to nodes with the same start range

Currently attempting to avy jump to a call in Python in the following scenario will only yield one choice, due to the node layout:

[[foo()].baz()]

This is because the avy jump is based on the start point of node range. How can we address this?

Switch to the end point of the node.

This could work, but has two issues:

When do we decide to use the end point? When the number of unique start points is smaller than the number of unique end points?
Behavior needs to not be surprising to users. Could be mitigated with a (message "Start points are not unique, ...")
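The heuristic raised in the first question can be sketched as follows. This is only one possible rule, written in Python with a made-up function name (pick_anchor). The offsets are the two call nodes of foo().baz() as (start, end) pairs.

```python
def pick_anchor(ranges):
    """Use end points only when they distinguish more candidates than start points."""
    starts = {start for start, _ in ranges}
    ends = {end for _, end in ranges}
    return "end" if len(ends) > len(starts) else "start"


# Both calls in `foo().baz()` start at offset 0 but end at different offsets.
calls = [(0, 11), (0, 5)]
anchor = pick_anchor(calls)
if anchor == "end":
    print("Start points are not unique, jumping by end point instead")
```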

Define test-specific language files

Currently the tests use the 'production' language configurations, which means the tests will need to be adjusted if we change the whitespace formatting, etc. Instead we should have test language configurations.

Query-specific bindings

Query bindings should be able to jump to abstract nodes (e.g. any binary expression). I propose abstract bindings should exist on the double keys, so bb would select any binary expression.

Create tree-edit-langs repo

This repo would mirror tree-sitter-langs, parsing the grammar JSON files for usage. Not sure whether language customization would better fit in this repo, or there.

Hook into `tree-sitter-debug-mode`

It would be cool and useful to integrate with tree-sitter-debug-mode to see exactly where the currently selected node exists in the AST.

Evil & keybinding decoupling

The tree-edit API isn't specific to evil, so it would make sense to split that out. There is a global current node, though -- do we need to drop that as well? There also should be a way to disable the default keybinding if folks would like to set up their own.

Parsing fragments that are only valid in specific contexts

There are many node types that are only valid in a specific context. For example, a dictionary pair (key : value) is only valid within a dictionary node. Sometimes the error recovery can handle these cases, but in other cases it can't.

Possible solutions

Use the outer node when parsing

We can try to somehow use the syntax snippets in order to create the proper context, i.e. put the pair in a {} if the outer node is a dictionary.
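A small sketch of that idea, using Python's own ast module in place of tree-sitter. The CONTEXT_WRAPPERS table and the function names are hypothetical; in tree-edit the wrapper text would presumably come from the syntax snippets.

```python
import ast

# Hypothetical table: text to put around a fragment, keyed by outer node type.
CONTEXT_WRAPPERS = {"dictionary": ("{", "}")}


def parses(source: str) -> bool:
    """True if the source parses as a single Python expression."""
    try:
        ast.parse(source, mode="eval")
        return True
    except SyntaxError:
        return False


def parses_in_context(fragment: str, outer_type: str) -> bool:
    """Wrap the fragment according to its outer node before parsing it."""
    prefix, suffix = CONTEXT_WRAPPERS.get(outer_type, ("", ""))
    return parses(prefix + fragment + suffix)


pair = "'key': 1"
print(parses(pair))                           # a bare pair is not an expression
print(parses_in_context(pair, "dictionary"))  # inside {} it parses
```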

Custom parser

We can define a parser where all node types are valid at the top level. This would require a custom parser, which is not viable currently, but may be down the line.

Temporary solution: store the type of a fragment on copy

This would only work for nodes copied with a tree-edit function, but would be an easier temporary fix. It would be a good optimization as well.

Automated use of precedence nodes

Tree-edit should be smart about the precedence of nodes. For example, in a C-like language, inserting a + into x * [y] should output x * (y + z), because without the parentheses x * y + z would parse as (x * y) + z and change the structure around y. Likewise, inserting a * at [x + y] should produce (x + y) * z.
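One way to picture the desired behaviour is a renderer that adds parentheses whenever a child binds less tightly than its parent. The sketch below is Python with an assumed two-operator precedence table. It ignores associativity, which is safe for + and * but would need extra care for operators like -.

```python
# Assumed precedence table for the sketch: higher binds tighter.
PRECEDENCE = {"+": 1, "*": 2}


def render(expr):
    """Render a nested (op, left, right) tuple, parenthesizing where needed."""
    if isinstance(expr, str):
        return expr  # a plain identifier
    op, left, right = expr

    def side(child):
        text = render(child)
        looser = not isinstance(child, str) and PRECEDENCE[child[0]] < PRECEDENCE[op]
        return f"({text})" if looser else text

    return f"{side(left)} {op} {side(right)}"


print(render(("*", "x", ("+", "y", "z"))))
print(render(("*", ("+", "x", "y"), "z")))
print(render(("+", ("*", "x", "y"), "z")))
```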

C support

TODO:

Complete tree-edit-syntax-snippets for all nodes
Add keybindings in tree-edit-nodes for all nodes
Address disabled tests in tests/test-c.el

Indicate when an unnamed node is selected?

Performance issues

There's a noticeable delay once we hit ~10 statements in a block:

    private void foo(String args) {
        break;
        break;
        break;
        break;
        break;
        break;
        break;
        break;
        break;
        break;
    }

The easiest solution would likely be to special case specific node types, like REPEAT[CHOICE...] or other trivial grammar nodes.
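A sketch of what such a special case might look like: if a rule has exactly the shape REPEAT[CHOICE[SYMBOL...]], a sequence of children is valid as long as every child is one of the listed alternatives, which can be checked in linear time without the relational parser. The rule layout follows tree-sitter's grammar JSON, but the function names are invented and the code is Python, not tree-edit's implementation.

```python
def is_repeat_of_choice(rule):
    """True for rules shaped exactly like REPEAT[CHOICE[SYMBOL, SYMBOL, ...]]."""
    return (
        rule["type"] == "REPEAT"
        and rule["content"]["type"] == "CHOICE"
        and all(m["type"] == "SYMBOL" for m in rule["content"]["members"])
    )


def fast_valid_sequence(rule, node_types):
    """Linear-time validity check, only for the REPEAT[CHOICE...] shape."""
    if not is_repeat_of_choice(rule):
        raise ValueError("rule is not REPEAT[CHOICE...]; use the general parser")
    allowed = {m["name"] for m in rule["content"]["members"]}
    return all(node_type in allowed for node_type in node_types)


statements = {
    "type": "REPEAT",
    "content": {
        "type": "CHOICE",
        "members": [
            {"type": "SYMBOL", "name": "break_statement"},
            {"type": "SYMBOL", "name": "return_statement"},
        ],
    },
}

print(fast_valid_sequence(statements, ["break_statement"] * 10))
```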

A likely better long-term solution would be to port the performance optimizations in faster-miniKanren to reazon.

Special case block nodes for performance

Blocks give the worst performance with the relational parser because of the number of statements they contain, and they receive no benefit from it.

v1.0 roadmap

Editing features

Copy
Should we be storing the node type on copy, or inferring it on usage? If we store type on copy, where does that live?
Paste
A new text-object/type should be added so that the node in the clipboard can be used to replace, add, wrap, etc. If inferring on usage, each language may need to provide some wrapper string so that the fragment parses correctly, e.g. 3 on its own does not parse as a valid C program.
#23
Copy current node, replace with provided, avy select a node with current node's type, paste and remove from clipboard. It should fail if there are no nodes to replace.
Undo
Just needs to update the overlay afterwards.
Change
If the current node is one that can be determined by a regex, delete the node and drop into insert mode. On exit of insert mode, re-enter tree mode with the same node selected.
#41
http://oremacs.com/lispy/#lispy-move-down
#44
http://oremacs.com/lispy/#lispy-raise-some
Per-mode customization of formatting and indentation of inserted nodes

Language support

My thoughts are that at least 3 languages should be supported before the API is locked in, as it's likely that there's been some bias in the API.

Leaning towards Java, Python, and Bash, as that's what I'm familiar with.

Documentation

Fully flesh out README documentation
Add some SVGs?
#28

Fix movement function names

Optimizing relational parser by producing node-specific relations

Currently the relational parser accepts a nested grammar alist and recursively pattern matches on it in order to run. This is likely creating a moderate performance hit (need to measure this).

Instead it would be ideal if we could create a relation by initially feeding in the grammar alist, and remove the pattern matching afterwards completely; essentially moving the computation to 'compile time'.
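The general technique of moving the pattern matching to 'compile time' can be sketched with closures: each grammar rule is inspected once and turned into a matcher function, so nothing dispatches on rule types while matching. Note that this swaps the relational (reazon) machinery for ordinary Python functions, so it shows the staging idea only. The name compile_rule is made up for this example.

```python
def compile_rule(rule):
    """Turn a grammar rule into a matcher closure once, up front.

    A matcher takes (tokens, position) and returns the set of positions
    at which a match starting there can end.
    """
    kind = rule["type"]
    if kind == "SYMBOL":
        name = rule["name"]
        return lambda toks, i: {i + 1} if i < len(toks) and toks[i] == name else set()
    if kind == "SEQ":
        parts = [compile_rule(m) for m in rule["members"]]

        def match_seq(toks, i):
            positions = {i}
            for part in parts:
                positions = {j for p in positions for j in part(toks, p)}
            return positions

        return match_seq
    if kind == "CHOICE":
        options = [compile_rule(m) for m in rule["members"]]
        return lambda toks, i: {j for option in options for j in option(toks, i)}
    if kind == "REPEAT":
        inner = compile_rule(rule["content"])

        def match_repeat(toks, i):
            seen, frontier = {i}, {i}
            while frontier:  # zero or more repetitions
                frontier = {j for p in frontier for j in inner(toks, p)} - seen
                seen |= frontier
            return seen

        return match_repeat
    raise ValueError(f"unsupported rule type: {kind}")


rule = {
    "type": "SEQ",
    "members": [
        {"type": "SYMBOL", "name": "a"},
        {"type": "REPEAT", "content": {"type": "CHOICE", "members": [
            {"type": "SYMBOL", "name": "b"},
            {"type": "SYMBOL", "name": "c"},
        ]}},
    ],
}

matcher = compile_rule(rule)  # grammar inspected here, once
tokens = ["a", "b", "c", "b"]
print(len(tokens) in matcher(tokens, 0))  # does the whole sequence match?
```

Whether a similar staging is practical for relations in reazon is exactly what would need to be measured.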

Preserve surrounding formatting during editing operations

tree-edit has configurable rules on whitespace formatting (see tree-edit-whitespace-rules), but it currently does not respect any pre-existing formatting that may be in place.

Support non-contiguous edits (for deletions)

Currently the tree-edit syntax generation and rendering relies on the
assumption that all modifications of the syntax tree would operate on a
contiguous region (adjacent siblings).

For example, if we have ("(" identifier "," comment [identifier]) and perform
a deletion, we'd like the result to be ("(" identifier comment), and the
comment should be left unmodified.
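For the example above, the deletion has to remove two tokens that are not adjacent: the selected identifier and the comma before the comment. A minimal Python sketch of one possible heuristic follows. The function name and the idea of skipping 'extra' tokens such as comments are assumptions for illustration, not tree-edit's approach.

```python
def delete_with_separator(tokens, index, separator=",", extras=("comment",)):
    """Delete tokens[index] plus the closest separator before it.

    Tokens listed in `extras` (such as comments) are skipped while looking
    for the separator and are left untouched.
    """
    doomed = {index}
    j = index - 1
    while j >= 0 and tokens[j] in extras:
        j -= 1
    if j >= 0 and tokens[j] == separator:
        doomed.add(j)
    return [tok for k, tok in enumerate(tokens) if k not in doomed]


before = ["(", "identifier", ",", "comment", "identifier"]
print(delete_with_separator(before, 4))
```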

Related to #18

Inlined rules

Need to deal with inlining rules during parsing: https://tree-sitter.github.io/tree-sitter/creating-parsers#hiding-rules

Ensure all tree-edit functions that evil-tree-edit uses are public

Node keybindings should have named prefixes

Example: https://github.com/ethan-leba/tree-edit/blob/main/tree-edit-java.el#L95

This set of bindings should fall under an operator prefix map.

Wrap command

Copy current node, replace with provided, avy select a node with current node's type, paste and remove from clipboard. It should fail if there are no nodes to replace.

Changing node type while preserving contents

Let's say we want to turn this expression String input = null; into input = null; by deleting String. This seems reasonable, but would take us at the syntax-tree level from a local_variable_declaration to an (expression_statement (assignment_expression)). How can we support this?

Inline nodes not treated as supertypes

This is because inline nodes are not included in node-types.json. node-types should be dropped, and if '(,subtype) parses for a given type, then it's a supertype of subtype.

Add raise-some editing operation

http://oremacs.com/lispy/#lispy-raise-some

Validate `tree-edit-syntax-snippets`

Tree-edit should validate at some point (during CI?) that the token sequences
defined for each type in tree-edit-semantic-snippets are actually valid constructions.

Add `evil-tree-edit-clone` function

This should be fairly simple to implement: just take the text of the given node and insert it to its left (or right) via tree-edit-insert-sibling.

Comments completely destroy the parser

!!!

Avy-jump sometimes exits to normal mode

Configurable whitespace formatting

`tree-edit-view-mode` very slow on large buffers

Limit to the nearest sig. block? Or to grandparent?

`evil-tree-edit` operator state?

Using helpful-key on edit operations won't provide a named function, since the
edits are defined as a lambda. This should be fixed. Maybe a tree-operator
state? Open to other (easier) ideas.

Add move up/down editing operation

http://oremacs.com/lispy/#lispy-move-down

Proper error message if `tree-edit-grammar` isn't loaded

This should just be a pre-flight check before one of the parser operations (tree-edit--replacement-p, insertions, deletions) that throws an error if tree-edit-grammar is nil.
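The pre-flight check can be pictured as a wrapper around each parser operation. The sketch below is in Python with hypothetical names (STATE, requires_grammar, GrammarNotLoadedError, replacement_valid). In the package itself this would be an Emacs Lisp check on tree-edit-grammar.

```python
import functools


class GrammarNotLoadedError(Exception):
    """Hypothetical error type for this sketch."""


# Stand-in for the tree-edit-grammar variable; None means "not loaded yet".
STATE = {"grammar": None}


def requires_grammar(operation):
    """Wrap a parser operation so it fails early with a clear message."""
    @functools.wraps(operation)
    def checked(*args, **kwargs):
        if STATE["grammar"] is None:
            raise GrammarNotLoadedError(
                f"{operation.__name__}: no grammar is loaded for this buffer"
            )
        return operation(*args, **kwargs)
    return checked


@requires_grammar
def replacement_valid(node_type, new_type):
    """Toy operation: is new_type listed as an alternative of node_type?"""
    return new_type in STATE["grammar"].get(node_type, ())


try:
    replacement_valid("statement", "break_statement")
except GrammarNotLoadedError as err:
    print(err)
```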

Overlay sometimes doesn't become invisible on leaving tree-state

Add evil-tree-edit-tutor

File explaining the usage of evil-tree-edit.

See http://www2.geog.ucl.ac.uk/~plewis/teaching/unix/vimtutor and https://github.com/abo-abo/lispy/blob/master/lispytutor/lispytutor.el

Java support

Frankly I don't use Java enough to really care about supporting it -- it just seemed like an easy grammar to start with. So help wanted!

TODO:

Complete tree-edit-syntax-snippets for all nodes
Add keybindings in tree-edit-nodes for all nodes
Add sane whitespacing rules
