Comments (6)

bjansen commented on June 3, 2024

Thanks for reporting the problem. Unfortunately, I'm not able to reproduce it; maybe I'm missing something?

I have this sample grammar:

grammar Issue14;

@header {
package org.antlr.intellij.adaptor.issue14;
}

stat: NUMBER*;

NUMBER: [0-9]+;

LINECONTIUNE: '&' '\n' -> channel(HIDDEN) ;

WS: [ \t\r\n]+ -> channel(HIDDEN);

With the following ParserDefinition:

public class Issue14ParserDefinition implements ParserDefinition {

    public Issue14ParserDefinition() {
        PSIElementTypeFactory.defineLanguageIElementTypes(
                Issue14Language.INSTANCE,
                Issue14Lexer.tokenNames,
                Issue14Parser.ruleNames
        );
    }

    @NotNull
    @Override
    public Lexer createLexer(Project project) {
        return new ANTLRLexerAdaptor(Issue14Language.INSTANCE, new Issue14Lexer(null));
    }

    @Override
    public PsiParser createParser(Project project) {
        return new ANTLRParserAdaptor(Issue14Language.INSTANCE, new Issue14Parser(null)) {
            @Override
            protected ParseTree parse(Parser parser, IElementType root) {
                return ((Issue14Parser) parser).stat();
            }
        };
    }
...

The following input:

1
2
&
3

Is parsed as:

FILE
  ANTLRPsiNode(stat)
    PsiElement(NUMBER)('1')
  PsiElement(WS)('\n')
  PsiElement(NUMBER)('2')
  PsiElement(WS)('\n')
  PsiElement(LINECONTIUNE)('&\n')
  PsiElement(NUMBER)('3')

That's your expected behavior, right?

from antlr4-intellij-adaptor.

adesutherland commented on June 3, 2024

Thank you - the only difference is that in my grammar the newline is significant. Should not make a difference, of course.

I will try and reproduce ...

adesutherland commented on June 3, 2024

First, thank you Bastien for replying so swiftly :-)

Secondly, after reviewing your code I decided to look more closely at my ParserDefinition (which I still don't understand!), which is based on the jetbrains-plugin-sample.

So I changed

/** "Tokens of those types are automatically skipped by PsiBuilder." */
@NotNull
public TokenSet getWhitespaceTokens() {
	return WHITESPACE;
}

To include LINECONTINUATION in the returned token set. This seems to fix the problem :-)

What I cannot explain is why this changes ANTLR's behavior (?!). Is it possible that the adaptor somehow routes the tokens through IntelliJ and loses the channel information along the way?

Thirdly, do you mind if I ask two supplementary questions?

  1. Do I have to list all the tokens I want to use for syntax highlighting in the ParserDefinition? Why would I bother if I have the logic in the SyntaxHighlighter anyway? (I fear this is a stupid question!)

  2. Is it possible to syntax highlight based on parser rules? This language (not my fault!) allows keywords to be identifiers. Therefore I have an 'id' parser rule that combines the lexer 'ID' token and a bunch of keywords. I would love to match on the parser 'id' rule, rather than list all the tokens in the SyntaxHighlighter (and of course sometimes a keyword is a keyword, sometimes an identifier ...)

Thanks again

Adrian

bjansen commented on June 3, 2024

OK, now I understand your original problem :)
Tokens in the HIDDEN channel are not supposed to be passed to the generated ANTLR parser, but because IntelliJ did not treat your LINECONTINUATION as whitespace, that token was ultimately sent to the ANTLR parser and probably broke everything.

This behavior is documented in the IntelliJ SDK docs:

An important feature of PsiBuilder is its handling of whitespace and comments. The types of tokens which are treated as whitespace or comments are defined by the methods getWhitespaceTokens() and getCommentTokens() in the ParserDefinition class. PsiBuilder automatically omits whitespace and comment tokens from the stream of tokens it passes to PsiParser, and adjusts the token ranges of AST nodes so that leading and trailing whitespace tokens are not included in the node.

And also in the Javadoc of org.antlr.intellij.adaptor.lexer.PSITokenSource#nextToken:

/** ...
 * So, whitespace and comments (typically hidden channel) will look like
 * real tokens. Jetbrains uses {@link ParserDefinition#getWhitespaceTokens()}
 * and {@link ParserDefinition#getCommentTokens()} to strip these before
 * our ANTLR parser sees them.
 */

With that in mind, we can infer the following rule: every token that is sent to the HIDDEN channel should probably be declared as a comment or whitespace in your ParserDefinition.
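
To make the rule above concrete, here is a minimal toy model in plain Java. It does not use any IntelliJ or ANTLR classes: the `Tok` type, the method names, and the use of a `Set<String>` in place of the `TokenSet` returned by `getWhitespaceTokens()` are all assumptions made for illustration. It only shows that the stripping decision depends on the declared token types and not on the ANTLR channel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: plain Java, no IntelliJ or ANTLR classes involved. */
final class HiddenTokenSketch {

    /** A lexer token reduced to its type name and text (hypothetical type). */
    static final class Tok {
        final String type;
        final String text;

        Tok(String type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    /** Token sequence for the input "1\n2\n&\n3", as in the parse result above. */
    static List<Tok> sampleTokens() {
        return Arrays.asList(
                new Tok("NUMBER", "1"), new Tok("WS", "\n"),
                new Tok("NUMBER", "2"), new Tok("WS", "\n"),
                new Tok("LINECONTINUATION", "&\n"),
                new Tok("NUMBER", "3"));
    }

    /**
     * Mimics the stripping step: only token types listed in skippedTypes
     * are dropped. The ANTLR channel plays no part in the decision.
     */
    static List<String> typesSeenByParser(List<Tok> tokens, Set<String> skippedTypes) {
        List<String> seen = new ArrayList<>();
        for (Tok t : tokens) {
            if (!skippedTypes.contains(t.type)) {
                seen.add(t.type);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<String> onlyWs = new HashSet<>(Arrays.asList("WS"));
        Set<String> wsAndContinuation =
                new HashSet<>(Arrays.asList("WS", "LINECONTINUATION"));

        // Not declared as whitespace: the continuation token reaches the parser.
        // prints [NUMBER, NUMBER, LINECONTINUATION, NUMBER]
        System.out.println(typesSeenByParser(sampleTokens(), onlyWs));

        // Declared as whitespace: the parser sees only NUMBER tokens.
        // prints [NUMBER, NUMBER, NUMBER]
        System.out.println(typesSeenByParser(sampleTokens(), wsAndContinuation));
    }
}
```

In a real plugin the equivalent change is the one described earlier in the thread: adding the continuation token's element type to the set returned by `getWhitespaceTokens()`.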

As for your other questions:

  1. You only have to declare 'special tokens' like comments, whitespace and string literals in your ParserDefinition because IntelliJ will do special things with them. The Javadoc in ParserDefinition explains pretty well how it works. Syntax highlighting is indeed configured in your SyntaxHighlighter, not in the ParserDefinition.
  2. Yes, in that case you need an Annotator. I believe Java constants (static final fields) are highlighted that way. See the SDK docs for more info. If any keyword can also be used as a regular identifier, then you'll probably want to completely replace your SyntaxHighlighter with an Annotator.
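
The idea behind point 2 can be sketched with another toy model in plain Java. It does not use the IntelliJ Annotator API: the `Node` type, the `KW_` naming convention, the `stat` rule and the style names are hypothetical. In a plugin, the same check would presumably live in an Annotator's `annotate(PsiElement, AnnotationHolder)` method and inspect real PSI elements. The point shown here is only that the decision is made from the enclosing parser rule, so the same keyword token can be styled differently depending on where it appears.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model only: a hand-built tree stands in for the PSI tree. */
final class RuleHighlightSketch {

    /** Rule node (text == null) or token leaf (text != null); hypothetical type. */
    static final class Node {
        final String name;          // parser rule name, or lexer token type for a leaf
        final String text;          // token text; null for rule nodes
        final List<Node> children;

        Node(String name, String text, List<Node> children) {
            this.name = name;
            this.text = text;
            this.children = children;
        }
    }

    static Node rule(String name, Node... children) {
        return new Node(name, null, Arrays.asList(children));
    }

    static Node token(String type, String text) {
        return new Node(type, text, new ArrayList<Node>());
    }

    /** Returns one "text:STYLE" entry per token, in document order. */
    static List<String> highlight(Node root) {
        List<String> out = new ArrayList<>();
        collect(root, false, out);
        return out;
    }

    private static void collect(Node node, boolean insideId, List<String> out) {
        if (node.text != null) {
            String style;
            if (insideId || node.name.equals("ID")) {
                style = "IDENTIFIER";   // anything matched by the `id` rule
            } else if (node.name.startsWith("KW_")) {
                style = "KEYWORD";      // keyword token outside an `id` rule
            } else {
                style = "DEFAULT";
            }
            out.add(node.text + ":" + style);
            return;
        }
        boolean nowInsideId = insideId || node.name.equals("id");
        for (Node child : node.children) {
            collect(child, nowInsideId, out);
        }
    }

    public static void main(String[] args) {
        // "say say": the first token is a keyword, the second is matched by `id`.
        Node tree = rule("stat",
                token("KW_SAY", "say"),
                rule("id", token("KW_SAY", "say")));
        System.out.println(highlight(tree)); // prints [say:KEYWORD, say:IDENTIFIER]
    }
}
```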

BTW, JetBrains provides a very good tutorial to help you understand the basics of custom language plugins. There's also a Gitter chat where people from JetBrains and other plugin developers can answer your questions if they are related to general plugin development.

adesutherland commented on June 3, 2024

Many thanks for your comprehensive answer - it is really appreciated. As you know, sometimes hacking something together quickly is needed (!), but I will do my homework now.

:-)

bjansen commented on June 3, 2024

You're welcome!
