I'm working with a <a href="https://github.com/nene/sql-parser-cst/blob/master/src/par

I wonder if <a class="issue-link js-issue-link" data-error-text="Failed to load title"

reportInfiniteRecursion pass is excessively slow about peggy HOT 8 OPEN

nene commented on July 19, 2024

reportInfiniteRecursion pass is excessively slow

from peggy.

Comments (8)

nene commented on July 19, 2024

Did some more investigating. In my grammar I have rules like these:

select_columns
  = head:column_list_item tail:(__ "," __ column_list_item)* trailing:(__ "," __ empty)? {
      ...
    }

column_list_item
  = expr:star_or_qualified_star kw:(__ EXCEPT __) columns:paren$list$column {
    ...
  }
  / expr:star_or_qualified_star kw:(__ REPLACE __) columns:paren$list$alias$expr {
    ...
  }
  / star_or_qualified_star
  / expr:expr alias:(__ alias)? {
    ...
  }

expr = mysql_assign_expr / or_expr

mysql_assign_expr = ... // ends up referencing same stuff that or_expr references

or_expr = ...

It takes from reportInfiniteRecursion:

4.1 seconds to check the select_columns rule
3.9 seconds to check the column_list_item rule.
3.7 seconds to check the expr rule.

Essentially it comes down to this main expr rule being really big and complex and taking a whole lot of time to fully traverse.

from peggy.

Mingun commented on July 19, 2024

Yes, it definitely can be optimized. I think that the problem is because it tried to check each rule independently and as result can traverse the same nodes many times.

The best strategy would be to introduce new pass that would calculate some useful properties of each AST node, which includes:

lookahead: how many characters this node can lookahead? For example, literal always lookahead by literal.length, and character class by 1 character
minMatched: the minimal characters that would consumed if that node matched. For example, for . (dot) it is 1 character and for .? it is 0 characters
maxMatched: the maximum characters that would consumed if that node matched. For example, for . (dot) and .? it is 1 character and for .+ it is unfinity characters
haveUserCode: is this node have user actions or predicates (which could have side-effects)?

Then to find left recursion we need to find rules where minMatched is 0.

from peggy.

hildjj commented on July 19, 2024

What we're trying to do is prove that there is NOT a loop, which means we only need to find the first loop of any size. In this case, speed matters more than completeness.

I bet if we took one O(n) pass over the rules to generate an adjacency matrix indexed by rule number, then did a depth-first search on each non-seen rule, we'd be able to prove there were no loops more quickly.

from peggy.

hildjj commented on July 19, 2024

In the same pass, we could prove that certain rules were not reachable by any of the allowedStartRules, marking them for removal.

from peggy.

hildjj commented on July 19, 2024

I wonder if #467 will help with this?

from peggy.

nene commented on July 19, 2024

I don't think this will help at all. Though I might misunderstand - I didn't test.

The main problem concerns the case where no errors are found. Which in my case is like 99.9999% of the time. Frankly, I wouldn't actually mind if it were slow when errors are present.

I assume that moving this to prepare pass will also not effect the performance in any way.

from peggy.

hildjj commented on July 19, 2024

Nod. There are a couple of speedups for big grammars that I can add after 4.0, but I expect most of those to have minimal impact to this. Someone will have to help with the approach Mingun suggested above, I'm unlikely to get to that myself.

from peggy.

nene commented on July 19, 2024

I've been thinking of tackling this, but each time I look at it, the problem sort of overwhelms me as I can't seem to even fully wrap my head around the current implementation. Mainly me not being familiar with the code base :(

from peggy.

reportInfiniteRecursion pass is excessively slow about peggy HOT 8 OPEN

Comments (8)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent