Comments (10)

makenowjust commented on June 7, 2024

It is essential. ReDoS detection is a difficult problem and a heavy task. In the worst case, the canonical algorithm takes $O(n^6)$ time, where $n$ is the number of states in the automaton. However, security is an important issue, so we try to use such algorithms whenever possible.
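
To give a rough sense of what that bound means (the state counts here are illustrative assumptions, not measurements from recheck), doubling the number of states multiplies the worst-case work by $2^6 = 64$:

$$
n = 100 \;\Rightarrow\; n^6 = 10^{12}, \qquad n = 200 \;\Rightarrow\; n^6 = 2^6 \cdot 10^{12} = 6.4 \times 10^{13}.
$$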

There are two solutions:

  1. Use the fuzz checker explicitly. It is faster than the default automaton checker. However, because it is probabilistic, it can report false negatives.
  2. Use the cache option. I recently added a cache option to the ESLint plugin. It saves the analysis result for a regex and reuses it as long as the regex has not changed.

My recommendation is the cache option. It was added specifically to improve DX in editors. I hope you give it a try.
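
The per-regex caching idea can be sketched as follows. This is a minimal illustration, not the plugin's actual implementation: `Analysis`, `cacheKey`, `withCache`, and `slowAnalyze` are hypothetical names, the cache here lives only in memory, and the stand-in analyzer uses a toy heuristic rather than a real ReDoS check.

```typescript
// Hypothetical result shape; a real checker returns richer diagnostics.
type Analysis = { status: "safe" | "vulnerable" | "unknown" };

// Build a cache key from the parts that define a regex.
function cacheKey(source: string, flags: string): string {
  return JSON.stringify([source, flags]);
}

// Wrap an expensive analyzer so each distinct regex is analyzed only once.
function withCache(
  analyze: (source: string, flags: string) => Analysis,
): (source: string, flags: string) => Analysis {
  const cache = new Map<string, Analysis>();
  return (source, flags) => {
    const key = cacheKey(source, flags);
    const hit = cache.get(key);
    if (hit !== undefined) return hit; // unchanged regex: reuse the result
    const result = analyze(source, flags);
    cache.set(key, result);
    return result;
  };
}

// Stand-in analyzer (toy heuristic) that counts how often it really runs.
let calls = 0;
const slowAnalyze = (source: string, _flags: string): Analysis => {
  calls += 1;
  return { status: source.includes("+)+") ? "vulnerable" : "safe" };
};

const check = withCache(slowAnalyze);
```

Calling `check` repeatedly with the same source and flags runs the analyzer once. Editing the regex changes the key, so the new pattern is analyzed again.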

from recheck.

makenowjust commented on June 7, 2024

cache is a young feature and still a little unstable, so I turned it off in the default settings. Please see the doc for other questions.

settings is for configuration values shared by several rules, but eslint-plugin-redos exposes only one rule. Thus, settings does not fit in this case. Also, since cache in this plugin is per regex, it is slightly different from caching in other plugins. In view of this, there is also no need to put cache in settings.

At first, I thought it would be ideal and most accurate if the automaton checker checked all regular expressions. However, as I continued to experiment, I realized that many regexes could not be checked in a realistic amount of time. The algorithms can be simplified for speed, but this greatly compromises accuracy. In this context, the fuzz checker is the most accurate and fastest detection method. In fact, the fuzz checker can also analyze regexes containing theoretically intractable extensions such as back-references.
According to the experiments in this repository, the fuzz checker reports the same results as the automaton checker for 98.8% of the ~10000 regexes analyzed by the automaton checker, and no timeouts were reported.

makenowjust commented on June 7, 2024

The goal of fuzz is not to provide a fast checker but one that returns results in a realistic amount of time. Therefore, the time (and the accuracy) can be adjusted by parameters. Currently, they are tuned to the default timeout (10 seconds). In most cases, however, a check completes within 0.1 seconds.

Indeed, the aggressive option seems unnecessary in many cases. I will make conservative the default. Thank you for your opinion.

makenowjust commented on June 7, 2024

I have no plan to publish. You can already use the beta version. I'd like to concentrate on other research (but related to ReDoS) for a month. Please don't reply to this thread.

thernstig commented on June 7, 2024

Ah, I thought "plugin:redos/recommended" set the cache option, but I guess it does not. Should it? Great that you added this feature though, it makes a lot of sense! How does it store the cache? I.e. is it only in memory, or could someone use it in e.g. CI to save to a file at a specific location?

Also, is there any reason you did not use the more widely used settings, as seen here https://eslint.org/docs/latest/use/configure/configuration-files?

(Using fuzz does not seem great, as we prefer to avoid false negatives.)

thernstig commented on June 7, 2024

@makenowjust is the fuzz checker faster? Meaning that caching is only really important in the automaton case?

As the docs describe, caching results with aggressive can produce false positives, so it is not something I would wish to use. What was the reason to include that option at all? I believe correctness should always trump speed here.

So I am curious: if fuzz is "fast", then there is no need to cache those results, and aggressive is not necessary for that case either.

No need to get super-detailed in the answer if possible :D

thernstig commented on June 7, 2024

I think we can close this issue then, unless you want to close it in reference to your change of the default.

thernstig commented on June 7, 2024

@makenowjust it seems you have not released 4.5.0 yet, so there is no way for us to use the new cache options. When do you plan to release 4.5.X?

thernstig commented on June 7, 2024

@makenowjust? 😄

thernstig commented on June 7, 2024

I believe I have been fairly helpful: finding bugs, writing docs issues (I plan to push PRs soon), and creating performance tests to check various aspects of making this the de facto lib in ESLint.

I am interested in verifying the cache functionality, as well as testing #799 to see if it can improve the situation for ESLint, and thus continue to push it as the default experience.

I found it appropriate to reply here as this issue ended up changing the default caching strategy, which is part of what I needed to test in the release version.

This is my last reply to this thread. (This is the first time I have been asked to stop replying while trying my best to be helpful.)
