
Comments (3)

bleichenbacher-daniel commented on July 23, 2024

I don't think any changes are necessary here.

The test generates signatures, selects a subset of those signatures based on timing information, and then checks whether the k's used to generate the selected signatures are biased. If it is possible to select DSA signatures with small k's by choosing the signatures that were generated faster than the others, then the implementation has a weakness.
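For concreteness, here is a minimal sketch in Java of that procedure. It is an illustration written for this discussion, not the actual Wycheproof test code; the curve, digest, iteration count, DER helper, and the "keep the fastest half" rule are all assumptions made for the example.

```java
import java.math.BigInteger;
import java.security.*;
import java.security.interfaces.ECPrivateKey;
import java.util.*;

// Minimal sketch (not the actual Wycheproof code): sign the same message many
// times, keep the fastest signatures, and recover each nonce k from (r, s)
// using the test's own private key.
public class NonceTimingSketch {

  // ECDSA/DSA signing uses s = k^{-1} (z + r*x) mod q, so with the private
  // key x known, the nonce can be recovered as k = s^{-1} (z + r*x) mod q.
  static BigInteger recoverK(BigInteger r, BigInteger s, BigInteger z,
                             BigInteger x, BigInteger q) {
    return s.modInverse(q).multiply(z.add(r.multiply(x))).mod(q);
  }

  // Parse a DER SEQUENCE { INTEGER r, INTEGER s } (short-form lengths only,
  // which is enough for P-256 signatures).
  static BigInteger[] parseDer(byte[] der) {
    int i = 2;                                   // skip SEQUENCE tag + length
    int rLen = der[i + 1];
    BigInteger r = new BigInteger(Arrays.copyOfRange(der, i + 2, i + 2 + rLen));
    i += 2 + rLen;
    int sLen = der[i + 1];
    BigInteger s = new BigInteger(Arrays.copyOfRange(der, i + 2, i + 2 + sLen));
    return new BigInteger[] {r, s};
  }

  public static void main(String[] args) throws Exception {
    KeyPairGenerator kpg = KeyPairGenerator.getInstance("EC");
    kpg.initialize(256);                         // P-256
    ECPrivateKey priv = (ECPrivateKey) kpg.generateKeyPair().getPrivate();
    BigInteger x = priv.getS();
    BigInteger q = priv.getParams().getOrder();

    byte[] msg = "test message".getBytes();
    // z is the message hash interpreted as an integer (no truncation needed
    // for SHA-256 with a 256-bit group order).
    BigInteger z = new BigInteger(1, MessageDigest.getInstance("SHA-256").digest(msg));

    int n = 50_000;                              // as in the discussion below
    long[] nanos = new long[n];
    BigInteger[] ks = new BigInteger[n];
    Signature signer = Signature.getInstance("SHA256withECDSA");
    for (int i = 0; i < n; i++) {
      signer.initSign(priv);
      long t0 = System.nanoTime();
      signer.update(msg);
      byte[] der = signer.sign();
      nanos[i] = System.nanoTime() - t0;
      BigInteger[] rs = parseDer(der);
      ks[i] = recoverK(rs[0], rs[1], z, x, q);
    }

    // Keep the nonces of the fastest half. If the implementation is constant
    // time, these are still uniformly distributed no matter how noisy the
    // machine was during the run.
    Integer[] idx = new Integer[n];
    for (int i = 0; i < n; i++) idx[i] = i;
    Arrays.sort(idx, Comparator.comparingLong(i -> nanos[i]));
    List<BigInteger> selected = new ArrayList<>();
    for (int i = 0; i < n / 2; i++) selected.add(ks[idx[i]]);
    // ... feed `selected` into a bias statistic (see the z-score sketch below).
  }
}
```

Recovering k directly with the test's own private key is what makes the bias check possible; the only thing the timing is used for is choosing which signatures end up in the selected subset.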

If an implementation uses uniformly distributed k's for DSA and ECDSA and does not leak timing information about the nonce, then the test selects subsets of signatures with uniformly distributed k's, and hence the result should follow a normal distribution. Any disturbing factor such as garbage collection, warmup, load on the test server, overheating, etc. does not change this distribution if the implementation is correct. This is an important property of the test, since its goal is to be run regularly as a unit test. External influences must not be able to lead to false positives. Noise just makes it more difficult to detect a bias.

If the test result deviates significantly from a normal distribution, then this means either just bad luck or an actual bias. I suspect that the larger variance of the test results reported above was just caused by a small sample size.
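As a concrete illustration of "close to a normal distribution", here is a simplified statistic. It is an assumption made for this example, not necessarily the exact statistic Wycheproof computes: the mean of m uniform nonces on [0, q) is approximately normal with mean q/2 and standard deviation q / sqrt(12·m), so the z-score below should stay within a few sigma for a correct implementation, however noisy the machine was.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.List;

// Simplified bias statistic for illustration. Under a correct implementation
// the selected k's are (close to) uniform on [0, q), so their mean is about
// q/2 with standard deviation q / sqrt(12 * m), and the z-score is roughly
// standard normal regardless of how noisy the timings were.
public class BiasZScore {

  static double zScore(List<BigInteger> ks, BigInteger q) {
    int m = ks.size();
    BigInteger sum = BigInteger.ZERO;
    for (BigInteger k : ks) sum = sum.add(k);
    double mean = new BigDecimal(sum)
        .divide(BigDecimal.valueOf(m), MathContext.DECIMAL64).doubleValue();
    double qd = new BigDecimal(q).doubleValue();
    double sigmaOfMean = qd / Math.sqrt(12.0 * m);
    return (mean - qd / 2.0) / sigmaOfMean;
  }

  public static void main(String[] args) {
    // Demo with genuinely uniform nonces: the z-score stays within a few sigma.
    SecureRandom rnd = new SecureRandom();
    BigInteger q = BigInteger.ONE.shiftLeft(256);  // stand-in for a 256-bit group order
    List<BigInteger> ks = new ArrayList<>();
    for (int i = 0; i < 25_000; i++) ks.add(new BigInteger(256, rnd));
    System.out.printf("z = %.2f%n", zScore(ks, q));
  }
}
```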

There are a number of things that could potentially be done to improve the accuracy of the test. Obviously, generating more signatures gives better results. Better timing information would help, but unfortunately it is often difficult to influence the test environment. More detailed timing (e.g., time spent in particular functions) would also make it possible to improve the test.


ascarpino commented on July 23, 2024

> If an implementation uses uniformly distributed k's for DSA and ECDSA and does not leak timing information about the nonce, then the test selects subsets of signatures with uniformly distributed k's, and hence the result should follow a normal distribution. Any disturbing factor such as garbage collection, warmup, load on the test server, overheating, etc. does not change this distribution if the implementation is correct. This is an important property of the test, since its goal is to be run regularly as a unit test. External influences must not be able to lead to false positives. Noise just makes it more difficult to detect a bias.

I would absolutely disagree with the premise that external factors, like warmup, GC, server load, etc., do not change the distribution. With the randomness of K, noise could be introduced at unfortunate times. That does not show a weakness in the implementation; it shows a weakness in the test. The test does try to mitigate some of this with a large allowance for sigma, but as the above results show, it could be hard for that allowance to overcome a 10x performance difference if certain lengths of K occur at the wrong time. For example, there may be 100 small K values in a test run, or 1000, and many of those small K's may fall in the first half of the test run or in the latter half.

The lack of a warmup also fails to take intrinsics into consideration. Once the C2 compiler decides the method is hot, the intrinsic will change the performance and disrupt the distribution of the results. That is not a weakness in the implementation; it's a failure to test during normal system operation.
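The kind of warmup being argued for could look like the following sketch; the iteration count is a guess for illustration, not a measured value, and the key and algorithm choices are assumptions.

```java
import java.security.*;

// Sketch of a warmup phase: run an unmeasured burst of signatures first so
// that HotSpot's C2 compiler and any intrinsics have already kicked in before
// the measured loop starts.
public class WarmupSketch {
  public static void main(String[] args) throws Exception {
    KeyPairGenerator kpg = KeyPairGenerator.getInstance("EC");
    kpg.initialize(256);
    KeyPair kp = kpg.generateKeyPair();
    byte[] msg = "warmup".getBytes();
    Signature signer = Signature.getInstance("SHA256withECDSA");

    // Unmeasured warmup phase (count chosen arbitrarily for the example).
    for (int i = 0; i < 10_000; i++) {
      signer.initSign(kp.getPrivate());
      signer.update(msg);
      signer.sign();
    }
    // ... only now start the measured 50,000-signature loop.
  }
}
```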

> If the test result deviates significantly from a normal distribution, then this means either just bad luck or an actual bias. I suspect that the larger variance of the test results reported above was just caused by a small sample size.

The results were generated with the 50,000 iterations that the wycheproof test uses. It's true that more iterations will reduce the influence of noise, but a warmup would reduce the biggest source of noise without resulting in a significantly longer test run.

> There are a number of things that could potentially be done to improve the accuracy of the test. Obviously, generating more signatures gives better results. Better timing information would help, but unfortunately it is often difficult to influence the test environment. More detailed timing (e.g., time spent in particular functions) would also make it possible to improve the test.

Whether you accept what I suggested or not is your decision.


bleichenbacher-daniel commented on July 23, 2024

The point I wanted to make is that there cannot be a test failure because of noise. If the implementation is correct, then the expected result will be close to a normal distribution with variance 1.

Too much noise can of course hide timing leaks. By selecting the signatures with the shortest timing, the test eliminates the biggest influences of noise without needing to examine the environment. Slow signatures generated during startup or during garbage collection are most likely not used. As long as their number is small, they don't have a significant influence. Also, if the test becomes slower in the middle because of other heavy load on the machine, then the result will be computed just from the 25,000 signatures generated during the quiet time. This can miss a bias, but it can't lead to false positives.

The current setup of the test is intended for continuous testing. If a randomized test is repeated many times, then it is important that the probability of false positives is small; hence the large threshold. For other use cases it might be reasonable to use a smaller threshold.
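A back-of-the-envelope check of why a large threshold keeps a continuously running randomized test from failing spuriously, using the standard normal tail bound P(Z > t) ≤ exp(−t²/2). The 7-sigma threshold and the run count below are illustrative assumptions, not Wycheproof's actual settings.

```java
// Rough bound on spurious failures of a repeatedly run randomized test.
public class FalsePositiveBound {
  public static void main(String[] args) {
    double threshold = 7.0;              // threshold in sigmas (assumption)
    long runsPerYear = 365L * 1_000;     // e.g. 1000 CI runs per day (assumption)
    // One-sided tail bound for a standard normal: P(Z > t) <= exp(-t^2 / 2).
    double perRun = Math.exp(-threshold * threshold / 2.0);
    System.out.printf("per-run false positive bound:         %.2e%n", perRun);
    System.out.printf("expected false positives per year: <= %.2e%n", perRun * runsPerYear);
  }
}
```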

