open-quantum-safe / profiling
Network-level performance testing of post-quantum cryptography using the OQS suite
Home Page: https://openquantumsafe.org/benchmarking
License: MIT License
Automation scripts for image deployment for:
The oqs-ref shared library is copied to liboqs.so.LIBOQS_VERSION instead of liboqs.so.$LIBOQS_VERSION:
profiling/perf/scripts/run-tests.sh
Line 33 in 9ff4911
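A minimal sketch of the presumed fix, using stand-in file names and an assumed version value (not the script's actual surroundings): the shell variable must be expanded, not copied as a literal string.

```shell
# Hypothetical reproduction of the fix: expand $LIBOQS_VERSION instead of
# embedding the literal string "LIBOQS_VERSION" in the target file name.
workdir="$(mktemp -d)"
cd "$workdir"
LIBOQS_VERSION="0.4.0"    # stand-in value; the real script derives this
touch liboqs.so           # stand-in for the built oqs-ref shared library
# wrong: cp liboqs.so liboqs.so.LIBOQS_VERSION
cp liboqs.so "liboqs.so.${LIBOQS_VERSION}"
ls "liboqs.so.${LIBOQS_VERSION}"
```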
Currently, the separate architectures are built and pushed to Docker Hub differently: x86_64 is done automatically via CI, while arm64 is built and pushed manually from an ARM64 VM. A true multiarch Docker image would be ideal.
First tests to do both from within CCI failed: the docker buildx branch takes way too long (and then times out). Another approach would be to cross-build everything and only then insert the resulting binaries into the respective architecture base Docker images. This apparently only works for liboqs in Debian (not Alpine, as currently used for profiling) and requires additional investigation as to how to cross-build S3 access (for storing test run results) and OpenSSL.
Suggestions/alternative ideas solicited how to achieve this goal (@jschanck ?)
Changing the date doesn't change the (profiling run) metadata displayed.
Moving a discussion from a PR #20 to an issue:
@baentsch : Question to @dstebila We're collecting all data since Oct 4 into the website display (and the file site.tgz): Shall we keep doing this or become more selective? "Oct 4" is actually a parameter in gen_website.sh...
@dstebila : Unclear to me. Let's reflect on the purpose of presenting it as a time series. I understand it serves two purposes: seeing how performance changed after important commits were merged, and providing a visual average over time (effectively adding more data points).
Agreed. So what about adding a mechanism to selectively set the dates that become part of the visualization export (encoded in gen_website.sh?), which could be manually amended/checked in to GitHub as and when substantial changes to the code base occur, or when the current number of test runs shown becomes too large to derive meaningful insight from, e.g., a file containing a list of dates (instead of the single start date currently passed as a parameter to the HTML-generating Python script)? This would be a whitelist approach at the generating end (dropping data from the export).
An alternative could be a simple "pruning" script deleting unwanted/redundant data/date points from the current daily exports, i.e., a blacklist approach that could be done at the receiving end (still retaining all data points without visualizing all of them). My preference would be for this second option.
@dstebila: Any preference? Alternative suggestions?
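The whitelist option above can be sketched in a few lines. This is illustrative only: the function name and data shapes are assumptions, not part of gen_website.sh.

```python
# Hypothetical sketch of the whitelist approach: instead of a single start
# date, filter the collected run dates against an explicit list of export
# dates kept under version control. Names are illustrative only.
from datetime import date

def select_runs(run_dates, export_dates):
    """Keep only profiling runs whose date is on the export whitelist."""
    allowed = set(export_dates)
    return [d for d in run_dates if d in allowed]

runs = [date(2021, 10, 4), date(2021, 10, 5), date(2021, 11, 1)]
whitelist = [date(2021, 10, 4), date(2021, 11, 1)]
print(select_runs(runs, whitelist))  # only the two whitelisted dates remain
```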
As per suggestion by Alex.
First thoughts: Probably different docker images; visualization: Same graph (some work) or separate ones (easy)
docker run -v ~/resHandshake/:/opt/test/results oqs-perf python3 /opt/test/handshakes.py
Expected: that the command runs without errors.
Generating a oqs_sig_default private key
writing new private key to 'CA.key'
Generating a oqs_sig_default private key
writing new private key to '/opt/test/server.key'
Signature ok
subject=CN = localhost
Getting CA Private Key
Using default temp DH parameters
ACCEPT
Using default temp DH parameters
ACCEPT
Using default temp DH parameters
ACCEPT
Using default temp DH parameters
ACCEPT
Using default temp DH parameters
ACCEPT
Using default temp DH parameters
ACCEPT
Using default temp DH parameters
ACCEPT
Using default temp DH parameters
ACCEPT
Using default temp DH parameters
ACCEPT
Using default temp DH parameters
ACCEPT
Using default temp DH parameters
ACCEPT
Using default temp DH parameters
ACCEPT
Using default temp DH parameters
ACCEPT
Error with command: "-curves "
Generating a p256_oqs_sig_default private key
writing new private key to 'CA.key'
Why is it throwing this error? What is "-curves" needed for?
With the switch to OpenSSL3+oqsprovider, heap memory use during handshaking jumped by 200-300%. Investigate and fix. Most likely culprit: open-quantum-safe/oqs-provider#155
While running the performance tests, I noticed that the Classic McEliece algorithms are currently not included in the OpenSSL benchmarks. Since Classic McEliece is considered a promising candidate for post-quantum cryptographic schemes, it would be valuable to include these algorithms in the OpenSSL benchmarking tools.
As per this discussion.
M1 testing is running as per regular email results, but M1 numbers are empty, see e.g., https://openquantumsafe.org/benchmarking/visualization/2023-03-17/speed_kem.json.
Since #92 landed, classic algorithm run results are not displayed:
even though the values seem to be available:
Due to the unavailability (at the time of creating the tests) of M1 VMs in AWS, and unlike the timed AWS-based runs for x64 and aarch64, the profiling run on m1 is currently triggered by a cron job on a dedicated laptop like this:
docker run --privileged openquantumsafe/oqs-perf bash -c "cd /opt/test && ./run-tests-m1.sh" > /Users/baentsch/profiling/docker-run-m1.log 2>&1
In particular, this means the dedicated Dockerfile-m1 dockerfile is not used for M1 profiling. This may or may not be sensible, but it at least confuses me (why do we still have this file?). If we decide to keep this file, the corresponding image should be built in CI and used (and not the more generic aarch64 Linux image). Also, would running the code in a comparable (VM-based) manner be possible? In which infrastructure? How do we ascertain that the functionality being tested on all platforms remains in sync with the main Dockerfile (used for x64)?
cpuinfo is empty for aarch64 runs. Fix this to allow investigating execution deviations between different runs.
Hi,
while looking at the TLS handshake performance data at https://openquantumsafe.org/benchmarking/visualization/handshakes.html (specifically, from 2023-06-24 but other days have the same potential discrepancy) I came across a strange result. The reference code on x86 (and sometimes aarch64) is (almost) always faster (more handshakes per second) than the performance version for classical ECDHE (x25519/x448) with signature algorithm: Ed25519 or Dilithium2.
With Ed25519:
vs
or (aarch64)
vs
With Dilithium2:
vs
When looking at post-quantum key exchanges it makes sense (performance code is faster than reference) but with classical key exchange it seems to be the opposite on x86.
However, when I select RSA2048 or ECDSAprime256v1 those results for ECDHE make sense (performance code is faster than reference) but then Kyber key exchange has the opposite effect (reference code Kyber + (RSA or ECDSA) is faster than performance code).
Is this a measurement error due to high variance? Is the "performance code" for classical key exchanges even different from the "reference code" since I would guess you use the OpenSSL implementation? Is there any other explanation for these strange results?
By the way, thank you for making and maintaining this project!
It would be useful to have performance numbers for non-PQ algorithms as a baseline comparison. This is present in some of the profiling operations but not all (e.g., handshake performance). This would include both non-PQ key exchange and non-PQ signatures.
Data as requested by @christianpaquin in #93 is being collected (see e.g. https://openquantumsafe.org/benchmarking/visualization/2023-07-08/handshakes.json) but in an inconsistent manner: many algorithm combinations deliver results on x64 but not on aarch64 and m1, and in many cases it's the opposite. Investigation warranted -- as on the many other issues in this project. Not fun being alone looking at these...
Renaming the rainbow family leads to the new algorithms' data points not being reliably visualized. The data points do exist, though, as per the raw data file(s).
To reproduce, run openssl speed rsa3072_sphincsshake256128frobust
As of April 3, the new "portability" build flags became active in the profiling Docker image, and as of April 8 an update to this image did as well (incorporating open-quantum-safe/liboqs#957). Both events are visible in the snapshots below, but performance in general never quite went back to previous levels (all x86_64). Would it be reasonable to investigate/improve this? Are build flags set incorrectly when creating the profiling Docker image? Is something else going wrong in building things? Is there a more general issue? Were things incorrect before?
SIKE&SIDH (performance) lost drastically with the last update:
Kyber (plain, non AES-version) distributable lost somewhat with the first change, but never recovered:
Frodo distributable, also lost:
HQC (performance and distributable) in turn benefited from both changes:
Dilithium AES benefited while plain Dilithium dropped (performance and distributable).
Tagging @jschanck for thoughts
It'd be good to add hybrid KEX as well to the TLS benchmarking tests. If resources are constrained, then only enabling the Kyber variants (and perhaps the fallback NTRU) would suffice. These are likely to be the first algs deployed in practice, so this data would be insightful. Perf tests with these algs will also be run in the NCCoE projects, so having a basis for comparison would help.
It would be useful to add the P-521 curve as a standalone choice for KEX and auth in TLS (currently, only P-256 and P-384 options are available). This is for performance testing in the NCCoE project, to compare with the L5 CNSA 2.0 suite (Kyber1024/Dilithium5).
As per question 4 in open-quantum-safe/liboqs#1304
Run profiling on Apple M1 machine.
Issues known:
checking for the kernel version... unsupported (21.3.0)
configure: error: Valgrind works on Darwin 10.x-20.x (Mac OS X 10.6-10.11 and macOS 10.12-11.0)
--> Memory tests cannot be run
--> Arguably pointless to run more than baseline (OQS_DIST_BUILD=OFF) tests
This is to collect work-in-progress information on speed-JSON file visualization.
Current version accessible at https://test.openquantumsafe.org/performance.html
Feedback/improvement suggestions by @dstebila :
Unlike on x64 VMs, CPU frequency is not output by cat /proc/cpuinfo on AWS aarch64 VMs. This issue is to find a way to obtain this information and add it to the profiling run results.
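One possible direction (an assumption, not the project's current approach): on many aarch64 systems the frequency is reported by lscpu even when /proc/cpuinfo omits it. A small parsing sketch, with illustrative sample output:

```python
# Sketch: recover the CPU frequency from `lscpu` output on aarch64 VMs
# where /proc/cpuinfo lacks a "cpu MHz" line. Field name follows util-linux;
# the sample text below is illustrative, not captured from a real run.
import re

def max_mhz_from_lscpu(text):
    """Return the 'CPU max MHz' value from lscpu output, or None if absent."""
    match = re.search(r"^CPU max MHz:\s*([\d.]+)", text, re.MULTILINE)
    return float(match.group(1)) if match else None

sample = "Architecture: aarch64\nCPU max MHz: 2500.0000\n"
print(max_mhz_from_lscpu(sample))  # 2500.0
```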
Currently profiling is done with OQS_PORTABLE_BUILD set and we show all measurements of reference and optimized code with this setting.
Question is whether we want to
-mnative.
Opinions, thoughts, further alternatives welcome as a reply to this issue. Proposals as to how to visualize alternatives are also solicited.
PS: I took a look at the source code here and here and found that the labels are not consistent with those I saw on the benchmark page.
At the end of each performance test collection run, create an externally accessible/downloadable .tgz file with all HTML+JavaScript+JSON files. Easiest would be an S3-based Web-folder readable to the world: Would that be OK for you @dstebila ? You could then wget this to a location of your choice. Alternatively, we could push (scp?) to a server where you want it.
Possibly combine with direct load off S3
There are some surprising performance differences when running the performance container (openquantumsafe/oqs-perf) in AWS EC2 manually and under cron control.
AWS manual run (docker run -it openquantumsafe/oqs-perf /opt/oqssa/bin/speed_sig):
[ec2-user@ip-10-0-0-13 ~]$ docker run -it openquantumsafe/oqs-perf /opt/oqssa/bin/speed_sig
Configuration info
==================
Target platform: x86_64-Linux-5.4.0-47-generic
Compiler: gcc (9.3.0)
Compile options: [-Werror;-Wall;-Wextra;-Wpedantic;-Wstrict-prototypes;-Wshadow;-Wformat=2;-Wfloat-equal;-Wwrite-strings;-O3;-fomit-frame-pointer;-fdata-sections;-ffunction-sections;-Wl,--gc-sections;-Wbad-function-cast]
OQS version: 0.4.0
Git commit: 1d08c9d6ab696c9d50e36231447d56ddc05735d6
OpenSSL enabled: Yes (OpenSSL 1.1.1g 21 Apr 2020)
AES: OpenSSL
SHA-2: OpenSSL
SHA-3: OpenSSL
CPU exts active: AES-AVX-AVX2-BMI-BMI2-POPCNT-SSE-SSE2-SSE3
Speed test
==========
Started at 2020-09-26 07:05:36
Operation | Iterations | Total time (s) | Time (us): mean | pop. stdev | CPU cycles: mean | pop. stdev
------------------------------ | ----------:| --------------:| ---------------:| ----------:| -------------------------:| ----------:
DILITHIUM_2 | | | | | |
keypair | 29931 | 3.000 | 100.233 | 26.477 | 288371 | 76397
sign | 5779 | 3.000 | 519.196 | 384.120 | 1503404 | 1113889
verify | 30073 | 3.000 | 99.759 | 2.062 | 287081 | 5693
DILITHIUM_2 | | | | | |
keypair | 31144 | 3.000 | 96.329 | 1.896 | 277202 | 5273
sign | 5769 | 3.000 | 520.023 | 367.729 | 1505764 | 1066415
verify | 30064 | 3.000 | 99.789 | 1.827 | 287174 | 4989
DILITHIUM_3 | | | | | |
keypair | 21025 | 3.000 | 142.690 | 2.821 | 411638 | 7948
sign | 3912 | 3.000 | 766.960 | 575.565 | 2221895 | 1669008
verify | 19896 | 3.000 | 150.786 | 34.302 | 435014 | 99061
DILITHIUM_4 | | | | | |
keypair | 15889 | 3.000 | 188.813 | 4.764 | 545337 | 13586
sign | 4235 | 3.002 | 708.912 | 438.295 | 2053503 | 1270955
verify | 15501 | 3.000 | 193.542 | 3.386 | 559032 | 9592
Falcon-512 | | | | | |
keypair | 151 | 3.018 | 19985.013 | 7141.066 | 57953387 | 20709488
sign | 532 | 3.004 | 5647.523 | 33.441 | 16375146 | 96541
verify | 49284 | 3.000 | 60.873 | 1.745 | 174499 | 4840
Falcon-1024 | | | | | |
keypair | 56 | 3.005 | 53657.536 | 21900.956 | 155603090 | 63511470
sign | 244 | 3.005 | 12315.061 | 24.312 | 35710758 | 69744
verify | 24929 | 3.000 | 120.342 | 2.393 | 346854 | 6744
AWS run under cron control:
Configuration info
==================
Target platform: x86_64-Linux-5.4.0-47-generic
Compiler: gcc (9.3.0)
Compile options: [-Werror;-Wall;-Wextra;-Wpedantic;-Wstrict-prototypes;-Wshadow;-Wformat=2;-Wfloat-equal;-Wwrite-strings;-O3;-fomit-frame-pointer;-fdata-sections;-ffunction-sections;-Wl,--gc-sections;-Wbad-function-cast]
OQS version: 0.4.0
Git commit: 1d08c9d6ab696c9d50e36231447d56ddc05735d6
OpenSSL enabled: Yes (OpenSSL 1.1.1g 21 Apr 2020)
AES: OpenSSL
SHA-2: OpenSSL
SHA-3: OpenSSL
CPU exts active: AES-AVX-AVX2-BMI-BMI2-POPCNT-SSE-SSE2-SSE3
Speed test
==========
Started at 2020-09-26 02:51:11
Operation | Iterations | Total time (s) | Time (us): mean | pop. stdev | CPU cycles: mean | pop. stdev
------------------------------ | ----------:| --------------:| ---------------:| ----------:| -------------------------:| ----------:
DILITHIUM_2 | | | | | |
keypair | 3875 | 3.000 | 774.205 | 7673.649 | 2243411 | 22253375
sign | 716 | 3.000 | 4190.426 | 17596.804 | 12150367 | 51030584
verify | 3578 | 3.000 | 838.465 | 7990.712 | 2429622 | 23172908
DILITHIUM_2 | | | | | |
keypair | 3893 | 3.000 | 770.637 | 7654.745 | 2233039 | 22198558
sign | 712 | 3.001 | 4214.861 | 17634.441 | 12221155 | 51139538
verify | 3752 | 3.000 | 799.576 | 7794.350 | 2248827 | 22221764
DILITHIUM_3 | | | | | |
keypair | 2683 | 3.009 | 1121.493 | 9204.791 | 3155331 | 26244746
sign | 468 | 3.001 | 6412.256 | 21597.235 | 18593514 | 62631559
verify | 2611 | 3.000 | 1149.017 | 9331.701 | 3330303 | 27061729
DILITHIUM_4 | | | | | |
keypair | 1961 | 3.000 | 1529.850 | 10743.928 | 4434616 | 31157112
sign | 529 | 3.001 | 5672.085 | 20304.290 | 16446986 | 58882089
verify | 1930 | 3.000 | 1554.496 | 10828.900 | 4506234 | 31403574
Falcon-512 | | | | | |
keypair | 18 | 3.202 | 177867.500 | 60619.326 | 515813317 | 175792762
sign | 67 | 3.093 | 46157.448 | 43652.909 | 133853713 | 126593069
verify | 6174 | 3.000 | 485.916 | 6085.696 | 1407429 | 17648388
Falcon-1024 | | | | | |
keypair | 8 | 3.406 | 425755.500 | 110722.507 | 1234691953 | 321099133
sign | 31 | 3.095 | 99853.806 | 22401.428 | 289572238 | 64964560
verify | 3095 | 3.000 | 969.308 | 8574.783 | 2809235 | 24866673
The execution was performed in the same AWS cluster (https://us-east-2.console.aws.amazon.com/ecs/home?region=us-east-2#/clusters/oqs-speed/scheduledTasks) with c4.large instances.
Checklist for renaming this project (to "profiling"?)
git remote set-url origin git@github.com:openquantumsafe/profiling.git
(or simply clone anew). The Docker filename is independent and can remain as-is ("oqs-perf"); AWS-deploy naming ("oqs-test") is also independent of this name; the CCI build process of this project also does not seem to be triggered externally (neither in openssl nor liboqs).
@dstebila @xvzcf Please review/amend as you see fit. My check also comprised going through all references to the word "speed" in all OQS subprojects I have checked out. In that process I found some "speed"-related references that make me wonder whether we might want to integrate them here, too (e.g., boringssl speed)?
https://openquantumsafe.org/benchmarking/visualization/speed_kem.html doesn't show marked performance improvements for aarch64 Kyber as should be visible due to open-quantum-safe/liboqs#1117 now being part of profiling image -> config options not properly set?
@Martyrshot: What performance changes would you expect based on your local tests?
See https://openquantumsafe.org/benchmarking/visualization/2024-04-02/speed_kem.json.
I think I fixed this by reordering the Docker cleanup commands in the scripts run by cron jobs on various platforms. We'll see when the tests run next, I guess.
TODO in future profiling updates/fixes: add the trigger scripts to GitHub somewhere so we can have proper version control for them.
Before doing this,
With the switch to oqsprovider+OpenSSL3, handshaking performance in general dropped by 20%-30%. Need to investigate how classic algorithm performance (independent of oqsprovider) and PQ alg performance changes. For this, need to fix #94 .
While hybrid KEM handshakes are recorded as per #86 (see e.g., https://openquantumsafe.org/benchmarking/visualization/2023-01-27/handshakes.json) they are not (yet) displayed as the oldest test run defines the algorithms shown. Thus it takes time for new algorithms to become visible. This PR is to suggest changing this logic (at least for single-day run result display).
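The suggested logic change can be sketched as follows: derive the displayed algorithm list from the union over all collected runs rather than from the oldest run alone, so newly added algorithms (e.g. hybrid KEMs) appear at once. The data shape and function name are assumptions for illustration.

```python
# Sketch of the proposed fix: take the union of algorithm names across all
# collected runs instead of the oldest run's set. Data layout is assumed
# (each run as a dict mapping algorithm name -> measurement).
def algorithms_to_display(runs):
    """Return the sorted union of algorithm names across all runs."""
    shown = set()
    for run in runs:
        shown.update(run.keys())
    return sorted(shown)

old_run = {"kyber512": 1200}                          # predates hybrid KEX
new_run = {"kyber512": 1250, "p256_kyber512": 900}    # includes a hybrid
print(algorithms_to_display([old_run, new_run]))  # hybrid shows up right away
```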
Now that additional AWS credits are available, decide which additional things to profile.
Options: different CPU types (AMD, ARM (which?), ...), new OSs (OSX, Windows, RedHat, ..., some possibly on ARM64), further network simulations, ...?
Input solicited: Additional options? Priorities?
Edit: Instance types: https://aws.amazon.com/ec2/instance-types/
As per the discussion to reduce the number of profiling runs while
keep running on a longer periodic basis (e.g., every few weeks) rather than stopping entirely to avoid code rot
the data collection, deviation checking and presentation logic must handle arbitrary numbers of days without profiling runs. Currently, the profiling results of the last x(=10) calendar days are collected and displayed, and the last y(=5) calendar days are used to check for performance deviations. This logic would not properly handle days/dates without any profiling runs, i.e., longer periods between runs.
Suggestion is to retain the basic logic but permit arbitrary numbers of days without profiling runs. So, in future, the last x complete profiling runs (on all supported platforms, but on arbitrary dates in the past) are collected for presentation and the last y complete profiling runs are used to check performance deviation of any single new run.
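The retention rule proposed above can be sketched as follows. The platform set and data layout are assumptions for illustration, not the scripts' actual structures.

```python
# Sketch of the proposed rule: keep the last x runs that are complete on all
# supported platforms, regardless of how many calendar days lie between them.
PLATFORMS = {"x64", "aarch64", "m1"}   # assumed set of supported platforms

def last_complete_runs(runs_by_date, x):
    """runs_by_date: dict mapping ISO date string -> set of platforms that ran.
    Returns the dates of the last x runs complete on all platforms."""
    complete = [d for d, plats in sorted(runs_by_date.items())
                if PLATFORMS <= plats]
    return complete[-x:]

runs = {
    "2023-01-01": {"x64", "aarch64", "m1"},
    "2023-01-15": {"x64"},                      # incomplete run: skipped
    "2023-02-01": {"x64", "aarch64", "m1"},
}
print(last_complete_runs(runs, 2))  # the two complete runs, gap ignored
```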
Following up on discussions in open-quantum-safe/liboqs#928 :
it would be helpful for the OQS team to clarify what the purpose of the tests is:
- Comparing algorithms vs Comparing algorithm evolution
- Testing algorithm runtime variation for a given message vs testing algorithms runtime over a distribution of messages
My personal initial attempt to answer:
Algorithm evolution may not be as interesting as a comparison across algorithms (and their variants)
Runtime variations for any algorithm (or algorithm variant) with any kind of dependency would be interesting to highlight (possibly as a new set of tests and visualizations).
open-quantum-safe/liboqs#1361 changed the default build library options to run optimized code. Thus, the designation "-ref" for "reference implementation" as being the liboqs default build option is no longer valid. This means "-ref" must now be explicitly built with "-DOQS_DIST_BUILD=OFF".