Collection of cryptographic hash functions written in pure Rust
Both the rust-crypto and digest crates have a Digest trait, and both are listed as being authored by "The Rust-Crypto Project Developers". If I'm defining a new hash algorithm, which trait should I implement?
List of "would be nice to have" hash functions:
The list can be changed based on discussion.
Encryption that is resistant to quantum computers. Microsoft's open source implementation:
https://www.microsoft.com/en-us/research/project/sidh-library/
Instead of this:

```rust
let foo = {
    let mut sh = Sha1::default();
    sh.input(&bar);
    sh.input(&baz);
    sh.input(&buz);
    sh.result()
};
```
It would allow for this:

```rust
let foo = Sha1::builder()
    .input(&bar)
    .input(&baz)
    .input(&buz)
    .result();
```
This could be added entirely in the digest crate.
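To make the shape of the proposal concrete, here is a minimal sketch of such a chaining wrapper. The `Chained` name and the `ToyHash` hasher are stand-ins invented for illustration, not the real `digest` API: the point is just that `input` takes and returns `self` so calls chain, and `result` consumes the chain.

```rust
// Minimal stand-in for a Digest-like hasher so the sketch is self-contained.
#[derive(Default)]
struct ToyHash {
    state: u64,
}

impl ToyHash {
    fn input(&mut self, data: &[u8]) {
        for &b in data {
            self.state = self.state.wrapping_mul(31).wrapping_add(b as u64);
        }
    }
    fn result(self) -> u64 {
        self.state
    }
}

// The builder: owns the hasher and returns `self` from `input`,
// so calls can be chained and `result` consumes the chain.
struct Chained<D>(D);

impl Chained<ToyHash> {
    fn new() -> Self {
        Chained(ToyHash::default())
    }
    fn input(mut self, data: &[u8]) -> Self {
        self.0.input(data);
        self
    }
    fn result(self) -> u64 {
        self.0.result()
    }
}

fn main() {
    // Sequential style.
    let seq = {
        let mut sh = ToyHash::default();
        sh.input(b"bar");
        sh.input(b"baz");
        sh.result()
    };
    // Chained style produces the same digest.
    let chained = Chained::new().input(b"bar").input(b"baz").result();
    assert_eq!(seq, chained);
    println!("ok: {}", chained);
}
```

In the real crate the wrapper would be generic over any `D: Input + FixedOutput`, which is what would let it live entirely in digest.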
In digest/src/digest.rs line 72:

```rust
if bytes_read != buffer.len() {
```

This terminates on a short read, which breaks with an asynchronous Read that may not have the full buffer available at the current time but will have more data later. It should terminate when bytes_read == 0 instead (EOF); see https://doc.rust-lang.org/std/io/trait.Read.html#tymethod.read
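The corrected loop can be sketched as follows, using only `std::io::Read` and a `Cursor` as a stand-in reader; the `consume` helper and its `feed` callback are illustrative names, with `feed` playing the role of `hasher.input`:

```rust
use std::io::{Cursor, Read};

// Keep reading until `read` returns 0 (EOF), instead of stopping on the
// first short read. Short reads are legal at any time per the Read contract.
fn consume<R: Read>(mut reader: R, mut feed: impl FnMut(&[u8])) -> std::io::Result<u64> {
    let mut buffer = [0u8; 8]; // deliberately small to force short reads
    let mut total = 0u64;
    loop {
        let bytes_read = reader.read(&mut buffer)?;
        if bytes_read == 0 {
            break; // EOF: the only reliable termination condition
        }
        feed(&buffer[..bytes_read]);
        total += bytes_read as u64;
    }
    Ok(total)
}

fn main() -> std::io::Result<()> {
    let data = b"some input that does not fit the buffer evenly";
    let mut collected = Vec::new();
    let n = consume(Cursor::new(&data[..]), |chunk| collected.extend_from_slice(chunk))?;
    assert_eq!(n as usize, data.len());
    assert_eq!(collected, data);
    println!("read {} bytes", n);
    Ok(())
}
```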
Hey, I'm very new to Rust so I apologize if this turns out to be a language misunderstanding rather than a package issue...
I'm trying to dynamically create instances of algorithm structs that implement the Digest trait, hold on to them using a Box<Digest>, and interact with them via the Digest interface.
In other words, something like this:

```rust
let hasher = match "one of the algorithms" {
    "sha256" => Box::new(Sha256::new()) as Box<Digest>,
    "sha512" => Box::new(Sha512::new()) as Box<Digest>,
    // ... etc
};
```
I sort of get why that doesn't work (missing associated types), but even if I fill those in like:

```rust
"sha256" => Box::new(Sha256::new()) as Box<Digest<OutputSize = U32, BlockSize = U64>>,
```

I'm still left with an error: the trait `digest::Digest` cannot be made into an object.
After poking around, it seemed like this might be because of the way Digest was implemented? Looking at the docs, it doesn't seem to have any static methods, but maybe it does, or there's something else I'm missing.
Again, sorry to bother you with this!
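The usual workaround for this kind of object-safety error is a companion trait whose output is a heap-allocated `Vec<u8>` rather than an associated-type array. The sketch below is self-contained: the `DynDigest` trait here only mirrors the idea (I believe later versions of the digest crate ship a trait of that name), and `ToyA`/`ToyB` are invented stand-ins for Sha256/Sha512.

```rust
// Object-safe companion trait: no associated types, no static methods.
trait DynDigest {
    fn input(&mut self, data: &[u8]);
    fn result(self: Box<Self>) -> Vec<u8>;
}

#[derive(Default)]
struct ToyA(u8); // stand-in "hash": xor of all bytes
#[derive(Default)]
struct ToyB(u8); // stand-in "hash": wrapping sum, doubled output

impl DynDigest for ToyA {
    fn input(&mut self, data: &[u8]) {
        for &b in data { self.0 ^= b; }
    }
    fn result(self: Box<Self>) -> Vec<u8> { vec![self.0] }
}

impl DynDigest for ToyB {
    fn input(&mut self, data: &[u8]) {
        for &b in data { self.0 = self.0.wrapping_add(b); }
    }
    fn result(self: Box<Self>) -> Vec<u8> { vec![self.0, self.0] }
}

// Runtime selection now works, because the trait is object safe.
fn select(name: &str) -> Box<dyn DynDigest> {
    match name {
        "a" => Box::new(ToyA::default()),
        _ => Box::new(ToyB::default()),
    }
}

fn main() {
    let mut hasher = select("a");
    hasher.input(b"abc");
    let out = hasher.result();
    assert_eq!(out, vec![b'a' ^ b'b' ^ b'c']);
    println!("{:?}", out);
}
```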
Hi,
I need to use BLAKE2 for key stretching to a 64-byte value. Is VariableOutput the way to go, or am I misunderstanding its purpose? Is it effectively making me use BLAKE2b-512?
Thank you for your help!
It's possible to significantly improve the performance of GOST94 with relatively little work.
perf report results:

```text
--98.25%-- gost94::gost94::Gost94State::f::h77646fa46d6b3d1b
    |
    |--45.54%-- gost94::gost94::Gost94State::shuffle::hdd9f6ead0dcb2230
    |    |
    |    |--42.68%-- gost94::gost94::psi::h3fd72e14f2fd3fca
    |
    |--35.82%-- gost94::gost94::encrypt::h32d61af8e91b44b7
    |    |
    |    |--29.97%-- gost94::gost94::g::ha2fe1802e6e6b67f
    |    |
    |    |--26.89%-- gost94::gost94::sbox::hd99689d5581a89bb
    |
    |--8.02%-- gost94::gost94::a::h8a2cac7c5b28e727
    |
    |--3.59%-- gost94::gost94::p::h158268fffc4616a5
    |
    |--1.97%-- gost94::gost94::g::ha2fe1802e6e6b67f
```
sbox can be optimized by using 8-bit S-boxes instead of 4-bit ones. Additionally, you can read this paper (in Russian) about possible optimizations of the block cipher used in GOST94.
psi could be optimized by replacing unnecessary copy operations with a cursor-based approach.
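The cursor idea can be illustrated in isolation. This is not the actual GOST94 psi transform (which also injects a newly computed word each round); it is a toy comparison, under that simplification, of physically rotating a buffer every round versus keeping a logical start offset and indexing modulo the length:

```rust
const N: usize = 8;

// Naive: rotate the buffer left by one element per round (one full pass
// of copies every round).
fn rotate_rounds(mut buf: [u32; N], rounds: usize) -> [u32; N] {
    for _ in 0..rounds {
        let first = buf[0];
        for i in 0..N - 1 {
            buf[i] = buf[i + 1]; // copy every element, every round
        }
        buf[N - 1] = first;
    }
    buf
}

// Cursor-based: never move the data; advance an offset and materialize
// the rotated view only once at the end.
fn cursor_rounds(buf: [u32; N], rounds: usize) -> [u32; N] {
    let start = rounds % N;
    let mut out = [0u32; N];
    for i in 0..N {
        out[i] = buf[(start + i) % N];
    }
    out
}

fn main() {
    let buf = [1, 2, 3, 4, 5, 6, 7, 8];
    assert_eq!(rotate_rounds(buf, 12), cursor_rounds(buf, 12));
    println!("{:?}", cursor_rounds(buf, 12));
}
```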
Hi,
It looks like the performance of blake2 is very similar to that of sha512 (while still faster):
```text
Running target/release/deps/blake2b-2fc6bb02e2d63899

running 6 tests
test bench_16  ... bench:      45 ns/iter (+/- 6)     = 355 MB/s
test bench_1k  ... bench:   2,636 ns/iter (+/- 3)     = 388 MB/s
test bench_256 ... bench:     685 ns/iter (+/- 31)    = 373 MB/s
test bench_64  ... bench:     170 ns/iter (+/- 9)     = 376 MB/s
test bench_64k ... bench: 174,027 ns/iter (+/- 3,512) = 376 MB/s
test bench_8k  ... bench:  21,669 ns/iter (+/- 951)   = 378 MB/s

running 6 tests
test bench_16  ... bench:      60 ns/iter (+/- 1)     = 266 MB/s
test bench_1k  ... bench:   3,092 ns/iter (+/- 101)   = 331 MB/s
test bench_256 ... bench:     779 ns/iter (+/- 15)    = 328 MB/s
test bench_64  ... bench:     204 ns/iter (+/- 9)     = 313 MB/s
test bench_64k ... bench: 197,618 ns/iter (+/- 8,180) = 331 MB/s
test bench_8k  ... bench:  24,689 ns/iter (+/- 896)   = 331 MB/s
```
But https://blake2.net/ shows that performance should be about 3x faster.
Any ideas? It looks like blake2 does not use SIMD; is it not needed? Is there a chance that the blake2 crate here will be better optimized in the future? Or is it just that recent processors already execute sha512 much faster?
Currently only RIPEMD-160 is implemented. It should be relatively easy to implement the 128-, 256- and 320-bit (added in #68) versions of the algorithm based on the 160-bit implementation.
Related links:
The asm feature was added to the md5, sha1, sha2 and whirlpool crates, but the current CI configuration does not test it.
SHA-3's SHAKE mode should provide variable-length output determined at runtime, but it's currently a user-defined type-level numeric. I think the OutputSize type of the Digest trait should be replaced by an Output type to fix this, but I have not looked into doing it.
Hi,
I use sha256 in vagga a lot, and I just noticed that the current implementation of sha256 is much slower (2x-3x) than the old sha2 = 0.1.2.
I do not post benchmarks here because I can't find the repository for the old code. Do you have a link?
Is there any chance of fixing it?
When I compile sha2 version 0.7.1 with the asm feature enabled, I get the following link error:

```text
= note: Undefined symbols for architecture x86_64:
          "_sha512_compress", referenced from:
              sha2_asm::compress512::h2714fa0e6f190002 in libsha2-128ed50e2e0d1cae.rlib(sha2-128ed50e2e0d1cae.sha214.rcgu.o)
        ld: symbol(s) not found for architecture x86_64
        clang: error: linker command failed with exit code 1 (use -v to see invocation)
```

This is on macOS 10.13.4 with Rust 1.26.
To reproduce, run cargo new --bin test_sha2 and then edit the following files:

Cargo.toml:

```toml
[package]
name = "test_sha2"
version = "0.1.0"

[dependencies]
sha2 = { version = "0.7", features = ["asm"] }
```

src/main.rs:

```rust
extern crate sha2;
use sha2::{Sha512Trunc256, Digest};

fn main() {
    let mut hasher = Sha512Trunc256::new();
    let data = b"Hello world!";
    hasher.input(data);
    // `input` can be called repeatedly
    hasher.input("String data".as_bytes());
    // Note that calling `result()` consumes the hasher
    let hash = hasher.result();
    println!("Result: {:x}", hash);
}
```
It would be nice to have a short security summary in each crate's documentation, which would include known weaknesses, applicability, etc.
SHA extensions allow significantly speeding up computations using hardware acceleration, and they will soon be available on stable.
Currently the Digest trait allows writing code that's generic with respect to the choice of hash function. But for an application that requires a certain output size, there's no way to encode that requirement into the type system (e.g., "this function requires a hash function with 512 bits of output").
Perhaps one way to solve this would be to introduce empty Digest256, Digest512, etc. traits, which all inherit from the Digest trait. Implementing, say, Digest512 is then a declaration that the hash function has 512 bits of output, so that users can write functions which are generic over hash functions with a fixed output size.
Is this a good idea, or is there a better way to do this?
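A minimal sketch of the marker-trait idea, with invented stand-ins (`Digest`, `Digest256`, `Toy256`) rather than the real crate types, assuming a 256-bit example:

```rust
// Stand-in for the real Digest trait.
trait Digest {
    fn input(&mut self, data: &[u8]);
    fn result_bytes(&self) -> Vec<u8>;
}

// Empty marker: implementing it declares a 256-bit (32-byte) output.
trait Digest256: Digest {}

#[derive(Default)]
struct Toy256([u8; 32]); // illustrative hasher: xor bytes into a 32-byte state

impl Digest for Toy256 {
    fn input(&mut self, data: &[u8]) {
        for (i, &b) in data.iter().enumerate() {
            self.0[i % 32] ^= b;
        }
    }
    fn result_bytes(&self) -> Vec<u8> {
        self.0.to_vec()
    }
}
impl Digest256 for Toy256 {}

// A consumer can now require a 256-bit hash in the type system.
fn fingerprint<D: Digest256 + Default>(data: &[u8]) -> Vec<u8> {
    let mut h = D::default();
    h.input(data);
    h.result_bytes()
}

fn main() {
    let fp = fingerprint::<Toy256>(b"hello");
    assert_eq!(fp.len(), 32);
    println!("{:?}", fp);
}
```

Note that with the actual crate, a bound like `D: Digest<OutputSize = U32>` on the typenum associated type may already express the same requirement without new traits.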
std::hash::Hasher can be derived for structs and is a standard hashing interface in Rust. The standard interface only allows 64-bit outputs, but there's nothing stopping extra outputs tailored to specific hashes. So for ergonomic purposes, wouldn't it make sense to have an adapter to allow using the Hasher API?
Using either qemu or https://github.com/japaric/trust.
Tested with 0.7.1 (I cannot test with 0.8.0 at the moment).
I assume that this is an error. At least, it gets flagged as "odd" by the Firefox submission mechanism.
Apparently BLAKE2 includes parallel versions of the two hash functions to take advantage of multiple cores:
https://github.com/BLAKE2/BLAKE2/blob/master/sse/blake2sp.c
It would be nice to have these to be able to take advantage of more cores when hashing large files.
The current Digest trait looks like this:

```rust
pub trait Digest: Input + FixedOutput {
    type OutputSize: ArrayLength<u8>;
    type BlockSize: ArrayLength<u8>;

    fn input(&mut self, input: &[u8]);
    fn result(self) -> GenericArray<u8, Self::OutputSize>;
}
```
A common pattern, at least for me, is to take a hash of an entire file. This can be done something like:

```rust
let mut file = File::open(...).unwrap();
let mut hasher = Sha256::default();

// Read blocks of the file into memory and pass them into the hasher.
let mut buffer: [u8; 512] = Default::default();
loop {
    let bytes_read = file.read(&mut buffer)?;
    hasher.input(&buffer[..bytes_read]);
    if bytes_read != buffer.len() {
        break;
    }
}
let hash = hasher.result();
```
I think this pattern may be common enough to justify extending the Digest trait with a read_from method, and providing a default implementation that is used across all implementors:

```rust
pub trait Digest: Input + FixedOutput {
    fn read_from(&mut self, from: &mut std::io::Read) -> Result<(), std::io::Error> {
        // Read blocks into memory until EOF (or an I/O error),
        // and pass them into the `input` function.
    }
}
```
This would replace the middle 8 lines I wrote in the above example with what is, in my opinion, a more ergonomic function call, and it generalizes to any type that implements std::io::Read (e.g. sockets or TCP streams).
Does this seem like a good idea? I'd be happy to implement it if those more invested in RustCrypto than myself approve.
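A self-contained sketch of what the default method body could look like. The `Input` trait and `Sum` hasher here are stand-ins so the example compiles on its own; in the real crate the default method would call `Digest::input`:

```rust
use std::io::{self, Cursor, Read};

trait Input {
    fn input(&mut self, data: &[u8]);

    // Proposed default method: stream any Read source into the hasher.
    fn read_from(&mut self, from: &mut dyn Read) -> io::Result<()> {
        let mut buffer = [0u8; 512];
        loop {
            let bytes_read = from.read(&mut buffer)?;
            if bytes_read == 0 {
                return Ok(()); // terminate on EOF, not on a short read
            }
            self.input(&buffer[..bytes_read]);
        }
    }
}

// Toy hasher to exercise the default method.
#[derive(Default)]
struct Sum(u64);

impl Input for Sum {
    fn input(&mut self, data: &[u8]) {
        for &b in data {
            self.0 = self.0.wrapping_add(b as u64);
        }
    }
}

fn main() -> io::Result<()> {
    let data = vec![1u8; 1000]; // forces multiple reads plus a short final one
    let mut hasher = Sum::default();
    hasher.read_from(&mut Cursor::new(&data))?;
    assert_eq!(hasher.0, 1000);
    println!("sum = {}", hasher.0);
    Ok(())
}
```

Terminating on `bytes_read == 0` (rather than on a short read) is what makes the method safe for sockets and other readers that return partial buffers.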
I wrote a KangarooTwelve implementation. It's slow and has horrible code style, but it passes the tests at least. Licensed CC0, so feel free to copy.
When trying to install tectonic, which depends on this crate:

```text
error: `Self` and associated types in struct expressions and patterns are unstable (see issue #37544)
  --> ~/.cargo/registry/src/github.com-1ecc6299db9ec823/md-5-0.4.3/src/lib.rs:40:9
   |
40 |         Self {
   |         ^^^^
```

Though, strangely, the Travis CI build is passing.
With the stabilization of SIMD intrinsics in Rust 1.27, we can remove the simd feature from the blake2 crate.
This makes it really painful to include one of these types in my own struct, as now I have to implement Debug by hand. If the internals are being hidden for some reason, could the library provide an implementation that prints something like Sha256 { ... }?
One of the nice things about a project like this is that all crates share a common API. It would be nice to describe that API in the project README (and list exceptions, if any).
It looks like some crates want a D: Digest<OutputSize = U64> + Default as an argument for which hash to use, so VarBlake2b is not very useful on its own. E.g. dalek-cryptography/ed25519-dalek#59
I could write my own wrapper, but if I need it, other people will too.
Thanks for the fantastic library! It's allowing me to give Rust a try for a small project.
I'm very new to Rust, so this may be a usage question rather than a feature request: I'm trying to start calculating a hash on one machine, and then continue the calculation on another. (Imagine the machines each have private input that must be added to the hash, for example.)
What's the best way to "serialize" and "deserialize" a Sha256? Does this library offer a safe way, or can anyone recommend an unsafe one?
(My compile target is guaranteed to be the same for all machines: wasm32-unknown-unknown. I'm trying to use wasm-bindgen to provide functionality from this library to some JavaScript.)
Sacundim, in his feedback on reddit, proposed to use the following traits instead of VariableOutput:
- ExtendableOutput for functions like SHAKE from SHA-3, which allow "reading" indefinitely from the result
- VariableOutput for functions like Groestl, which have some limits on the output size and may require the output size to be known at state initialization
Also, @burdges proposed to move BlockSize out of the Input trait into a separate one.
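The proposed split can be sketched with a toy XOF standing in for SHAKE. The trait names mirror the proposal; the counter-based "hash" and `ToyXof`/`ToyReader` types are invented for illustration:

```rust
trait XofReader {
    fn read(&mut self, buf: &mut [u8]);
}

trait ExtendableOutput {
    type Reader: XofReader;
    fn xof_result(self) -> Self::Reader;
}

// Toy XOF: the output stream is just an incrementing counter from the seed.
struct ToyXof {
    seed: u8,
}

struct ToyReader {
    state: u8,
}

impl XofReader for ToyReader {
    // Each call continues the stream; it must NOT restart from the seed.
    fn read(&mut self, buf: &mut [u8]) {
        for b in buf {
            self.state = self.state.wrapping_add(1);
            *b = self.state;
        }
    }
}

impl ExtendableOutput for ToyXof {
    type Reader = ToyReader;
    fn xof_result(self) -> ToyReader {
        ToyReader { state: self.seed }
    }
}

fn main() {
    let xof = ToyXof { seed: 10 };
    let mut reader = xof.xof_result();
    let mut a = [0u8; 4];
    let mut b = [0u8; 4];
    reader.read(&mut a);
    reader.read(&mut b);
    // Successive reads extend the stream rather than repeating it.
    assert_eq!(a, [11, 12, 13, 14]);
    assert_eq!(b, [15, 16, 17, 18]);
    println!("{:?} {:?}", a, b);
}
```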
I'm trying to figure out how to get a digest of a given size, and there are so many abstractions and crates involved, and the source is based on macros that obfuscate what is going on, that it's beyond me at this late hour.
I realize it's a lot of work, and don't want to just complain. I'm just trying to point out the issue. All these efforts are greatly appreciated!
See results of CI testing.
So this is something I'm hoping can be documented somewhere. I'm curious why the hashes derive Copy, because it seems like not the behavior you'd want. If you accidentally forget to pass your data by reference, you will get an automatic copy, which is not something you'd likely want in this case. Typically I've found that if the impl has mut on self, then you don't want Copy.
e.g.

```rust
enum Foo {
    Sha(sha2::Sha256),
    Md5(md5::Md5),
}

struct Bar {
    hash: Foo,
}

impl Bar {
    fn some_data(&mut self, data: &[u8]) {
        match self.hash {
            Foo::Sha(mut hash) => hash.input(data),
            Foo::Md5(mut hash) => hash.input(data),
        }
    }
}
```
What's the expected behavior here?
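For what it's worth, the footgun can be demonstrated in isolation: a `mut hash` binding in a match copies a `Copy` type, so the update is silently lost, while a `ref mut` binding updates in place. `ToyHash` below is an invented stand-in for a `Copy` digest type like `Sha256`:

```rust
#[derive(Clone, Copy, Default)]
struct ToyHash {
    len: usize,
}

impl ToyHash {
    fn input(&mut self, data: &[u8]) {
        self.len += data.len();
    }
}

fn main() {
    let mut by_value = ToyHash::default();
    match by_value {
        // `mut hash` binds by value; because ToyHash is Copy, this
        // mutates a copy and the original is untouched.
        mut hash => hash.input(b"abc"),
    }
    assert_eq!(by_value.len, 0); // unchanged!

    let mut by_ref = ToyHash::default();
    match by_ref {
        // `ref mut` borrows the original, so the update sticks.
        ref mut hash => hash.input(b"abc"),
    }
    assert_eq!(by_ref.len, 3);
    println!("by_value={}, by_ref={}", by_value.len, by_ref.len);
}
```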
With test vectors from:
https://github.com/BLAKE2/BLAKE2
Please provide a constant for the digest length from each hash type, e.g.
https://github.com/mitsuhiko/rust-sha1/blob/9c41f8eac17b64ef4a3d5d483dc658a8b8491a19/src/lib.rs#L24
- asm
- no_std (see #36)

"expected generic_array::GenericArray" in the calls to compress256 and compress512, but slices of length 64/128 are passed to them.
GOSTs are specified in terms of N-bit integers and do not define the byte order to be used. Initially I chose little-endian order, but big-endian would probably have been a better choice. This would allow simplifying the code a bit, and would probably be more compatible with other implementations.
TODO: Look into other implementations.
In line 53 of lib.rs, the function hash() initializes the output array as [0; 16].
I believe this can be optimized to unsafe { ::std::mem::uninitialized() }, as all the bytes in the output array are overwritten.
When computing many hashes per second in an environment where hash rate matters (e.g. authenticating TCP segments with the MD5 signature option), this is just another little win.
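On current Rust, `mem::uninitialized()` is deprecated (and undefined behavior for most types); the sanctioned form of this optimization is `MaybeUninit`. A sketch, assuming every byte is overwritten before use; `hash_stub` and its byte-filling logic are placeholders for the real finalization code:

```rust
use std::mem::MaybeUninit;

fn hash_stub(input: &[u8]) -> [u8; 16] {
    // Skip zero-initialization; the buffer starts uninitialized.
    let mut out = MaybeUninit::<[u8; 16]>::uninit();
    let ptr = out.as_mut_ptr() as *mut u8;
    for i in 0..16 {
        // Placeholder for the real finalization that fills the whole array.
        let byte = input.get(i).copied().unwrap_or(i as u8);
        unsafe { ptr.add(i).write(byte) };
    }
    // Sound only because all 16 bytes were just written.
    unsafe { out.assume_init() }
}

fn main() {
    let out = hash_stub(b"abcd");
    assert_eq!(&out[..4], b"abcd");
    assert_eq!(out[15], 15);
    println!("{:?}", out);
}
```

Whether this wins anything in practice depends on the optimizer; for a 16-byte array LLVM often eliminates the zeroing anyway, so measuring first is advisable.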
Claimed by @gsgsingh93
Moved from #1:
I'm having some trouble with the types for Grostl. The issue is that the BlockSize depends on the OutputSize, so I can't figure out how to define the impl for Digest. If the output size is 256 bits or less, then 512 is used as the block size; if the output size is greater than 256 bits, then 1024 is used as the block size. Here's what I have right now, with the BlockSize incorrectly hardcoded as U512:
```rust
extern crate digest;
extern crate generic_array;

use std::marker::PhantomData;
use digest::Digest;
use generic_array::{ArrayLength, GenericArray};
use generic_array::typenum::U512;

// TODO: This could also be U1024
type BlockSize = U512;

pub struct Grostl<OutputSize: ArrayLength<u8>> {
    phantom: PhantomData<OutputSize>,
}

impl<OutputSize: ArrayLength<u8>> Grostl<OutputSize> {
    fn new() -> Grostl<OutputSize> {
        Grostl { phantom: PhantomData }
    }
}

impl<OutputSize: ArrayLength<u8>> Default for Grostl<OutputSize> {
    fn default() -> Self { Self::new() }
}

impl<OutputSize: ArrayLength<u8>> Digest for Grostl<OutputSize> {
    type OutputSize = OutputSize;
    type BlockSize = BlockSize;

    fn input(&mut self, input: &[u8]) {
    }

    fn result(mut self) -> GenericArray<u8, Self::OutputSize> {
        GenericArray::default()
    }
}
```
Note that OutputSize is parameterized here because Grostl can output hashes between 1 and 64 bytes.
EDIT: I guess if there's no fancy solution with the generics that could get this to work, I could always just override block_bytes and block_bits in the trait, set a dummy BlockSize, and not use it.
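One type-level trick that can express "BlockSize depends on OutputSize" is to map a type-level boolean ("is the output at most 256 bits?") to a size type via a selector trait. The sketch below hand-rolls `B0`/`B1`/`U512`/`U1024` stand-ins so it is self-contained; with the real typenum crate, the boolean would come from something like `IsLessOrEqual`, and the same selector-trait pattern would apply:

```rust
struct B0; // type-level false
struct B1; // type-level true
struct U512;
struct U1024;

trait Len {
    const BYTES: usize;
}
impl Len for U512 { const BYTES: usize = 64; }
impl Len for U1024 { const BYTES: usize = 128; }

// The select: a trait implemented on the boolean chooses the block size.
trait BlockSizeFor {
    type BlockSize: Len;
}
impl BlockSizeFor for B1 { type BlockSize = U512; }  // output <= 256 bits
impl BlockSizeFor for B0 { type BlockSize = U1024; } // output > 256 bits

// The hash state is generic over the boolean; in real code it would be
// generic over OutputSize, with the boolean derived from it.
struct Grostl<Small: BlockSizeFor> {
    _marker: std::marker::PhantomData<Small>,
}

impl<Small: BlockSizeFor> Grostl<Small> {
    fn block_bytes() -> usize {
        <Small::BlockSize as Len>::BYTES
    }
}

fn main() {
    assert_eq!(Grostl::<B1>::block_bytes(), 64);
    assert_eq!(Grostl::<B0>::block_bytes(), 128);
    println!("ok");
}
```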
This is a violation of semver, the error being:

```text
error[E0599]: no method named `variable_result` found for type `blake2::Blake2b` in the current scope
   --> src/caps.rs:180:35
    |
180 |         let hash = hasher.variable_result(&mut buf).unwrap();
    |                           ^^^^^^^^^^^^^^^
    |
    = note: the method `variable_result` exists but the following trait bounds were not satisfied:
            `blake2::Blake2b : digest::ExtendableOutput`
            `&blake2::Blake2b : digest::ExtendableOutput`
            `&mut blake2::Blake2b : digest::ExtendableOutput`
```
The md5 documentation link points to the md5 crate instead of the md-5 crate documentation.
The current implementation of Groestl is quite slow (<1 MB/s) and can be significantly improved.
Various performance-improvement techniques can be found in the "Grøstl Implementation Guide". A list of the fastest implementations can be found here.
The std::hash::Hasher trait is not implemented for these digests. The trait is essentially equivalent to the digest::Input and digest::FixedOutput traits. Implementing it would be useful for me, because I'd like to have my std::hash::Hash implementation be usable with a digest::Digest.
Compiling sha-1 with the asm feature enabled results in:

```text
error[E0308]: mismatched types
  --> /Users/dignifiedquire/.cargo/registry/src/github.com-1ecc6299db9ec823/sha-1-0.8.0/src/lib.rs:80:54
   |
80 |         self.buffer.input(input, |d| compress(state, d));
   |                                                      ^ expected array of 64 elements, found struct `block_buffer::generic_array::GenericArray`
   |
   = note: expected type `&[u8; 64]`
              found type `&block_buffer::generic_array::GenericArray<u8, block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UTerm, block_buffer::generic_array::typenum::B1>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>>`

error[E0308]: mismatched types
  --> /Users/dignifiedquire/.cargo/registry/src/github.com-1ecc6299db9ec823/sha-1-0.8.0/src/lib.rs:91:71
   |
91 |         self.buffer.len64_padding::<BE, _>(l, |d| compress(state, d));
   |                                                                    ^ expected array of 64 elements, found struct `block_buffer::generic_array::GenericArray`
   |
   = note: expected type `&[u8; 64]`
              found type `&block_buffer::generic_array::GenericArray<u8, block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UTerm, block_buffer::generic_array::typenum::B1>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>>`

error: aborting due to 2 previous errors
```

It works fine without it.
Hi,
Any library which uses the constant_time_eq crate is not no_std. The source does not have a #![no_std] annotation.
SHA-3 is not exactly Keccak: it is based on it, but uses a slightly different padding function.
Maybe this should be corrected in the README, but of course it's entirely your decision, because the difference isn't that big.
Hi,
Am I using the XofReader (Sha3XofReader) incorrectly? I was assuming that each call to XofReader::read(...) would extend the previous values.
For example, the following:

```rust
extern crate digest;
extern crate sha3;

use digest::{Input, ExtendableOutput, XofReader};
use sha3::Shake256;

fn main() {
    let mut hasher = Shake256::default();
    hasher.process(b"some nice randomness here");
    let mut xof = hasher.xof_result();
    let mut buf = [0; 4];
    for _ in 0..5 {
        xof.read(&mut buf);
        println!("{:?}", buf);
    }
}
```
repeatedly returns the same values:

```text
[27, 145, 10, 182]
[27, 145, 10, 182]
[27, 145, 10, 182]
[27, 145, 10, 182]
[27, 145, 10, 182]
```
(Sorry if this is better placed in https://github.com/RustCrypto/traits, I figured it's probably an implementation bug, unless I've done something stupid.)
Currently most of the crates require Rust 1.16 due to the use of Self in structs; with small changes they can be made compatible with at least 1.13. It would be good to explicitly show those requirements in the algorithms table.
Either way, it's worth adding relevant tests to Travis CI, so this table would stay updated.