Collection of cryptographic hash functions written in pure Rust
Both the rust-crypto and digest crates have a Digest trait, and both are listed as being authored by "The Rust-Crypto Project Developers". If I'm defining a new hash algorithm, which trait should I implement?
List of "would be nice to have" hash functions:
The list can be changed based on discussion.
Encryption that is resistant to quantum computers. Microsoft's open source implementation:
https://www.microsoft.com/en-us/research/project/sidh-library/
Instead of this:

```rust
let foo = {
    let mut sh = Sha1::default();
    sh.input(&bar);
    sh.input(&baz);
    sh.input(&buz);
    sh.result()
};
```
It would allow for this:

```rust
let foo = Sha1::builder()
    .input(&bar)
    .input(&baz)
    .input(&buz)
    .result();
```
This could be added entirely in the digest crate.
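To make the shape of the proposal concrete, here is a minimal sketch of such a chaining wrapper. The `Chained` name and the `ToyHash` hasher are stand-ins invented for illustration, not the real `digest` API: the point is just that `input` takes and returns `self` so calls chain, and `result` consumes the chain.

```rust
// Minimal stand-in for a Digest-like hasher so the sketch is self-contained.
#[derive(Default)]
struct ToyHash {
    state: u64,
}

impl ToyHash {
    fn input(&mut self, data: &[u8]) {
        for &b in data {
            self.state = self.state.wrapping_mul(31).wrapping_add(b as u64);
        }
    }
    fn result(self) -> u64 {
        self.state
    }
}

// The builder: owns the hasher and returns `self` from `input`,
// so calls can be chained and `result` consumes the chain.
struct Chained<D>(D);

impl Chained<ToyHash> {
    fn new() -> Self {
        Chained(ToyHash::default())
    }
    fn input(mut self, data: &[u8]) -> Self {
        self.0.input(data);
        self
    }
    fn result(self) -> u64 {
        self.0.result()
    }
}

fn main() {
    // Sequential style.
    let seq = {
        let mut sh = ToyHash::default();
        sh.input(b"bar");
        sh.input(b"baz");
        sh.result()
    };
    // Chained style produces the same digest.
    let chained = Chained::new().input(b"bar").input(b"baz").result();
    assert_eq!(seq, chained);
    println!("ok: {}", chained);
}
```

In the real crate the wrapper would be generic over any `D: Input + FixedOutput`, which is what would let it live entirely in digest.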
In digest/src/digest.rs line 72:

```rust
if bytes_read != buffer.len() {
```

This terminates on a short read, which breaks with an asynchronous Read that may not have the full buffer available at the current time but will have more data later. It should terminate when bytes_read == 0 instead (EOF); see https://doc.rust-lang.org/std/io/trait.Read.html#tymethod.read
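The corrected loop can be sketched as follows, using only `std::io::Read` and a `Cursor` as a stand-in reader; the `consume` helper and its `feed` callback are illustrative names, with `feed` playing the role of `hasher.input`:

```rust
use std::io::{Cursor, Read};

// Keep reading until `read` returns 0 (EOF), instead of stopping on the
// first short read. Short reads are legal at any time per the Read contract.
fn consume<R: Read>(mut reader: R, mut feed: impl FnMut(&[u8])) -> std::io::Result<u64> {
    let mut buffer = [0u8; 8]; // deliberately small to force short reads
    let mut total = 0u64;
    loop {
        let bytes_read = reader.read(&mut buffer)?;
        if bytes_read == 0 {
            break; // EOF: the only reliable termination condition
        }
        feed(&buffer[..bytes_read]);
        total += bytes_read as u64;
    }
    Ok(total)
}

fn main() -> std::io::Result<()> {
    let data = b"some input that does not fit the buffer evenly";
    let mut collected = Vec::new();
    let n = consume(Cursor::new(&data[..]), |chunk| collected.extend_from_slice(chunk))?;
    assert_eq!(n as usize, data.len());
    assert_eq!(collected, data);
    println!("read {} bytes", n);
    Ok(())
}
```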
Hey, I'm very new to Rust so I apologize if this turns out to be a language misunderstanding rather than a package issue...
I'm trying to dynamically create instances of algorithm structs that implement the Digest trait, hold on to them using a Box<Digest>, and interact with them via the Digest interface.
In other words, something like this:

```rust
let hasher = match "one of the algorithms" {
    "sha256" => Box::new(Sha256::new()) as Box<Digest>,
    "sha512" => Box::new(Sha512::new()) as Box<Digest>,
    // ... etc
};
```
I sort of get why that doesn't work (missing associated types), but even if I fill those in like:

```rust
"sha256" => Box::new(Sha256::new()) as Box<Digest<OutputSize = U32, BlockSize = U64>>,
```

I'm still left with an error: the trait `digest::Digest` cannot be made into an object.
After poking around, it seemed like this might be because of the way Digest was implemented? Looking at the docs, it doesn't seem to have any static methods, but maybe it does, or there's something else I'm missing.
Again, sorry to bother you with this!
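The usual workaround for this kind of object-safety error is a companion trait whose output is a heap-allocated `Vec<u8>` rather than an associated-type array. The sketch below is self-contained: the `DynDigest` trait here only mirrors the idea (I believe later versions of the digest crate ship a trait of that name), and `ToyA`/`ToyB` are invented stand-ins for Sha256/Sha512.

```rust
// Object-safe companion trait: no associated types, no static methods.
trait DynDigest {
    fn input(&mut self, data: &[u8]);
    fn result(self: Box<Self>) -> Vec<u8>;
}

#[derive(Default)]
struct ToyA(u8); // stand-in "hash": xor of all bytes
#[derive(Default)]
struct ToyB(u8); // stand-in "hash": wrapping sum, doubled output

impl DynDigest for ToyA {
    fn input(&mut self, data: &[u8]) {
        for &b in data { self.0 ^= b; }
    }
    fn result(self: Box<Self>) -> Vec<u8> { vec![self.0] }
}

impl DynDigest for ToyB {
    fn input(&mut self, data: &[u8]) {
        for &b in data { self.0 = self.0.wrapping_add(b); }
    }
    fn result(self: Box<Self>) -> Vec<u8> { vec![self.0, self.0] }
}

// Runtime selection now works, because the trait is object safe.
fn select(name: &str) -> Box<dyn DynDigest> {
    match name {
        "a" => Box::new(ToyA::default()),
        _ => Box::new(ToyB::default()),
    }
}

fn main() {
    let mut hasher = select("a");
    hasher.input(b"abc");
    let out = hasher.result();
    assert_eq!(out, vec![b'a' ^ b'b' ^ b'c']);
    println!("{:?}", out);
}
```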
Hi,
I need to use BLAKE2 for key stretching to a 64-byte value. Is VariableOutput the way to go, or am I misunderstanding its purpose? Is it effectively making me use BLAKE2b-512?
Thank you for your help!
It's possible to significantly improve the performance of GOST94 with relatively little work.
perf report results:

```text
--98.25%-- gost94::gost94::Gost94State::f::h77646fa46d6b3d1b
    |
    |--45.54%-- gost94::gost94::Gost94State::shuffle::hdd9f6ead0dcb2230
    |    |
    |    |--42.68%-- gost94::gost94::psi::h3fd72e14f2fd3fca
    |
    |--35.82%-- gost94::gost94::encrypt::h32d61af8e91b44b7
    |    |
    |    |--29.97%-- gost94::gost94::g::ha2fe1802e6e6b67f
    |    |
    |    |--26.89%-- gost94::gost94::sbox::hd99689d5581a89bb
    |
    |--8.02%-- gost94::gost94::a::h8a2cac7c5b28e727
    |
    |--3.59%-- gost94::gost94::p::h158268fffc4616a5
    |
    |--1.97%-- gost94::gost94::g::ha2fe1802e6e6b67f
```
sbox can be optimized by using 8-bit S-boxes instead of 4-bit ones. Additionally, you can read this paper (in Russian) about possible optimizations of the block cipher used in GOST94.
psi could be optimized by replacing unnecessary copy operations with a cursor-based approach.
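The cursor idea can be illustrated in isolation. This is not the actual GOST94 psi transform (which also injects a newly computed word each round); it is a toy comparison, under that simplification, of physically rotating a buffer every round versus keeping a logical start offset and indexing modulo the length:

```rust
const N: usize = 8;

// Naive: rotate the buffer left by one element per round (one full pass
// of copies every round).
fn rotate_rounds(mut buf: [u32; N], rounds: usize) -> [u32; N] {
    for _ in 0..rounds {
        let first = buf[0];
        for i in 0..N - 1 {
            buf[i] = buf[i + 1]; // copy every element, every round
        }
        buf[N - 1] = first;
    }
    buf
}

// Cursor-based: never move the data; advance an offset and materialize
// the rotated view only once at the end.
fn cursor_rounds(buf: [u32; N], rounds: usize) -> [u32; N] {
    let start = rounds % N;
    let mut out = [0u32; N];
    for i in 0..N {
        out[i] = buf[(start + i) % N];
    }
    out
}

fn main() {
    let buf = [1, 2, 3, 4, 5, 6, 7, 8];
    assert_eq!(rotate_rounds(buf, 12), cursor_rounds(buf, 12));
    println!("{:?}", cursor_rounds(buf, 12));
}
```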
Hi,
It looks like the performance of blake2 is very similar to that of sha512 (while still faster):
```text
Running target/release/deps/blake2b-2fc6bb02e2d63899

running 6 tests
test bench_16  ... bench:      45 ns/iter (+/- 6)     = 355 MB/s
test bench_1k  ... bench:   2,636 ns/iter (+/- 3)     = 388 MB/s
test bench_256 ... bench:     685 ns/iter (+/- 31)    = 373 MB/s
test bench_64  ... bench:     170 ns/iter (+/- 9)     = 376 MB/s
test bench_64k ... bench: 174,027 ns/iter (+/- 3,512) = 376 MB/s
test bench_8k  ... bench:  21,669 ns/iter (+/- 951)   = 378 MB/s

running 6 tests
test bench_16  ... bench:      60 ns/iter (+/- 1)     = 266 MB/s
test bench_1k  ... bench:   3,092 ns/iter (+/- 101)   = 331 MB/s
test bench_256 ... bench:     779 ns/iter (+/- 15)    = 328 MB/s
test bench_64  ... bench:     204 ns/iter (+/- 9)     = 313 MB/s
test bench_64k ... bench: 197,618 ns/iter (+/- 8,180) = 331 MB/s
test bench_8k  ... bench:  24,689 ns/iter (+/- 896)   = 331 MB/s
```
But https://blake2.net/ shows that performance should be about 3x faster.
Any ideas? It looks like blake2 does not use SIMD; is it not needed? Is there a chance that the blake2 crate here will be better optimized in the future? Or is it just that recent processors already execute sha512 much faster?
Currently only RIPEMD-160 is implemented. It should be relatively easy to implement the 128-, 256- and 320-bit (added in #68) versions of the algorithm based on the 160-bit implementation.
Related links:
The asm feature was added to the md5, sha1, sha2 and whirlpool crates, but the current CI configuration does not test it.
SHA-3's SHAKE mode should provide variable-length output determined at runtime, but it's currently a user-defined type-level numeric. I think the OutputSize type of the Digest trait should be replaced by an Output type to fix this, but I have not looked into doing it.
Hi,
I use sha256 in vagga a lot, and I just noticed that the current implementation of sha256 is much slower (2x-3x) than the old sha2 = 0.1.2.
I do not post benchmarks here because I can't find the repository for the old code. Do you have a link?
Is there any chance of fixing it?
When I compile sha2 version 0.7.1 with the asm feature enabled, I get the following link error:

```text
= note: Undefined symbols for architecture x86_64:
          "_sha512_compress", referenced from:
              sha2_asm::compress512::h2714fa0e6f190002 in libsha2-128ed50e2e0d1cae.rlib(sha2-128ed50e2e0d1cae.sha214.rcgu.o)
        ld: symbol(s) not found for architecture x86_64
        clang: error: linker command failed with exit code 1 (use -v to see invocation)
```

This is on macOS 10.13.4 with Rust 1.26.
To reproduce, run cargo new --bin test_sha2 and then edit the following files:

Cargo.toml:

```toml
[package]
name = "test_sha2"
version = "0.1.0"

[dependencies]
sha2 = { version = "0.7", features = ["asm"] }
```

src/main.rs:

```rust
extern crate sha2;
use sha2::{Sha512Trunc256, Digest};

fn main() {
    let mut hasher = Sha512Trunc256::new();
    let data = b"Hello world!";
    hasher.input(data);
    // `input` can be called repeatedly
    hasher.input("String data".as_bytes());
    // Note that calling `result()` consumes the hasher
    let hash = hasher.result();
    println!("Result: {:x}", hash);
}
```
It would be nice to have a short security summary in each crate's documentation, which would include known weaknesses, applicability, etc.
SHA extensions allow significantly speeding up computations using hardware acceleration, and they will soon be available on stable.
Currently the Digest trait allows writing code that's generic with respect to the choice of hash function. But for an application that requires a certain output size, there's no way to encode that requirement into the type system (e.g., "this function requires a hash function with 512 bits of output").
Perhaps one way to solve this would be to introduce empty Digest256, Digest512, etc. traits, which all inherit from the Digest trait. Implementing, say, Digest512 is then a declaration that the hash function has 512 bits of output, so that users can write functions which are generic over hash functions with a fixed output size.
Is this a good idea, or is there a better way to do this?
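A minimal sketch of the marker-trait idea, with invented stand-ins (`Digest`, `Digest256`, `Toy256`) rather than the real crate types, assuming a 256-bit example:

```rust
// Stand-in for the real Digest trait.
trait Digest {
    fn input(&mut self, data: &[u8]);
    fn result_bytes(&self) -> Vec<u8>;
}

// Empty marker: implementing it declares a 256-bit (32-byte) output.
trait Digest256: Digest {}

#[derive(Default)]
struct Toy256([u8; 32]); // illustrative hasher: xor bytes into a 32-byte state

impl Digest for Toy256 {
    fn input(&mut self, data: &[u8]) {
        for (i, &b) in data.iter().enumerate() {
            self.0[i % 32] ^= b;
        }
    }
    fn result_bytes(&self) -> Vec<u8> {
        self.0.to_vec()
    }
}
impl Digest256 for Toy256 {}

// A consumer can now require a 256-bit hash in the type system.
fn fingerprint<D: Digest256 + Default>(data: &[u8]) -> Vec<u8> {
    let mut h = D::default();
    h.input(data);
    h.result_bytes()
}

fn main() {
    let fp = fingerprint::<Toy256>(b"hello");
    assert_eq!(fp.len(), 32);
    println!("{:?}", fp);
}
```

Note that with the actual crate, a bound like `D: Digest<OutputSize = U32>` on the typenum associated type may already express the same requirement without new traits.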
std::hash::Hasher can be derived for structs and is a standard hashing interface in Rust. The standard interface only allows 64-bit outputs, but there's nothing stopping extra outputs tailored to specific hashes. So for ergonomic purposes, wouldn't it make sense to have an adapter to allow using the Hasher API?
Using either qemu or https://github.com/japaric/trust.
Tested with 0.7.1 (I cannot test with 0.8.0 at the moment).
I assume that this is an error. At least, it gets flagged as "odd" by the Firefox submission mechanism.
Apparently BLAKE2 includes parallel versions of the two hash functions to take advantage of multiple cores:
https://github.com/BLAKE2/BLAKE2/blob/master/sse/blake2sp.c
It would be nice to have these to be able to take advantage of more cores when hashing large files.
The current Digest trait looks like this:

```rust
pub trait Digest: Input + FixedOutput {
    type OutputSize: ArrayLength<u8>;
    type BlockSize: ArrayLength<u8>;

    fn input(&mut self, input: &[u8]);
    fn result(self) -> GenericArray<u8, Self::OutputSize>;
}
```
A common pattern, at least for me, is to take a hash of an entire file. This can be done something like:

```rust
let mut file = File::open(...).unwrap();
let mut hasher = Sha256::default();

// Read blocks of the file into memory and pass them into the hasher.
let mut buffer: [u8; 512] = Default::default();
loop {
    let bytes_read = file.read(&mut buffer)?;
    hasher.input(&buffer[..bytes_read]);
    if bytes_read != buffer.len() {
        break;
    }
}
let hash = hasher.result();
```
I think this pattern may be common enough to justify extending the Digest trait with a read_from method, and providing a default implementation that is used across all implementors:

```rust
pub trait Digest: Input + FixedOutput {
    fn read_from(&mut self, from: &mut std::io::Read) -> Result<(), std::io::Error> {
        // Read blocks into memory until EOF (or an I/O error),
        // and pass them into the `input` function.
    }
}
```
This would replace the middle 8 lines I wrote in the above example with what is, in my opinion, a more ergonomic function call, and it generalizes to any type that implements std::io::Read (e.g. sockets or TCP streams).
Does this seem like a good idea? I'd be happy to implement it if those more invested in RustCrypto than myself approve.
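A self-contained sketch of what the default method body could look like. The `Input` trait and `Sum` hasher here are stand-ins so the example compiles on its own; in the real crate the default method would call `Digest::input`:

```rust
use std::io::{self, Cursor, Read};

trait Input {
    fn input(&mut self, data: &[u8]);

    // Proposed default method: stream any Read source into the hasher.
    fn read_from(&mut self, from: &mut dyn Read) -> io::Result<()> {
        let mut buffer = [0u8; 512];
        loop {
            let bytes_read = from.read(&mut buffer)?;
            if bytes_read == 0 {
                return Ok(()); // terminate on EOF, not on a short read
            }
            self.input(&buffer[..bytes_read]);
        }
    }
}

// Toy hasher to exercise the default method.
#[derive(Default)]
struct Sum(u64);

impl Input for Sum {
    fn input(&mut self, data: &[u8]) {
        for &b in data {
            self.0 = self.0.wrapping_add(b as u64);
        }
    }
}

fn main() -> io::Result<()> {
    let data = vec![1u8; 1000]; // forces multiple reads plus a short final one
    let mut hasher = Sum::default();
    hasher.read_from(&mut Cursor::new(&data))?;
    assert_eq!(hasher.0, 1000);
    println!("sum = {}", hasher.0);
    Ok(())
}
```

Terminating on `bytes_read == 0` (rather than on a short read) is what makes the method safe for sockets and other readers that return partial buffers.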
I wrote a KangarooTwelve implementation. It's slow and has horrible code style, but it passes the tests at least. Licensed CC0, so feel free to copy.
When trying to install tectonic, which depends on this crate:

```text
error: `Self` and associated types in struct expressions and patterns are unstable (see issue #37544)
  --> ~/.cargo/registry/src/github.com-1ecc6299db9ec823/md-5-0.4.3/src/lib.rs:40:9
   |
40 |         Self {
   |         ^^^^
```

Though, strangely, the Travis CI build is passing.
With the stabilization of SIMD intrinsics in Rust 1.27, we can remove the simd feature from the blake2 crate.
This makes it really painful to include one of these types in my own struct, as now I have to implement Debug by hand. If the internals are being hidden for some reason, could the library provide an implementation that prints something like Sha256 { ... }?
One of the nice things about a project like this is that all crates share a common API. It would be nice to describe that API in the project README (and list exceptions, if any).
It looks like some crates want a D: Digest<OutputSize = U64> + Default as an argument for which hash to use, so VarBlake2b is not very useful on its own. E.g. dalek-cryptography/ed25519-dalek#59
I could write my own wrapper, but if I need it, other people will too.
Thanks for the fantastic library! It's allowing me to give Rust a try for a small project.
I'm very new to Rust, so this may be a usage question rather than a feature request: I'm trying to start calculating a hash on one machine, and then continue the calculation on another. (Imagine the machines each have private input that must be added to the hash, for example.)
What's the best way to "serialize" and "deserialize" a Sha256? Does this library offer a safe way, or can anyone recommend an unsafe one?
(My compile target is guaranteed to be the same for all machines: wasm32-unknown-unknown. I'm trying to use wasm-bindgen to provide functionality from this library to some JavaScript.)
Sacundim, in his feedback on reddit, proposed to use the following traits instead of VariableOutput:
- ExtendableOutput for functions like SHAKE from SHA-3, which allow "reading" indefinitely from the result
- VariableOutput for functions like Groestl, which have some limits on the output size and may require the output size to be known at state initialization
Also, @burdges proposed to move BlockSize out of the Input trait into a separate one.
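The proposed split can be sketched with a toy XOF standing in for SHAKE. The trait names mirror the proposal; the counter-based "hash" and `ToyXof`/`ToyReader` types are invented for illustration:

```rust
trait XofReader {
    fn read(&mut self, buf: &mut [u8]);
}

trait ExtendableOutput {
    type Reader: XofReader;
    fn xof_result(self) -> Self::Reader;
}

// Toy XOF: the output stream is just an incrementing counter from the seed.
struct ToyXof {
    seed: u8,
}

struct ToyReader {
    state: u8,
}

impl XofReader for ToyReader {
    // Each call continues the stream; it must NOT restart from the seed.
    fn read(&mut self, buf: &mut [u8]) {
        for b in buf {
            self.state = self.state.wrapping_add(1);
            *b = self.state;
        }
    }
}

impl ExtendableOutput for ToyXof {
    type Reader = ToyReader;
    fn xof_result(self) -> ToyReader {
        ToyReader { state: self.seed }
    }
}

fn main() {
    let xof = ToyXof { seed: 10 };
    let mut reader = xof.xof_result();
    let mut a = [0u8; 4];
    let mut b = [0u8; 4];
    reader.read(&mut a);
    reader.read(&mut b);
    // Successive reads extend the stream rather than repeating it.
    assert_eq!(a, [11, 12, 13, 14]);
    assert_eq!(b, [15, 16, 17, 18]);
    println!("{:?} {:?}", a, b);
}
```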
I'm trying to figure out how to get a digest of a given size, and there are so many abstractions and crates involved, and the source is based on macros that obfuscate what is going on, that it's beyond me at this late hour.
I realize it's a lot of work, and don't want to just complain. I'm just trying to point out the issue. All these efforts are greatly appreciated!
See results of CI testing.
So this is something I'm hoping can be documented somewhere. I'm curious why the hashes derive Copy, because it seems like not the behavior you'd want. If you accidentally forget to pass your data by reference, you will get an automatic copy, which is not something you'd likely want in this case. Typically I've found that if the impl has mut on self, then you don't want Copy.
e.g.

```rust
enum Foo {
    Sha(sha2::Sha256),
    Md5(md5::Md5),
}

struct Bar {
    hash: Foo,
}

impl Bar {
    fn some_data(&mut self, data: &[u8]) {
        match self.hash {
            Foo::Sha(mut hash) => hash.input(data),
            Foo::Md5(mut hash) => hash.input(data),
        }
    }
}
```
What's the expected behavior here?
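For what it's worth, the footgun can be demonstrated in isolation: a `mut hash` binding in a match copies a `Copy` type, so the update is silently lost, while a `ref mut` binding updates in place. `ToyHash` below is an invented stand-in for a `Copy` digest type like `Sha256`:

```rust
#[derive(Clone, Copy, Default)]
struct ToyHash {
    len: usize,
}

impl ToyHash {
    fn input(&mut self, data: &[u8]) {
        self.len += data.len();
    }
}

fn main() {
    let mut by_value = ToyHash::default();
    match by_value {
        // `mut hash` binds by value; because ToyHash is Copy, this
        // mutates a copy and the original is untouched.
        mut hash => hash.input(b"abc"),
    }
    assert_eq!(by_value.len, 0); // unchanged!

    let mut by_ref = ToyHash::default();
    match by_ref {
        // `ref mut` borrows the original, so the update sticks.
        ref mut hash => hash.input(b"abc"),
    }
    assert_eq!(by_ref.len, 3);
    println!("by_value={}, by_ref={}", by_value.len, by_ref.len);
}
```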
With test vectors from:
https://github.com/BLAKE2/BLAKE2
Please provide a constant for the digest length from each hash type, e.g.
https://github.com/mitsuhiko/rust-sha1/blob/9c41f8eac17b64ef4a3d5d483dc658a8b8491a19/src/lib.rs#L24
- asm
- no_std (see #36)

"expected generic_array::GenericArray" in the calls to compress256 and compress512, but slices of length 64/128 are passed to them.
GOSTs are specified in terms of N-bit integers and do not define the byte order to be used. Initially I chose little-endian order, but big-endian would probably have been a better choice. This would allow simplifying the code a bit, and would probably be more compatible with other implementations.
TODO: Look into other implementations.
In line 53 of lib.rs, the function hash() initializes the output array as [0; 16].
I believe this can be optimized to unsafe { ::std::mem::uninitialized() }, as all the bytes in the output array are overwritten.
When computing many hashes per second in an environment where hash rate matters (e.g. authenticating TCP segments with the MD5 signature option), this is just another little win.
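On current Rust, `mem::uninitialized()` is deprecated (and undefined behavior for most types); the sanctioned form of this optimization is `MaybeUninit`. A sketch, assuming every byte is overwritten before use; `hash_stub` and its byte-filling logic are placeholders for the real finalization code:

```rust
use std::mem::MaybeUninit;

fn hash_stub(input: &[u8]) -> [u8; 16] {
    // Skip zero-initialization; the buffer starts uninitialized.
    let mut out = MaybeUninit::<[u8; 16]>::uninit();
    let ptr = out.as_mut_ptr() as *mut u8;
    for i in 0..16 {
        // Placeholder for the real finalization that fills the whole array.
        let byte = input.get(i).copied().unwrap_or(i as u8);
        unsafe { ptr.add(i).write(byte) };
    }
    // Sound only because all 16 bytes were just written.
    unsafe { out.assume_init() }
}

fn main() {
    let out = hash_stub(b"abcd");
    assert_eq!(&out[..4], b"abcd");
    assert_eq!(out[15], 15);
    println!("{:?}", out);
}
```

Whether this wins anything in practice depends on the optimizer; for a 16-byte array LLVM often eliminates the zeroing anyway, so measuring first is advisable.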
Claimed by @gsgsingh93
Moved from #1:
I'm having some trouble with the types for Grostl. The issue is that the BlockSize depends on the OutputSize, so I can't figure out how to define the impl for Digest. If the output size is 256 bits or less, then 512 is used as the block size; if the output size is greater than 256 bits, then 1024 is used as the block size. Here's what I have right now, with the BlockSize incorrectly hardcoded as U512:
```rust
extern crate digest;
extern crate generic_array;

use std::marker::PhantomData;
use digest::Digest;
use generic_array::{ArrayLength, GenericArray};
use generic_array::typenum::U512;

// TODO: This could also be U1024
type BlockSize = U512;

pub struct Grostl<OutputSize: ArrayLength<u8>> {
    phantom: PhantomData<OutputSize>,
}

impl<OutputSize: ArrayLength<u8>> Grostl<OutputSize> {
    fn new() -> Grostl<OutputSize> {
        Grostl { phantom: PhantomData }
    }
}

impl<OutputSize: ArrayLength<u8>> Default for Grostl<OutputSize> {
    fn default() -> Self { Self::new() }
}

impl<OutputSize: ArrayLength<u8>> Digest for Grostl<OutputSize> {
    type OutputSize = OutputSize;
    type BlockSize = BlockSize;

    fn input(&mut self, input: &[u8]) {
    }

    fn result(mut self) -> GenericArray<u8, Self::OutputSize> {
        GenericArray::default()
    }
}
```
Note that OutputSize is parameterized here because Grostl can output hashes between 1 and 64 bytes.
EDIT: I guess if there's no fancy solution with the generics that could get this to work, I could always just override block_bytes and block_bits in the trait, set a dummy BlockSize, and not use it.
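One type-level trick that can express "BlockSize depends on OutputSize" is to map a type-level boolean ("is the output at most 256 bits?") to a size type via a selector trait. The sketch below hand-rolls `B0`/`B1`/`U512`/`U1024` stand-ins so it is self-contained; with the real typenum crate, the boolean would come from something like `IsLessOrEqual`, and the same selector-trait pattern would apply:

```rust
struct B0; // type-level false
struct B1; // type-level true
struct U512;
struct U1024;

trait Len {
    const BYTES: usize;
}
impl Len for U512 { const BYTES: usize = 64; }
impl Len for U1024 { const BYTES: usize = 128; }

// The select: a trait implemented on the boolean chooses the block size.
trait BlockSizeFor {
    type BlockSize: Len;
}
impl BlockSizeFor for B1 { type BlockSize = U512; }  // output <= 256 bits
impl BlockSizeFor for B0 { type BlockSize = U1024; } // output > 256 bits

// The hash state is generic over the boolean; in real code it would be
// generic over OutputSize, with the boolean derived from it.
struct Grostl<Small: BlockSizeFor> {
    _marker: std::marker::PhantomData<Small>,
}

impl<Small: BlockSizeFor> Grostl<Small> {
    fn block_bytes() -> usize {
        <Small::BlockSize as Len>::BYTES
    }
}

fn main() {
    assert_eq!(Grostl::<B1>::block_bytes(), 64);
    assert_eq!(Grostl::<B0>::block_bytes(), 128);
    println!("ok");
}
```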
This is a violation of semver, the error being:

```text
error[E0599]: no method named `variable_result` found for type `blake2::Blake2b` in the current scope
   --> src/caps.rs:180:35
    |
180 |         let hash = hasher.variable_result(&mut buf).unwrap();
    |                           ^^^^^^^^^^^^^^^
    |
    = note: the method `variable_result` exists but the following trait bounds were not satisfied:
            `blake2::Blake2b : digest::ExtendableOutput`
            `&blake2::Blake2b : digest::ExtendableOutput`
            `&mut blake2::Blake2b : digest::ExtendableOutput`
```
The md5 documentation link points to the md5 crate instead of the md-5 crate documentation.
The current implementation of Groestl is quite slow (<1 MB/s) and can be significantly improved.
Various performance-improvement techniques can be found in the "Grøstl Implementation Guide". A list of the fastest implementations can be found here.
The std::hash::Hasher trait is not implemented for these digests. The trait is essentially equivalent to the digest::Input and digest::FixedOutput traits. Implementing it would be useful for me, because I'd like to have my std::hash::Hash implementation be usable with a digest::Digest.
Compiling sha-1 with the asm feature enabled results in:

```text
error[E0308]: mismatched types
  --> /Users/dignifiedquire/.cargo/registry/src/github.com-1ecc6299db9ec823/sha-1-0.8.0/src/lib.rs:80:54
   |
80 |         self.buffer.input(input, |d| compress(state, d));
   |                                                      ^ expected array of 64 elements, found struct `block_buffer::generic_array::GenericArray`
   |
   = note: expected type `&[u8; 64]`
              found type `&block_buffer::generic_array::GenericArray<u8, block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UTerm, block_buffer::generic_array::typenum::B1>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>>`

error[E0308]: mismatched types
  --> /Users/dignifiedquire/.cargo/registry/src/github.com-1ecc6299db9ec823/sha-1-0.8.0/src/lib.rs:91:71
   |
91 |         self.buffer.len64_padding::<BE, _>(l, |d| compress(state, d));
   |                                                                    ^ expected array of 64 elements, found struct `block_buffer::generic_array::GenericArray`
   |
   = note: expected type `&[u8; 64]`
              found type `&block_buffer::generic_array::GenericArray<u8, block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UInt<block_buffer::generic_array::typenum::UTerm, block_buffer::generic_array::typenum::B1>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>, block_buffer::generic_array::typenum::B0>>`

error: aborting due to 2 previous errors
```

It works fine without it.
Hi,
Any library which uses the constant_time_eq crate is not no_std. The source does not have a #![no_std] annotation.
SHA-3 is not exactly Keccak: it is based on it, but uses a slightly different padding function.
Maybe this should be corrected in the README, but of course it's entirely your decision, because the difference isn't that big.
Hi,
Am I using the XofReader (Sha3XofReader) incorrectly? I was assuming that each call to XofReader::read(...) would extend the previous values.
For example, the following:

```rust
extern crate digest;
extern crate sha3;

use digest::{Input, ExtendableOutput, XofReader};
use sha3::Shake256;

fn main() {
    let mut hasher = Shake256::default();
    hasher.process(b"some nice randomness here");
    let mut xof = hasher.xof_result();
    let mut buf = [0; 4];
    for _ in 0..5 {
        xof.read(&mut buf);
        println!("{:?}", buf);
    }
}
```
repeatedly returns the same values:

```text
[27, 145, 10, 182]
[27, 145, 10, 182]
[27, 145, 10, 182]
[27, 145, 10, 182]
[27, 145, 10, 182]
```
(Sorry if this is better placed in https://github.com/RustCrypto/traits, I figured it's probably an implementation bug, unless I've done something stupid.)
Currently most of the crates require Rust 1.16 due to the use of Self in structs; with small changes they can be made compatible with at least 1.13. It would be good to explicitly show those requirements in the algorithms table.
Either way, it's worth adding relevant tests to Travis CI, so this table would stay updated.