Git Product home page Git Product logo

Comments (38)

cmuratori avatar cmuratori commented on May 20, 2024 1

Thank you very much for this analysis! As you say, I'm not sure we would want to slow the hash down to improve collision resistance of this kind, but it is very helpful to know where the vulnerabilities are, both so we can include them with the distribution (so people understand not to use it for security - even though I say that, it is helpful much more persuasive to prove it by example!), and so we can know where we are likely to fail on data hashing.

- Casey

from meow_hash.

NoHatCoder avatar NoHatCoder commented on May 20, 2024 1

Here is another fun one: A file filled with 512 "o"s, and a file filled with 512 "p"s. There is no inherent symmetry-breaker in an AES round, if all the input bytes are the same, and all the key bytes are the same, then all the output bytes will be the same. It might be an idea to seed the hash with something other than zeroes.

from meow_hash.

NoHatCoder avatar NoHatCoder commented on May 20, 2024 1

Yes, we can always generate some collisions, but for a 128 bit hash you'd generally expect that the Kolmogorov complexity of a collision is approximately 128 bits, if we had seeded each stream with 128 pseudorandom bits you'd generally need 128 arbitrary bits to "undo" that seeding. I found this collision just by generating 256 hashes, if the seed hadn't been all zeroes I would by the same method need 2^128 hashes to get two in a row that hash to the same value. Though I could of course use some maths to get around the brute-force-method, I'd still need to present a series of 128 scrambled bits as the collision.

from meow_hash.

NoHatCoder avatar NoHatCoder commented on May 20, 2024 1

Throwing in just a single set bit in each stream would actually do a lot, since that gets spread to 32 bits before the first data is xored into the stream. Ideally you should put different non-zeroes in the first 4 bytes of each stream, and a different pattern for each stream, that would give you a complete set of scrambled data for the first xor. Something like 0x01020304 + [stream no] * 0x04040404 would be perfect.

from meow_hash.

dgryski avatar dgryski commented on May 20, 2024 1

Siphash uses the ASCII bytes for "some peudo randomly generated bytes"

0x736f6d6570736575
0x646f72616e646f6d
0x6c7967656e657261
0x7465646279746573

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024

Is there an obvious better pattern to seed with? I guess the only confusion here is that if the AES rounds themselves don't have anything that deals with symmetry, then why would seeding change that? Doesn't that just change which set of repeating data will hash the same?

- Casey

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024

Do you have a recommendation for what the hash should be seeded with? Is it just "any random series of bits you choose"? I would be happy to see about seeding it with something other than zeroes. I can take a look to see how much that would cost in terms of execution time on small keys.

- Casey

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024

Thanks Jacob - I will take a look at that and see if it's feasible to do some better initialization without affecting the performance for smaller inputs.

- Casey

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024

Jacob, I'm going to upload a v0.4 candidate today (to the 0.4 branch). It includes loading constants into the streams at the beginning, but the constants are currently just ascending integers. Once it's posted, would you be up for playing around with it and suggesting what the "optimal" constants to load might be?

- Casey

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024

OK, @NoHatCoder, I've uploaded a v0.4 branch that has non-zero initialization. You can see the constants used for initialization here:

https://github.com/cmuratori/meow_hash/blob/v0.4/meow_hash.h#L135

They're just nonsense right now, but they can be set to whatever would make the most sense. Let me know if you have any ideas on what the "right" values would be?

Thanks,
- Casey

from meow_hash.

ratchetfreak avatar ratchetfreak commented on May 20, 2024

in general when such pre-init is needed in crypto they use nothing-up-my-sleeve numbers. That way then can have numbers with roughly 50% of bits set with a non-trivial pattern.

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024

Amusingly, what I randomly decided to put in there temporarily is actually mentioned in that Wiki page :) It's the "KASUMI" init! Long live 123456789...

- Casey

from meow_hash.

NoHatCoder avatar NoHatCoder commented on May 20, 2024

I don't believe it matters much, the algorithm will scramble this setup to perfectly good pseudorandom bytes. My only real comment on this is that you could get away with initialising just the first 4 bytes of each stream, and leaving the rest as zeroes. Would that actually be faster? I don't know, but it is a possibility.

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024

A good question. I will have to see if maybe there is a way to get an immediate load in there that could be faster...

- Casey

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024

I'm not seeing any obvious way to do something smarter here. We could load just 32 bits per init, and "use less cache" than loading 128 bits for each lane, but I'm not sure that's particularly interesting either way...

- Casey

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024

So here is another thought, and I'm not sure how useful it is, but we could alternatively not add any additional data by just loading into the lanes the swizzle vector used by the end-of-buffer code... this would allow us to initialize each lane with a different 128-bit section out of the swizzle buffer. That may be bad for initialization, I have no idea, but it would be the "minimal data" for the algorithm since it's just reusing something it already has to have anyway:

https://github.com/cmuratori/meow_hash/blob/v0.4/meow_hash.h#L132

We could, for example, initialize the bottom 32 bits of each lane by loading 0x00010203 into the first lane, 0x04050607 into the second lane, 0x08091011 into the third, and 0x12131415 into the fourth, and that would avoid using any additional data space to implement the algorithm... would that be sufficient?

- Casey

from meow_hash.

erthink avatar erthink commented on May 20, 2024

Yes, we can always generate some collisions, but for a 128 bit hash you'd generally expect that the Kolmogorov complexity of a collision is approximately 128 bits, if we had seeded each stream with 128 pseudorandom bits you'd generally need 128 arbitrary bits to "undo" that seeding. I found this collision just by generating 256 hashes, if the seed hadn't been all zeroes I would by the same method need 2^128 hashes to get two in a row that hash to the same value. Though I could of course use some maths to get around the brute-force-method, I'd still need to present a series of 128 scrambled bits as the collision.

@NoHatCoder, If you dig deeper, you can see a more general problem of MeowHash.
Single AES-round is not enough to make a good entropy sponge, since the avalanche is too low.
In the other words -- probability of collision is too high and both of collision and preimage attacks (relatively) are too easy. Moreover, cross-lanes diffusion should be provided for properly hashing.
This could be revealed by Yves's Orton SMHasher, but some code changes are required.

@demerphq, @rurban, @Bulat-Ziganshin, FYI.

from meow_hash.

demerphq avatar demerphq commented on May 20, 2024

from meow_hash.

NoHatCoder avatar NoHatCoder commented on May 20, 2024

@cmuratori Yeah, that is fine.

@leo-yuriev I don't believe this hash is meant to resist any form of adversarial attack, it just has to not collide given non-malicious input. That said, I'm not sure in that case why the function would feature a seed at all.

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024

@NoHatCoder @leo-yuriev Yes, this is a non-cryptographic hash and should be assumed to be much easier to attack than a cryptographic hash.

That said, the idea behind making changes that lead to stronger collision resistance (such as improved seeding) is that this will help it collide less in general. So I think it would still be a good idea to have things such as a better smhasher, or cryptographic analysis of Meow that explains the weaknesses, because the more we can understand them, the better we can tune the function to have less collisions in general.

Stated alternately, it is always easy enough for us to say "no, we don't care about that weakness because we don't think fixing it is worth the speed trade off," so it seems strictly better to at least find out what the weaknesses are and then ignore them as necessary, rather than simply saying we don't want to know at all :)

For AES rounds in particular, though, I would like to see more of a complete analysis of why you need multiple AES rounds per ingress. Yes, it takes more than one AES round to diffuse bits, but all input in Meow does go through multiple AES rounds eventually. So the thing that I have not seen is a coherent explanation of why "some AES rounds are better than others" in this regard, in the sense that somehow doing an AES round with the input and then some ones with a seed (or something else) is some how magically better than just doing AES rounds on inputs only. A pointer to a paper that discussed this in detail and showed the difference would be very helpful.

- Casey

from meow_hash.

dgryski avatar dgryski commented on May 20, 2024

For another non-cryptographic hash using AES instructions, check out the Go runtime's internal hash table hash function:

https://golang.org/src/runtime/asm_amd64.s#L870

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024

There are several non-cryptographic AES hashes (Falk is in our standard benchmark, for example), and there are also multiple cryptographic hashes built from AES (Cheetah Hash was a SHA-3 entrant, for example).

- Casey

from meow_hash.

erthink avatar erthink commented on May 20, 2024

@NoHatCoder, @cmuratori, speaking of attacks, I didn't mean that there should be any significant resistance against ones, but often I see how thinking from the an attacker's point helps to understand the weakness of some solution (since I work for an information security company).

@NoHatCoder @leo-yuriev Yes, this is a non-cryptographic hash and should be assumed to be much easier to attack than a cryptographic hash.

Yes, certainly. However, in the case of MeowHash, I assumed that such attack(s) could be much easier even than you expect.

That said, the idea behind making changes that lead to stronger collision resistance (such as improved seeding) is that this will help it collide less in general. So I think it would still be a good idea to have things such as a better smhasher, or cryptographic analysis of Meow that explains the weaknesses, because the more we can understand them, the better we can tune the function to have less collisions in general.

Yes of cause.

Stated alternately, it is always easy enough for us to say "no, we don't care about that weakness because we don't think fixing it is worth the speed trade off," so it seems strictly better to at least find out what the weaknesses are and then ignore them as necessary, rather than simply saying we don't want to know at all :)

Yes, yet again. But such trade-offs are really hard to measure and articulate in reasonable way.
How much of weakness is acceptable per item gain of speed?
The answer is 42 ;)

For AES rounds in particular, though, I would like to see more of a complete analysis of why you need multiple AES rounds per ingress. Yes, it takes more than one AES round to diffuse bits, but all input in Meow does go through multiple AES rounds eventually. So the thing that I have not seen is a coherent explanation of why "some AES rounds are better than others" in this regard, in the sense that somehow doing an AES round with the input and then some ones with a seed (or something else) is some how magically better than just doing AES rounds on inputs only. A pointer to a paper that discussed this in detail and showed the difference would be very helpful.

A complete and accurate answer will require considerable work both for articulate and understand. But I think it is not necessary now.

Briefly:

  • MeowHash uses the 16-channel Merkle–Damgård construction with "AES decode round" as F(x), and final compression (16 channels to result).
  • But it is NOT an "Wide pipe" or "Fast wide pipe" construction.
  • Let assume we are not interested in the possibility of attacks, but we want to assess only the probability of collisions.
  • Then we can focus only on estimating the avalanche effect for F(x), i.e. for an one Rijndael round.

Here a much can be simplified, because an each ingress lane is always combines with only an one intermediate state lane of the equal size.

  • The probability of collisions for the whole construction will correspond to the properties of a good (but not strong) hash function only if F(x) meets the strict avalanche criterion (SAC).
  • Otherwise, in the such simple construction, F(x) will lose the entropy of an input data at each step and the collisions probability increases proportionally to these losses.
  • Known SAC value for single Rijndael round is 20 x2 bit (15.0% x2), i.e. 20 bits will flip if one input bit would changed. This means that on average F(x) collects only 40 bits of entropy out of every 128.

Also known than a full SAC cannot be achieved in less than 3 rounds. You can google corresponding publications/papers. Instantly I found this one.

P.S.
As author of the t1ha family I should note that current t1ha0() variants with AES-NI also includes such weakness. This is the main reason for redesign, which almost done for now.

from meow_hash.

erthink avatar erthink commented on May 20, 2024

Patches welcome to my version of smhasher any time you like! I accept and
encourage pull requests.

@demerphq, I think SMhasher requires an extended "birthday test", which should running until the number of collisions will reached a value corresponds evaluate a property of hash with the given confidence. Unfortunately, I don't have time to do it (including HashContest, which I announced in the spring).

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024

@leo-yuriev So, is what you're saying that HashContest is also not happening due to schedule constraints? That is unfortunate :(

- Casey

from meow_hash.

erthink avatar erthink commented on May 20, 2024

@leo-yuriev So, is what you're saying that HashContest is also not happening due to schedule constraints? That is unfortunate :(

for accuracy:

  • HashContest is not a hash function, but a test suite (which I hope to implement).
  • t1ha is the family of hash functions, nowadays actually t1ha2(), t1ha1() and t1ha0(). Where t1ha0() is the weakest.
  • only t1ha0() have the variants with AES-NI, which have collision weakness that discussed.
  • t1ha0() not frozen that allows me create a new design, which seems will be a cut above by the quality.

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024

Yes, I know - I was hoping HashContest was going to be a useful test suite.

- Casey

from meow_hash.

mmozeiko avatar mmozeiko commented on May 20, 2024

For another non-cryptographic hash using AES instructions, check out the Go runtime's internal hash table hash function:

https://golang.org/src/runtime/asm_amd64.s#L870

Nice find!
I've looked into it, and it looks very similar to Meow hash. Here are differences:

  1. uses AES encrypt, not decrypt operation

  2. funky way how to create initial state/seed - 64 lowest bits are used from user seed. Next 16-bits are lowest bits from length and those are replicated across rest of 16-bits. EDIT: - oh, I figured out why they do this - in their use case most of the time inputs will be short, there is high possibility that upper bits of length are 0, so to have more entropy in upper bits of initial state/seed they replicate lower 16-bits to 4 elements of upper 64-bit part simd register.

  3. specialized implementations for size=0, size=[1..15], size=16, size=[17..32], size=[33..64], size=[65..128], size > 128

  4. state for AES is seeded from 128-byte buffer that is filled by random from system at program startup. This hash function is used in Go internally for dictionary hash table key lookup, so it this kind of guards against attacker doing DOS attack.

  5. interesting way to deal with partial loads. It reuses bytes from buffer for loading trailing bytes, so some bytes are hashed "twice"

  6. all AES operations that scrambles state uses state also as input: x = aesenc(x, x)

  7. in the big >128 size loop, the AES operation is used twice per lane - once to scramble state, second time to add in value from memory

  8. at end of loop / input processing, the AES operation is called 3 times to scramble state on each lane

  9. returned value is simply XOR'ed together from all lanes, no extra AES ops

  10. it uses only lowest 64 bits from result value as return value

I've ported it to C if somebody is interested (verified that result is same as for Go implementation): go_aeshash.h

from meow_hash.

demerphq avatar demerphq commented on May 20, 2024

from meow_hash.

demerphq avatar demerphq commented on May 20, 2024

from meow_hash.

NoHatCoder avatar NoHatCoder commented on May 20, 2024
  • Known SAC value for single Rijndael round is 20 x2 bit (15.0% x2), i.e. 20 bits will flip if one input bit would changed. This means that on average F(x) collects only 40 bits of entropy out of every 128.

Also known than a full SAC cannot be achieved in less than 3 rounds. You can google corresponding publications/papers. Instantly I found this one.

That is a pretty shit paper that generally shouldn't be used for anything, but I'm curious how you claim that SAC takes 3 rounds when the conclusion in the paper says 2 rounds?

In any case, one flipped bit spreads to 32 bits in one round of AES, on average flipping 16 of them, applying another round in turn spreads those bits to the entire state, so two rounds required for full SAC.

My suggestion of doing two rounds for every second block is exactly what is required to take advantage of the two-round full SAC. The pattern would be like so:

state=AESDEC(state,stream1)
state=AESDEC(state,stream2)
state=AESDEC(state,stream2)
state=AESDEC(state,stream3)
state=AESDEC(state,stream4)
state=AESDEC(state,stream4)

This ensures that between the first XORing of a given stream block, and the last XORing of the following stream block there is always two rounds of mixing, this should ensure that generating a collision in O(1) requires modifying 16 bytes of data.

Regarding the collision type demonstrated in my first post, after careful consideration I have reached the conclusion that I can't consider Meow a 128 bit hash with this flaw, given that the pattern required to trigger it is somewhat specific I'm willing to give it an extra 16 bits, but that still only leaves it a 48 bit hash. I believe a moniker of 48 bits reasonably accurately describe how often this hash will collide in mixed non-adversarial real-world use.

from meow_hash.

dgryski avatar dgryski commented on May 20, 2024

@mmozeiko You can compare your implementation with the one from gccgo: https://github.com/golang/gofrontend/blob/master/libgo/runtime/aeshash.c

from meow_hash.

erthink avatar erthink commented on May 20, 2024

[...] Instantly I found this one.

That is a pretty shit paper that generally shouldn't be used for anything, but I'm curious how you claim that SAC takes 3 rounds when the conclusion in the paper says 2 rounds?

@NoHatCoder, I just quickly found it, but did not read a whole. So I agree that this paper has shortcomings.

Moveover, looking to the "Table 2: Results of Avalanche Effect of AES" on page 366, I see that the authors have confused the SAC and the number of flipped bits. In such notation SAC could not be greater than half i.e. 50%.

In other words, SAC == 100% - rms(50% - number_of_flipped_bit) /* has corrected to remove oversimplification */.
if we see that average > 50% bits are flipped, ones just inverted by couples, but this also violates strict avalance criteria.

In any case, one flipped bit spreads to 32 bits in one round of AES, on average flipping 16 of them, applying another round in turn spreads those bits to the entire state, so two rounds required for full SAC.

I suggest you try SMHasher to measure SAC of the Rijndael round.
But this should be done with an understanding of the Rijndael algebra.
In particular, you should consider "bad" combinations that result in minimal changes after MixColumns / InvMixColumns, including leaving data unchanged (01 01 01 01 and so on).


Merkle–Damgård narrow pipe required F(x) with ideal SAC to avoid losing entropy.
I can only repeat this once again and fully agree with the @demerphq's opinion.

P.S.
In the my new t1ha0-aes design I use other construction which resemble the "Fast wide pipe".

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024
state=AESDEC(state,stream1)
state=AESDEC(state,stream2)
state=AESDEC(state,stream2)
state=AESDEC(state,stream3)
state=AESDEC(state,stream4)
state=AESDEC(state,stream4)

Hmmmmmm. I will have to think about the extent to which we could incorporate that without slowing things down too much. Unfortunately AES instructions are all on the same port, AFAICT, so you cannot issue two independent AES instructions in the same cycle. This would seem to imply that you could not get the full 16 bytes/cycle if you add an additional AESDEC into the main loop.

But you could add other instructions, potentially. Some other instructions may be on different ports (XOR, for example, must have something like 3 ports because it is listed as a 0.33... cycle instruction). So I wonder if there are any other instructions that could be use to strengthen the hash that would not slow things down.

- Casey

from meow_hash.

erthink avatar erthink commented on May 20, 2024

But you could add other instructions, potentially. Some other instructions may be on different ports (XOR, for example, must have something like 3 ports because it is listed as a 0.33... cycle instruction). So I wonder if there are any other instructions that could be use to strengthen the hash that would not slow things down.

Yeah, try it. It will be interesting to compare the scores.
I think later I can show the MeowHash and t1ha results on the Russian's VLIW called "Elbrus".

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024

We ended up redesigning the hash for v0.5, so I am closing this issue since it's not really something we can address further without specifics to v0.5 that should be opened in new issues so they don't get buried here.

We did try to address the shortcomings of AES without downshifting to two full AES rounds per ingest, and so we'll see how it pans out :)

- Casey

from meow_hash.

dgryski avatar dgryski commented on May 20, 2024

Where is the version 0.5 hash? There doesn't appear to be a branch.

from meow_hash.

cmuratori avatar cmuratori commented on May 20, 2024

from meow_hash.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.