Please refrain from using the false SipHash security claims and adding your own nonsen

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Agree with <a class="user-mention notranslate" data-hovercard-type="user" data-hoverca

<a class="user-mention notranslate" data-hovercard-type="user" data-hover

False security claims,about google/highwayhash

Comments (34)

veorq commented on May 5, 2024 23

I'm sorry but there are too many incorrect or unsubstantiated claims here. PoC || GTFO :-)

…

On Fri, 25 Nov 2016 at 17:27, Reini Urban ***@***.***> wrote: @veorq <https://github.com/veorq> thanks for showing up. djb only hid behind funny twitter allegations. I have nothing against your siphash paper section 3 claims. They are sound. What is unsound is the usage in popular discussions such as here, and in most dynamic languages using parts of your claim in section 6 and esp. 7 to add crypto-nonsense you didn't write. What is quoted from section 6 (performance) is that SipHash is fast. It is not. Is is slow. Though everything is relative. For you it might look fast enough, but for hash table usage it is slow. And it doesn't help with security. Section 7 is the real culprit in the security claims. It argues that a hash function alone helps in defense against hash flooding. No, this is dangerous nonsense and needs to be debunked. Only the collision resolution scheme helps against hash flooding. Chasing linked lists or linear addressing keys without counter measures are the problem, and the hash function itself cannot do anything against it, other than being not entirely stupid. No strong hash function can help against the worst case: a known hash function, known collision resolution and a known seed. Not even SHA256 or a fast AES intrinsic. Almost every single paragraph from section 7 is flawed and needs to be retracted. This is using hash tables from the 80ies, but even Knuth considered using ordered tables against hash flooding already in the 70ies. ("Ordered hash tables", O Amble, D Knuth 1973) Review of hash tables: knows only about chained lists, and tells nothing about the simplest countermeasure your colleague djb did in his implementation. Count and stop after excessive collisions. Review of hash flooding: tells only about the earliest attacks, but nothing about timing attacks or order analysis to get to the seed. Advanced hash flooding: here is good info about timing attacks against universal hashing. Stopping advanced hash flooding: Your proposal of a cryptographically strong PRF is baseless on the simpliest attacks using only a tiny subset of bits. You do not warn against seed exposure. Even with AES an attack can be brute-forced in 4 minutes. The Python hash function: Good warning, but they only listened to you, not to any hash function or table expert who would have told them otherwise. Now not only python, also ruby, rust and many more are vulnerable. Regarding solvers to expose the seed: As we know this wasn't done yet publicly, otherwise the internet would cry a bit as last time. REMOTE ALGORITHMIC COMPLEXITY ATTACKS AGAINST RANDOMIZED HASH TABLES 2009 https://www.eng.tau.ac.il/~yash/C2_039_Wool.pdf can easily be improved by using solvers. I used cbmc. But most hash tables don't even try to hide their order or seed. Nothing is mentioned in section 7. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#28 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ADDmytNDnkq8s7XuPOwcOZ_r7K2T6v_qks5rBwxYgaJpZM4K8eG6> .

from highwayhash.

veorq commented on May 5, 2024 22

Hi Reini,

SipHash co-author here.

You argue that SipHash is not "cryptographically strong". The research paper about SipHash at http://aumasson.jp/siphash/siphash.pdf makes specific claims about its cryptographic strength, in section 3. As of today none of those claims has been contradicted.

In section 7 of that same paper we discuss in more detail why SipHash (and more generally a secure PRF) mitigates hash-flooding vulnerabilities. Is there any specific point you disagree with?

Also note that SipHash is not a "hash function", but a pseudorandom function (PRF), or keyed hash function. In other words, SipHash is an algorithm that takes a key as a parameter, in addition to the message.

from highwayhash.

tarcieri commented on May 5, 2024 11

Yes, we got that several posts earlier. The point is, it is incorrect and dangerous to say "$hashfunction protects against hash flooding" because the latter is an attack on hash table implementations and not hash functions.

Hence the very disingenuous nature of the entire statement "At least rust found out last month that this is nonsense, but are still claiming that SipHash is secure." That statement is colluding the security of a hash table implementation with the security of a PRF. And this in the OP of a thread about "false claims". The OP is being disingenuous in the very exact same way he is accusing others of doing.

Not a single hash function can be strong enough to protect against seed exposure in hash tables. 32 or 64 bit can never be secure. The seed is not protectable by any hash function. It is usually trivial to expose it.

Exposure of the seed is well outside the threat model of hashDoS. You are talking about an attacker who has access to the host and later the ability to inspect arbitrary memory ("For a running perl process the seed is at a fixed offset from my_perl, the first argument to any VM op. Which is easily readable via some kind of poke function via the unpack P builtin op.")

You could make a similar specious case against e.g. ASLR: that it's not a silver bullet, that the security it provides depends on the ASLR implementation, that an attacker who can arbitrarily inspect memory can bypass ASLR easily (duh?). And it'd be just as disingenuous as what you're doing in this thread: despite those problems, ASLR still provides security value.

My take on this thread is the OP doesn't understand the threat model of hashDoS and is making a number of exaggerated, disingenuous strawman arguments against claims people aren't actually making. If he would like to make a case against the false claims of others, he needs to stop making the very same false claims he is accusing others of.

from highwayhash.

jyrkialakuijala commented on May 5, 2024 6

@rurban and others. First, I'd like to thank rurban to start this discussion. This is a good discussion to have, but we are talking about too many topics at once to make progress. I'd greatly prefer this discussion to be more tightly scoped.

For example, rurban's proposal of replacing highwayhash with crc32-c should be a separate issue that could be studied and discussed alone without discussing the properties of siphash vs. xxHash or timing attacks. Every one of these issues is tremendously complex, full of subtle details and having all of this is a single issue is not serving the purpose of clean discussion.

I will close this issue in about one week (most likely I will delete it some time after, since it doesn't represent the kind of discussion style that I find appropriate), and hope that rurban and others move the discussion to multiples of filed issues and that we can keep the discussion focused on that particular topic in the issue. I'd rather have more issues (even 10-20) to keep the discussion as tightly scoped as possible.

Just looking at the discussion I'm proposing following topics:

Clean up security claims in the documentation
Use CRC32 intrinsics in highwayhash
Use AES intrinsics in highwayhash
Make it more clear what is required from a hash table to work safely together with a hash function
Outline the possible security and quality differences of highwayhash, siphash and cityhash.
Benefits of SipHash in hashtable use
Discovery of seeds in hashtables
Adding attack code to the repository to show how bad the proposed solutions are
Other topics you'd like to discuss

I understand that we are all passionate about these topics, but I'm sure we can talk about these topics without the need of crossing the border of good manners. I will moderate the comments of this issue and the future issues as I see necessary to keep the discussion relatively polite.

from highwayhash.

jyrkialakuijala commented on May 5, 2024 4

Please, let's refrain from more personal attacks on this issue. Let's divide the discussion into multiple threads and try to make this into something positive for all of us.

from highwayhash.

veorq commented on May 5, 2024 4

Agree with @tarcieri. IMHO you'll be wasting your time. I'll stop feeding trolls, but feel free to ping me for any serious question about SipHash.

from highwayhash.

infinity0 commented on May 5, 2024 3

The shit-talk and minion-emoting here is embarassing. Twitter is poisoning your brains.

It sounds like too many people are getting overly-emotional about the precise definition of "secure" vs "insecure" and what that is applied to, but @rurban's points make perfect sense and I would not be surprised if there was a PoC developed in the next few years, if this trend continues and people use good hash functions inside bad hash table implementations.

Do you have a link to the rust issue that you pointed out?

from highwayhash.

tarcieri commented on May 5, 2024 3

At least rust found out last month that this is nonsense, but are still claiming that SipHash is secure.

Rust's latest problems are the result of a particularly bad interaction of all of the parts of the implementation in conjunction with a particular usage pattern (iterating over a HashMap and using it to populate another HashMap). It arises from the interactions between Rust iterators, ordered HashMap traversal, Robin Hood hashing (which Rust HashMaps use internally), and SipHash. It's a nuanced problem arising from pathological cases in the current implementation which are triggered by a potentially common usage pattern, not a fatal flaw in SipHash.

from highwayhash.

jyrkialakuijala commented on May 5, 2024 1

@jyrkialakuijala Maybe open one issue for each of the first 8 specific items listed?

@dgryski Yes, please go ahead. I don't know if the ones I proposed are the correct ones, I just tried to capture some that were possibly mentioned in the discussion. Also, the one who posts them should believe that they are actually a good idea. Otherwise we might not have a constructive discussion afterwards.

From my side, I can promise that we (Jan and me) will address all concerns seriously and in good faith. Neither Jan or me are famous crypto researchers -- highwayhash is our first attempt in this direction -- so you might need to give us some leeway in defending our position.

from highwayhash.

jan-wassenberg commented on May 5, 2024

I think hash tables are a less interesting use case than fingerprints/'random' data-dependent decisions.
Fingerprints are short-lived, so we are OK with MAC-level security. I cringe to see more insecure functions used for such a purpose.

FYI we have studied differential/length extension/rotational/SAT solver attacks against HighwayHash without finding a weakness (paper pending publication).

from highwayhash.

Bulat-Ziganshin commented on May 5, 2024

14 bit collisions (16383 keys) need <10s to calculate for SipHash brute-force

it should be the same with SHA, does that make it non-cryptographic too?

These false claims of djb and yours are spreading, and are actually doing damage in the dynamic language community, who are using simple insecure hash tables with SipHash.

can you show that this damage is due to use of SipHash instead of SHA, rather than their own mistakes?

Python 3.4, ruby, rust, systemd, OpenDNS, Haskell and OpenBSD are all following this security theatre nonsense, and they think that by using SipHash they are now immune.

of course, using of PRF or cryptographic hash functions doesn't guarantee security. is there something specific to SipHash or PRF in general that we can do to improve their practice of using SipHash?

from highwayhash.

rurban commented on May 5, 2024

@Bulat-Ziganshin SHA{1-3} are thanksfully not advertised as DoS resistant hash table function.
SipHash is used as 32 or 64bit hash table function, SHA as a mostly 256 bit cryptographic hash function.

So yes, a 14bit rest of SHA is of course non-cryptographic, and a 32bit SipHash is also non-cryptographic. Everything less than 128-256 bits and easily crackable hash is not crypto.

The paper is talking about a strong PRF. Yes, it is a strong PRF, and maybe even a strong HF (hash function), but not a cryptographic strong HF, and at all not a secure HF. But this is what is being advertised here. At lot of people are taking this at face value and forget about real protection, if it's their seed, the ordering, the collisions.

from highwayhash.

Bulat-Ziganshin commented on May 5, 2024

i think that definition of cryptographic hash doesn't depend on size, but on inability to restore hashed data otherwise but brute-force? anyway, SipHash isn't a cryptohash and hopefully HighwayHash also removed its false advertizing as cryptohash (rather than PRF).

while SHA isn't advertized in this way, it's considered as DoS protection due to properties of cryptohashes.

The key point to DoS protection is to keep low population of hash buckets. It's not enough to find a single collision to make a DoS attack. Can you show the attack that allows to put in the single bucket some significant portion of keys (such as 15) without knowing the SipHash seed value? Or may be describe some other way to DoS-attack system that employs hashing based on bits from SipHash and keeps number of buckets >= number of keys pushed to the hash table?

i wonder what you recommeend to use for hashing instead of SipHash? is your recommendation is to use hash tables with at least 2^256 buckets to guarantee lack of collisions?

from highwayhash.

rurban commented on May 5, 2024

@veorq thanks for showing up. djb only hid behind funny twitter allegations.

I have nothing against your siphash paper section 3 claims. They are sound. What is unsound is the usage in popular discussions such as here, and in most dynamic languages using parts of your claim in section 6 and esp. 7 to add crypto-nonsense you didn't write.

What is quoted from section 6 (performance) is that SipHash is fast. It is not. Is is slow.
Though everything is relative. For you it might look fast enough, but for hash table usage it is slow. And it doesn't help with security.

Section 7 is the real culprit in the security claims. It argues that a hash function alone helps in defense against hash flooding. No, this is dangerous nonsense and needs to be debunked.
Only the collision resolution scheme helps against hash flooding. Chasing linked lists or linear addressing keys without counter measures are the problem, and the hash function itself cannot do anything against it, other than being not entirely stupid.
No strong hash function can help against the worst case: a known hash function, known collision resolution and a known seed. Not even SHA256 or a fast AES intrinsic.

Almost every single paragraph from section 7 is flawed and needs to be retracted.
This is using hash tables from the 80ies, but even Knuth considered using ordered tables against hash flooding already in the 70ies. ("Ordered hash tables", O Amble, D Knuth 1973)

Review of hash tables: knows only about chained lists, and tells nothing about the simplest countermeasure your colleague djb did in his implementation. Count and stop after excessive collisions.

Review of hash flooding: tells only about the earliest attacks, but nothing about timing attacks or order analysis to get to the seed.

Advanced hash flooding: here is good info about timing attacks against universal hashing.

Stopping advanced hash flooding: Your proposal of a cryptographically strong PRF is baseless on the simpliest attacks using only a tiny subset of bits. You do not warn against seed exposure. Even with AES an attack can be brute-forced in 4 minutes.

The Python hash function: Good warning, but they only listened to you, not to any hash function or table expert who would have told them otherwise. Now not only python, also ruby, rust and many more are vulnerable.

Regarding solvers to expose the seed:
As we know this wasn't done yet publicly, otherwise the internet would cry a bit as last time. REMOTE ALGORITHMIC COMPLEXITY ATTACKS AGAINST
RANDOMIZED HASH TABLES 2009 https://www.eng.tau.ac.il/~yash/C2_039_Wool.pdf can easily be improved by using solvers. I used cbmc.
But most hash tables don't even try to hide their order or seed.

Nothing is mentioned in section 7.

from highwayhash.

Bulat-Ziganshin commented on May 5, 2024

can easily be improved by using solvers. I used cbmc.

can you show how to restore hash seed from hash values, using solver or any other technique you like? i think it's a principal point. Both PRFs made a lot of work to ensure that it's impossible, the rest is more problem of applications using PRF in inappropriate way.

SipHash is widely used now, so hopefully cryptohraphers work hard on breaking it. If you have a technique breaking it, you will be respected as good cryptographer.

from highwayhash.

rurban commented on May 5, 2024

@Bulat-Ziganshin The attack is trivial as explained two fold. First you get the seed via various means, i.e. by the timing attack as of WOOL 2009 (but better via solvers and exposed ordering).
I wont publish my solver for faster seed detection via order exposure. WOOL via timing should be enough. SipHash makes no difference. It's timing the slow collision chain walk, not the hash.
Then you brute force any "cryptographic secure PRF" or "cryptographic secure hash function", as only 10-14 bits of the key are required to perform an practical attack, which is done brute-force in 10s - 4m (SipHash), even up to huge hash tables with 28bits. There's no protection from a hash function at all, other than being not trivially reversible, or computable as shown in the 2 CCC talks about earlier djb xor-add and mult hash functions.

Hence the claims that a strong hash function is enough to protect against hash floods are false.
Hash tables need fast and not-trivially crackable hash functions, not pseudo-secure hash functions as falsely advertised.

@veorq Your claims from section 7 are unsubstantiated and need to be corrected. They did too much damage already.

@Infinity rust: http://accidentallyquadratic.tumblr.com/post/153545455987/rust-hash-iteration-reinsertion (bad robin hood implementation)
For the ruby and python discussion on their new hash tables and false security google fu.
Also technically the claim "cryptographically strong PRF" is correct, but nobody reads it that way. Everybody reads it as "cryptographically strong hash function", nobody uses it as a PRF, which is together with the section 7 claims the real culprit. False citation, misleading study of small part of the problem, non-expertise of the implementors who only look at the name.

from highwayhash.

rurban commented on May 5, 2024

BTW:
If you want to advertise a fast cryptographical string hash function for hash table use, take the AES intrinsic, and not siphash or highwayhash. This is stronger and faster.

If you need a hash function with the best distribution qualities, i.e. least number of collisions for your typical ascii or utf8 key usage, advertise CRC32-C, which is also the fastest when using the intel/arm intrinsics. of course this is trivially exploitable, so you need proper collision resolution (which you need anyway).

siphash or highwayhash come nowhere close to be recommendable for quality nor security.

from highwayhash.

infinity0 commented on May 5, 2024

Hang on, I thought your point was that hash table implementations can't assume they're secure just because they use SipHash or any other hash or keyed-hash function. That is what the rust issue was - it had little to do with SipHash itself, but with the hash table implementation surrounding it and how the seed was chosen.

Now you are saying you prefer AES - but I don't see where you explain why this is so, at least in the current thread. And this is a separate point from the previous one. When you say "hash tables need fast and not-trivially crackable hash functions", could you define that more precisely, and explain why SipHash doesn't fit this definition?

from highwayhash.

rurban commented on May 5, 2024

Hang on, I thought your point was that hash table implementations can't assume they're secure just because they use SipHash or any other hash or keyed-hash function. That is what the rust issue was - it had little to do with SipHash itself, but with the hash table implementation surrounding it and how the seed was chosen.

Correct.

Now you are saying you prefer AES

No. I prefer a simple, fast and small hash function as everyone else. FNV1 if bytewise or something better if I can use words or add NULs for alignment.

but I don't see where you explain why this is so, at least in the current thread. And this is a separate point to the previous one. When you say "hash tables need fast and not-trivially crackable hash functions", could you define that more precisely, and explain why SipHash doesn't fit this definition?

I was recommending AES when someone needs a fast and secure hash table function, but it still does not help against DoS.

When you say "hash tables need fast and not-trivially crackable hash functions", could you define that more precisely, and explain why SipHash doesn't fit this definition?

SipHash is the slowest of all measured hash function usable for hash tables.
See my popular smhasher stats page: https://github.com/rurban/smhasher/#smhasher
Using SipHash doesn't help security, but slows down the hash table by factor 2-5.

I haven't added this improved highwayhash yet. But my estimate is that is still way too slow for a hash table, and adds no significant security. Talking about secure hash tables is entirely different and more complicated topic. What I said here is only that those repeated claims that a hash function make a hash table secure against DoS attacks is false. Only a hash table collision resolution can protect from that, the hash function has very little influence on it. Other than being exploitable and reversible as shown in various papers and talks.

Some statistics for the average case, the quality of hash table functions are at linked there: https://github.com/rurban/perl-hash-stats with SPOOKY32 and CRC32-C with the least collisions overall. There's also a bit more of explanation for those who don't have an idea about hash table statistics, which I assumed is common knowledge here.

from highwayhash.

Bulat-Ziganshin commented on May 5, 2024

rust: (bad robin hood implementation)

10x, it's very interesting reading, but it doesn't indicate any flaws in SipHash. Results will be excatly the same with SHA, any AES-based hash or FNV. The key part of problem is that hash values are the same for both tables, and it's solved by seeding hash for each table individually. Indeed, PRFs are natural choice here.

from highwayhash.

rurban commented on May 5, 2024

And sorry for interact again, but djb said this nonsense:
"A bold claim without evidence from @Reini_Urban in https://github.com/rurban/smhasher/ …: generic SMT solvers can find keys for SipHash and (seeded) SHA1." https://twitter.com/hashbreaker/status/795990395490603009
which is of course wrong. But I don't discuss on twitter with him.

I claimed that I can find the seed with solvers, and if needed the relevant parts of the keys to perform a practical attack. Others successfully attacked non-seeded SHA2 with it (google it), I only SipHash. But this is not even needed, brute forcing is enough.

from highwayhash.

rurban commented on May 5, 2024

@Bulat-Ziganshin

10x, it's very interesting reading, but it ddoesn't indicate any flaws in SipHash. Results will be excatly the same with SHA, any AES-based hash or FNV. The key part of problem is that hash values are the same for both tables, and it's solved by seeding each

There is no flaw in SipHash and probably neither in HighwayHash. (haven't tested that yet).
The flaw is the documentation and false advertising in the README part:

"cryptographically strong pseudo-random function" => technically correct, but nobody reads it that way, in the hash table context.

"Expected applications include DoS-proof hash tables and random generators." => false for hash tables, true for random generators.

"SipHash is immune to hash flooding because multi-collisions are infeasible to compute. This makes it suitable for hash tables storing user-controlled data." => totally false. here only the seeding scheme protects the table from flooding. but even seeds can be exposed or calculated. practically shown in 2009 already. re rust: table or size specific seeding is basically called "uniform hashing" which is provable secure. no need to argue there.
but when the seed is exposed the security is gone again. their iterator exposes the ordering, and timing attacks can still be performed. Not that anyone wants to do that, because practical DoS floods are trivial on different levels. But the security claim is still false.

from highwayhash.

Bulat-Ziganshin commented on May 5, 2024

If you want to advertise a fast cryptographical string hash function for hash table use, take the AES intrinsic, and not siphash or highwayhash. This is stronger and faster.

can you prove that point? indeed, AES-128 algorithms is well studied, but it needs 10 rounds, i.e. 10 AESENC instructions per each 128 bits of result, making it actually slower than HighwayHash. Singe AESENC instruction, as used by some hashes, doesn't guarantee anything.

CRC32-C, which is also the fastest when using the intel/arm intrinsics

FARSH is even faster. Using AVX2, it proceeds data in 32-byte chunks rather than 8-byte ones.

I claimed that I can find the seed with solvers,

Since you don't plan to publish your results, i think we can mute this part of discussion.

Other than being exploitable and reversible as shown in various papers and talks.

can you please present these sources?

https://www.eng.tau.ac.il/~yash/C2_039_Wool.pdf

It researches timing attack for recovery of "13-bit Secret Values", i.e. 13-bit seed. SipHash uses 64-bit seeds. Their conclusion is:
"We have demonstrated that a remote algorithmic complexity
attack, against randomized hash tables, is possible
if the secret value is chosen from a small enough
space. More secret bits cause more effort, time and
space to be consumed in the information gathering
stage. Thus, it seems that a random value of 32
bits would render this attack impractical with today’s
technology"
So i think they poved that SipHash isn't vulnreable to their attack.

Sorry, after checking 2 of your sources i don't see any reasons to hear you anymore.

from highwayhash.

rurban commented on May 5, 2024

@Bulat-Ziganshin

can easily be improved by using solvers. I used cbmc.
can you show how to restore hash seed from hash values, using solver or any other technique you like? i think it's a principal point. Both PRFs made a lot of work to ensure that it's impossible, the rest is more problem of applications using PRF in inappropriate way.

I restore hash seeds from a combination of exposed timing and ordering info, knowledge of the hash function and hash table, with the unknown being the seed.
I'm not an academic, my job is to protect 70% of the internet users from such attacks.
Which is about the coverage of our apps. I'm talking CISCO style exposure here.
Exposing and making it easier to perform such attacks is not in our interest.
So no, only in private, but I don't know you. Attacks are practical now already, without any solver, just not easy enough for hacker joe.

SipHash is widely used now, so hopefully cryptographers work hard on breaking it. If you have a technique breaking it, you will be respected as good cryptographer.

My point is that there's no need to break SipHash at all.
You only need to find the seed. For which practical exploits have already been shown 2009 via timing of the collisions. This is completely independent on the hash function. You can also break 256bit AES hash tables with that. Mine only performs it faster and uses much more info than this trivial scheme I linked above.

SipHash is a bit better because it mixes the seed into the box. But I need only the last few bits to match. This is not a typical hash cracking for the full 32bit or 64bit range. This is only for say 10bit, with e.g. a simple server processing JSON or whatever. That's 10 bool variables in a SAT. With cbmc or klee you just solve it for those variables, no need to translate your great hash function to Z3 or whatever. People are doing that already to break bitcoin with much more variables.

I created for my fork of perl, cperl the continuation of perl development, pre-calculated hash floods in less than 0.003s for 7 different hash functions per seed.
See https://github.com/perl11/cperl/blob/master/t/op/hashflood.t for the siphash variant, 8bit and seed=0 https://github.com/perl11/cperl/blob/master/t/op/seed-siphash-8-0.dat
and for the protection: perl11/cperl#199
No source for the creation program, sorry. There's enough data for you.

comparable timings for UNSAT of sha(sha()) are at https://jheusser.github.io/2013/02/03/satcoin.html
cbmc => 1m with much more variables and harder functions, if you are not aware of current possibilities

Practically I could find collisions in less then 4m for up to 28bit, which are hash tables of size 268_435_455, which is more than e.g. PHP allows. Up to 14bit it's trivial, 15bit need 30s, and starting with 16bit (64k keys) it needs 2-4m. That's way too much information already. 29-32bit are pretty safe, but you don't need that many keys to perform a good DoS. 8-32K is enough, which is all in the <1 minute range.

And no, I will not present this at the CCC or whatever. You just need to adjust your wording. SipHash is not a secure hash function for hash tables. It's brute forced in less than 10 seconds for up to 14bits, which is enough for a practical hash table attack. There's no need for a cryptographer here, as there's no cryptography involved. Just a lot of theatre.

from highwayhash.

jan-wassenberg commented on May 5, 2024

To expand on Bulat's observation: Wool's attack (actually from 2007) sends "enough" packets for every possible hash seed, which for 13 bit seeds (!) and 150 us roundtrips (!) took several days. For my education, can you estimate how long your solver might take to find a 64-bit seed?
If you're relying on only recovering the lower log2(table_size) bits, wouldn't folding all hash bits into the result via % defeat that?

It is surprising to see FNV1 and AES recommended. AES can be good for finalizing a hash but block ciphers are not designed to withstand related-key attacks and 128 bits are not enough of a security margin => AESNI is not (by itself) sufficient for cryptographic hashing.
Instead, it would be interesting to see FARSH/zzHash mentioned in rurban/smhasher.

I'll be traveling for a week and slow to reply, but can afterwards add some mention to the readme that a reasonably secure hash function is part of a hash flooding solution.

from highwayhash.

dgryski commented on May 5, 2024

@jan-wassenberg The Go runtime uses AES instructions for the internal (not user-visible) hash implementation. It is seed per-table.

https://github.com/golang/go/blob/master/src/runtime/asm_amd64.s#L886

from highwayhash.

rurban commented on May 5, 2024

@jan-wassenberg: timings attacks need a lot of time, yes. the primary information is the ordering, which only drives the timing for confirmation or rejection. a good strategy involves a model to find the search tree, hence a solver. maybe you get the idea. locally (without the 150ms roundtrip) it needs 5s.

smhasher is mostly expanding on hash table security, not hash function security. smhasher itself is not of much help for hash table functions at all, more for block digests. but since people were using it to evaluate hash functions for hash tables I had to debunk those claims there. "reasonable secure" for you is not providing security in the current worst case, an attack with known seed, function and collision resolution scheme. esp. since there are faster and better schemes available. there's no need to cripple you language or server with a slow hash function for the worst case only, which is trivially defended otherwise.
A hash function is part of the speed and quality of a hash table, and part of the insecurity of a hash table, but never part of a hash flooding solution. You need to fork it to add your wrong and dangerous but very widespread claims.

from highwayhash.

Bulat-Ziganshin commented on May 5, 2024

@jan-wassenberg afair, the fastest hash in rurban's survey uses AESENC instruction as its way to shuffle bits, just like you are using PMUL+PSHUFB. My wild guess is that rurban doesn't know semantics of this operation and believes that single AESENC instruction performs full 10 rounds of AES-128.

and this Go implementation uses 3 rounds of AES hashing.

from highwayhash.

rurban commented on May 5, 2024

@Bulat-Ziganshin Do you see my AES there? No, because I didn't write one yet.
I wrote the CRC part with a couple of rounds for the missing bits. A single round of AESENC does better mixing than CRC32 already, so it should be better than it. For hash tables a single xor-add or mult is enough. Every cycle counts. More collisions are no problem when the cycle count is better. collisions should be in the cache anyway.

https://github.com/golang/go/blob/master/src/runtime/asm_amd64.s#L887 looks like massive overhead for their hash table, but good to have for crypto.

from highwayhash.

infinity0 commented on May 5, 2024

Yes, we got that several posts earlier. The point is, it is incorrect and dangerous to say "$hashfunction protects against hash flooding" because the latter is an attack on hash table implementations and not hash functions. It would be like saying "AES protects against CPA/CCA attacks", no AES inside an appropriate block cipher mode of operation is what protects against CPA/CCA.

from highwayhash.

rurban commented on May 5, 2024

@infinity0 exactly. Not a single hash function can be strong enough to protect against seed exposure in hash tables. 32 or 64 bit can never be secure.
The seed is not protectable by any hash function. It is usually trivial to expose it.

E.g. in debian perl you get the seed at process startup via

$ PERL_HASH_SEED_DEBUG=1 /usr/bin/perl -e0
=> HASH_FUNCTION = ONE_AT_A_TIME_HARD HASH_SEED = 0xd12d459fc36db4cf PERTURB_KEYS = 1 (RANDOM)

to stderr.

On centos7:

$ PERL_HASH_SEED_DEBUG=1 /usr/bin/perl -e0
=> HASH_SEED = 10452142639498245987

For a running perl process the seed is at a fixed offset from my_perl, the first argument to any VM op. Which is easily readable via some kind of poke function via the unpack "P" builtin op. See Devel::PeekPoke. Similar for all other dynamic languages exposing the value of any pointer.

There's not much need to come up with fancy attacks against the timing and order.

Perl5 at least changed the order of the iterator, cperl counts the collisions and adds a sleep on attacks, PHP limits the size of external keys to be passed to the hash table so only JSON or other formats are easily DoS-able. But more serious applications such as the linux kernel, glibc, stdlibc++ cache or DNS servers use better hash table collisions schemes than unsafe chaining or linear probing, and don't believe the SipHash authors.

from highwayhash.

dgryski commented on May 5, 2024

@jyrkialakuijala Maybe open one issue for each of the first 8 specific items listed?

from highwayhash.

jan-wassenberg commented on May 5, 2024

@infinity0 has a fair point, there isn't much discussion of the interaction between hash functions and tables. I propose some explanation in #29, comments are welcome there.

from highwayhash.

jyrkialakuijala commented on May 5, 2024

As promised earlier, I'm marking this closed. I'd like this discussion to happen, just not in this form. Please, keep every topic as separate as easily possible and maintain a tightly scoped discussion in the issues.

from highwayhash.

False security claims about highwayhash HOT 34 CLOSED

Comments (34)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent