
flat_hash_map's People

Contributors: skarupke


flat_hash_map's Issues

Concurrent Access

I intend to start using your bytell_hash_map in my project, so my question applies to it specifically, but I assume the answer will apply to all your hash maps...

Are concurrent reads from multiple threads thread-safe? I skimmed the find implementation and see no reason why reads would not be.

I (with 99% certainty) assume writes are not.

Error building on the MacOS 10.14 SDK

Clang gives me this error:

bytell_hash_map.hpp:644:74: error: 'auto' not allowed in lambda parameter
std::sort(depth_in_chain.begin(), depth_in_chain.end(), [](const auto & a, const auto & b) { return a.first < b.first; });
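Generic lambdas (`auto` parameters) require C++14, so this error appears when the SDK's toolchain compiles as C++11. One local workaround, sketched here with a hypothetical stand-in for the `depth_in_chain` container, is a small function object templated on the element type:

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// C++11-compatible replacement for the generic lambda: the template lives
// on the call operator instead of the lambda's parameter list.
struct FirstLess {
    template <typename P>
    bool operator()(const P& a, const P& b) const { return a.first < b.first; }
};

// Illustrative stand-in for the depth_in_chain sort from the error message.
std::vector<std::pair<int, int>> sort_by_first(std::vector<std::pair<int, int>> v) {
    std::sort(v.begin(), v.end(), FirstLess());
    return v;
}
```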

Usage of static data in empty_default_table causes issues across modules

These two pieces of code cause issues when the flat_hash_map is used across modules (e.g. being shared between a DLL and the executable, which is a common use case for me):

    static sherwood_v3_entry * empty_default_table()
    {
        static sherwood_v3_entry result[min_lookups] = { {}, {}, {}, {special_end_value} };
        return result;
    }
    void deallocate_data(EntryPointer begin, size_t num_slots_minus_one, int8_t max_lookups)
    {
        if (begin != Entry::empty_default_table())
        {
            AllocatorTraits::deallocate(*this, begin, num_slots_minus_one + max_lookups + 1);
        }
    }

This is because each module will have its own static empty default table, which will cause an issue with deallocation, corrupting the heap.

I assume there are other pathological cases, e.g. two threads could initialize that static at the same time, resulting in two hashtables with different defaults and eventually leading to a heap corruption.

I have successfully worked around the issue by tagging the first sherwood_v3_entry with a flag indicating it's the default table, and checking for that instead. This costs one byte if T has an alignment of 1 byte (otherwise the flag should fit in padding space anyway). There might be a better solution, but I'm not familiar enough with the code to find it!

Here's a link to my changes on my own repo if they're of any use.
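The tag-byte idea above can be sketched in miniature. `Entry`, `is_default_table`, and `should_deallocate` are illustrative names, not the actual ska internals:

```cpp
#include <cstdint>

// Hypothetical simplified entry: a flag byte marks the empty default table,
// so deallocation tests the flag rather than comparing against a static
// pointer that is duplicated per module (DLL vs. executable).
struct Entry {
    int8_t distance = -1;
    bool is_default_table = false;  // assumption: a spare byte is available
};

inline Entry* empty_default_table() {
    static Entry table[4] = {};
    table[0].is_default_table = true;
    return table;
}

// Works even if each module has its own copy of the static table, because
// every copy carries the flag.
inline bool should_deallocate(Entry* begin) {
    return !begin[0].is_default_table;
}
```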

Missing end(size_t n) form of iterator?

I made a quick attempt to add this into my codebase as a drop-in replacement for std::unordered_map. Mostly it worked fine but there were a few errors. I haven't tracked them all down, but one of them is:

         for (TableMap::local_iterator it = tables.begin(bucket); it != tables.end(bucket); it++) {
              ^~~~~~~~
 ...
     const_iterator end() const
                    ^~~
ska/bytell_hash_map.hpp:402:20: note:   candidate expects 0 arguments, 1 provided

It's trying to use this form of end() from http://www.cplusplus.com/reference/unordered_map/unordered_map/end/

local_iterator end (size_type n);
const_local_iterator end (size_type n) const;

Thanks!
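For reference, the bucket-local interface the compiler is looking for is part of std::unordered_map; the sketch below shows its shape on the standard container. Since the ska maps do not expose buckets, porting such code usually means replacing the per-bucket loop with a single whole-map traversal:

```cpp
#include <cstddef>
#include <unordered_map>

// Counts elements bucket by bucket using begin(n)/end(n), the overloads
// the error message above is about. A drop-in ska port would instead
// iterate the whole map once with begin()/end().
inline std::size_t count_via_buckets(const std::unordered_map<int, int>& m) {
    std::size_t total = 0;
    for (std::size_t b = 0; b < m.bucket_count(); ++b)
        for (auto it = m.begin(b); it != m.end(b); ++it)
            ++total;
    return total;
}
```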

Creation of default values can be incompatible with convertible_to_value structure

This code will not compile (in MSVC 2019 or clang 12)

#include <string>
#include <ska_flat_hash_map.hpp>

struct Value
{
    std::string val;
    Value() {}
    template<typename T> Value(const T& t) : val(std::to_string(t)) {}
};

int main(int, char**)
{
    ska::flat_hash_map<int, Value> m;
    m[10] = 42;

    return 0;
}

What ends up happening is that internally it routes through the templated constructor, and std::to_string<ska::flat_hash_map<int,Value,ska::detailv3::IntegerIdentityHash,std::equal_to<int>,std::allocator<std::pair<int,Value>>>::convertible_to_value> is instantiated, which does not compile.

Perhaps instead of going through the conversion operator, an explicit default-construction call could happen, ensuring the default constructor is called.
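A user-side workaround, sketched here without changing the map: constrain the greedy template constructor so it only accepts arithmetic types. The map's internal proxy type then no longer matches that overload, and default construction falls through to Value():

```cpp
#include <string>
#include <type_traits>

// Same Value as in the repro above, but the converting constructor is
// SFINAE-constrained to arithmetic types, so non-arithmetic proxy types
// (like the map's convertible_to_value) are rejected at overload resolution.
struct Value {
    std::string val;
    Value() {}
    template <typename T,
              typename std::enable_if<std::is_arithmetic<T>::value, int>::type = 0>
    Value(const T& t) : val(std::to_string(t)) {}
};
```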

sherwood_v8_block::empty_block() oddities

I noticed that sometimes sherwood_v8_table::emplace() would fail even when the instance was empty (no elements). It turns out this happens because sherwood_v8_block::empty_block() would sometimes return a different pointer than you would get if you, say, printed it in the constructor. When that happened, sherwood_v8_table::emplace() would end up invoking rehash(), which would in turn invoke deallocate_data(); there, (begin == BlockType::empty_block()) would evaluate to false, so it would try to deallocate memory that was never allocated.

It made no sense, and while I did put some time into trying to understand it, nothing immediately obvious turned up, and it was only reproducible under specific conditions. I suspect it manifests when dealing with DLLs/DSOs. One quick fix involved changing the definition to:

static inline sherwood_v8_block *empty_block() {
    static std::unique_ptr<int8_t[]> data([] {
        auto ptr = new int8_t[BlockSize];
        std::fill(ptr, ptr + BlockSize, sherwood_v8_constants<>::magic_for_empty);
        return ptr;
    }());
    return reinterpret_cast<sherwood_v8_block *>(data.get());
}

Feature: Use fastrange for hash_policy

fastrange provides a fast method to map machine words to a given range. This could be used as part of a hash policy. I'm not certain that it would be faster than the existing Fibonacci hash policy, as it would require 128-bit arithmetic if size_t is 64 bits wide.
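For concreteness, fastrange (Lemire's trick) maps a word into [0, p) with one widening multiply and a shift. The sketch below uses the `__uint128_t` extension available in GCC/Clang; MSVC would need `_umul128` instead:

```cpp
#include <cstdint>

// Lemire's fastrange: maps a 64-bit word to [0, p) without a modulo.
// The high 64 bits of the 128-bit product word*p are uniformly spread
// over [0, p) when word is uniform over the full 64-bit range.
inline uint64_t fastrange64(uint64_t word, uint64_t p) {
    return static_cast<uint64_t>((static_cast<__uint128_t>(word) * p) >> 64);
}
```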

Bug: Fails to compile when size_t is not the same size as unsigned long long

The implementation uses size_t, but flat_hash_map requires this type to be the same size as unsigned long long. That is not true on all platforms (e.g. Emscripten), so compiling produces a hard error for narrowing conversions from unsigned long long to an unsigned type of smaller size.
The bug is easily fixed by defining

namespace ska { using size_t = unsigned long long; }

and replacing size_t with ska::size_t in the implementation.

Also note that when size_t is 4 bytes wide, one gets a warning for overflow in some expressions.

constexpr problem in Visual Studio 2015

I am getting an "expression did not evaluate to a constant" error on line 217 in Visual Studio 2015 Update 3:

213    template<typename T>
214    struct EntryDefaultTable
215    {
216        static constexpr const sherwood_v3_entry_constexpr<T> table[min_lookups] =
217        {
218            {}, {}, {}, sherwood_v3_entry_constexpr<T>::special_end_entry()
219        };
220    };

I checked, and Microsoft's constexpr support may be problematic for some kinds of usage; they have already fixed some bugs. So I was wondering if there is a workaround for this piece of code. Alternatively, maybe I am missing a compile flag. Thanks.

Use uint64_t instead of size_t?

The code uses size_t in various places, assuming it is at least 64 bits in size. This is not always true. Any chance we could make this more 32-bit-friendly?

bytell_hash_map: Duplicate item found while traversing map after calling erase()

Hello, while using bytell_hash_map I came across this issue: after erasing an item while traversing the map, I get the same item two times. Here is a small repro case:


    typedef ska::bytell_hash_map<const void*, int> StaticObjectsToUpdate;
    //typedef std::unordered_map<const void*, int> StaticObjectsToUpdate;
    StaticObjectsToUpdate testMap;
    int iteration = 0;
    while (1)
    {
        iteration++;
        // Insert some items in the map
        for (int i = 0; i < 10; i++)
        {
            void *ptr = malloc(4);
            testMap[ptr] = 3;
        }
        // Process
        std::vector<const void *> tmpList;
        for (auto it = testMap.begin(); it != testMap.end();)
        {
            tmpList.push_back(it->first);
            auto& nbFrames = it->second;
            if (--nbFrames == 0)
            {
                it = testMap.erase(it);
            }
            else
            {
                it++;
            }
        }

        // Test consistency : look for duplicates
        for (int i = 0; i < tmpList.size(); i++)
        {
            for (int j = i + 1; j < tmpList.size(); j++)
            {
                if (tmpList[i] == tmpList[j])
                {
                    printf("iteration %d: same item found twice: %p.\n", iteration, tmpList[i]);  // Should not happen
                    exit(0);
                }
            }
        }
    }

The issue does not happen with flat_hash_map or std::unordered_map.

SIGSEGV when using "-O2 -DNDEBUG" compile options

I'm getting the following error with compiler optimizations:

__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff66af801 in __GI_abort () at abort.c:79
#2  0x00007ffff66f8897 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff6825b9a "%s\n") at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007ffff66ff90a in malloc_printerr (str=str@entry=0x7ffff6823d88 "free(): invalid pointer") at malloc.c:5350
#4  0x00007ffff6706e1c in _int_free (have_lock=0, p=0x7fffeffff138 <viya::util::tu_names+216>, av=0x7ffff6a5ac40 <main_arena>) at malloc.c:4157
#5  __GI___libc_free (mem=0x7fffeffff148 <ska::detailv8::sherwood_v8_block<std::pair<Dimensions, unsigned long>, (unsigned char)8>::empty_block()::empty_bytes>)
    at malloc.c:3124
#6  0x00007fffefb78513 in std::pair<ska::detailv8::sherwood_v8_table<std::pair<Dimensions, unsigned long>, Dimensions, DimensionsHasher, ska::detailv3::KeyOrValueHasher<Dimensions, std::pair<Dimensions, unsigned long>, DimensionsHasher>, std::equal_to<Dimensions>, ska::detailv3::KeyOrValueEquality<Dimensions, std::pair<Dimensions, unsigned long>, std::equal_to<Dimensions> >, std::allocator<std::pair<Dimensions, unsigned long> >, std::allocator<unsigned char>, (unsigned char)8>::templated_iterator<std::pair<Dimensions, unsigned long> >, bool> ska::detailv8::sherwood_v8_table<std::pair<Dimensions, unsigned long>, Dimensions, DimensionsHasher, ska::detailv3::KeyOrValueHasher<Dimensions, std::pair<Dimensions, unsigned long>, DimensionsHasher>, std::equal_to<Dimensions>, ska::detailv3::KeyOrValueEquality<Dimensions, std::pair<Dimensions, unsigned long>, std::equal_to<Dimensions> >, std::allocator<std::pair<Dimensions, unsigned long> >, std::allocator<unsigned char>, (unsigned char)8>::emplace_direct_hit<Dimensions&, unsigned long>(ska::detailv8::sherwood_v8_table<std::pair<Dimensions, unsigned long>, Dimensions, DimensionsHasher, ska::detailv3::KeyOrValueHasher<Dimensions, std::pair<Dimensions, unsigned long>, DimensionsHasher>, std::equal_to<Dimensions>, ska::detailv3::KeyOrValueEquality<Dimensions, std::pair<Dimensions, unsigned long>, std::equal_to<Dimensions> >, std::allocator<std::pair<Dimensions, unsigned long> >, std::allocator<unsigned char>, (unsigned char)8>::LinkedListIt, Dimensions&, unsigned long&&) ()

This issue doesn't happen in debug mode.

C++17-compliant try_emplace

We are currently evaluating bytell_hash_map in our research database and found situations where moving away from the C++17-introduced try_emplace hurts performance in some places.

Is there a plan to update the map to fully implement the current C++17 interface?

Compiler warning: Null pointer arithmetic has undefined behavior

unordered_map.hpp:425:46: warning: performing pointer arithmetic on a null pointer has undefined behavior if the offset is nonzero [-Wnull-pointer-arithmetic]
        *new_buckets = EntryPointer(nullptr) + ptrdiff_t(1);

That line looks harmless to me, but undefined behavior should be avoided if at all possible.
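One well-defined alternative to `nullptr + 1` is to point at a static sentinel object instead. This is a hedged sketch of the general pattern, not the actual fix for unordered_map.hpp; `Sentinel` and `empty_bucket_marker` are illustrative names:

```cpp
// A unique, non-null marker pointer with no pointer arithmetic on nullptr:
// the address of a static object is distinct from every real allocation
// and from nullptr, so comparisons against it stay well defined.
struct Sentinel { int unused; };

inline Sentinel* empty_bucket_marker() {
    static Sentinel s;
    return &s;  // never dereferenced as a real entry, only compared
}
```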

Consider adjusting max_lookups linearly with the number of elements

flat_hash_map does a great job, and I'm considering using it in my project, but I'm very concerned about a bad hash function exploding memory usage. Is it possible to set max_lookups linear in the number of elements, for example max_lookups = num_elements / 8?

optimized clear for trivially_destructable entries

Since distance_from_desired is int8_t, this works as expected.

private:
    inline void clear(std::true_type)
    {
        // distance_from_desired is int8_t, so filling with -1 marks every
        // slot empty. Note the fill must cover the whole slot array, not
        // just num_elements entries, because occupied slots are scattered.
        std::memset(entries, -1,
                    (num_slots_minus_one + max_lookups) * sizeof(Entry));
        num_elements = 0;
    }

    inline void clear(std::false_type)
    {
        for (EntryPointer it = entries, end = it + static_cast<ptrdiff_t>(num_slots_minus_one + max_lookups); it != end; ++it)
        {
            if (it->has_value())
                it->destroy_value();
        }
        num_elements = 0;
    }

public:
    void clear()
    {
        this->clear(typename std::is_trivially_destructible<T>::type{});
    }

benchmarks and unpublished hash tables

Hi, are the benchmarks you used to compare your hash tables against the others, and the unpublished hash tables (like your implementation of Google's Swiss table), available somewhere? I'd like to be able to reproduce your results.

Merge method

Can you please add a merge method to merge 2 bytell unordered maps? I need it for a project.

Iteration Order

Currently, when you iterate over the nodes in the map, they are not returned in the order they were inserted into the map.

Is there an easy tweak to make these orderings match up?
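There is no such tweak inside the map itself; like std::unordered_map, the ska maps make no ordering promise. The usual workaround is to keep a side vector of keys, sketched here with std::unordered_map standing in for ska::flat_hash_map:

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Pairs a hash map with a vector recording first-insertion order.
// Iteration over keys_in_insertion_order() yields keys in the order
// they were first inserted; lookups still go through the hash map.
class OrderedMap {
public:
    void insert(const std::string& key, int value) {
        if (map_.emplace(key, value).second)
            order_.push_back(key);  // only record genuinely new keys
    }
    const std::vector<std::string>& keys_in_insertion_order() const {
        return order_;
    }
private:
    std::unordered_map<std::string, int> map_;
    std::vector<std::string> order_;
};
```

Erasure support would also need to remove the key from the vector (or tombstone it), which is the usual cost of this design.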

Bug

Line 1218 in flat_hash_map.hpp: why `std::end - 1` and not just `std::end`?

Benchmark against tsl::hopscotch map

Benchmark against @Tessil's hopscotch-map.

/u/rtessil wrote here:

I Wrote The Fastest Hashtable
Really interesting article, but the title is a bit of a stretch, no?

I quickly compared ska::flat_hash_map (with prime and power of two) to the tsl::hopscotch_map I wrote.
Performance is roughly the same (a bit faster for the power-of-two version on integers, but slower on small std::string; no indirection, as they are small enough for SSO).
The main difference is that ska::flat_hash_map and ska::flat_hash_map with power of two need, respectively, ~50% and ~100% more memory than tsl::hopscotch_map (raw results below):

And provided these benchmarks (description):

# For a description of each test see https://tessil.github.io/2016/08/29/benchmark-hopscotch-map.html
# bench type, nb elements, hash_map, vsz memory, rss memory, nb seconds
sequential,5000000,hopscotch_map,214777856,204021760,0.205127
sequential,5000000,ska_flat_hash_map,329515008,318545920,0.295789
sequential,5000000,ska_flat_hash_map_power_of_two,416067584,405176320,0.333548
sequential,10000000,hopscotch_map,416104448,405319680,0.407490
sequential,10000000,ska_flat_hash_map,645595136,634777600,0.587404
sequential,10000000,ska_flat_hash_map_power_of_two,818720768,807858176,0.662674
sequentialread,5000000,hopscotch_map,214777856,203988992,0.023264
sequentialread,5000000,ska_flat_hash_map,329515008,318726144,0.023356
sequentialread,5000000,ska_flat_hash_map_power_of_two,416067584,405282816,0.015604
sequentialread,10000000,hopscotch_map,416104448,405188608,0.047545
sequentialread,10000000,ska_flat_hash_map,645595136,634810368,0.046600
sequentialread,10000000,ska_flat_hash_map_power_of_two,818720768,807870464,0.031345
randomshufflerange,5000000,hopscotch_map,214777856,204120064,0.565164
randomshufflerange,5000000,ska_flat_hash_map,329515008,318472192,0.561359
randomshufflerange,5000000,ska_flat_hash_map_power_of_two,416067584,405020672,0.611901
randomshufflerange,10000000,hopscotch_map,416104448,405102592,1.149926
randomshufflerange,10000000,ska_flat_hash_map,645595136,634875904,1.136794
randomshufflerange,10000000,ska_flat_hash_map_power_of_two,818720768,807903232,1.236914
randomshufflerangeread,5000000,hopscotch_map,214777856,203952128,0.111159
randomshufflerangeread,5000000,ska_flat_hash_map,329515008,318545920,0.112118
randomshufflerangeread,5000000,ska_flat_hash_map_power_of_two,416067584,405213184,0.091952
randomshufflerangeread,10000000,hopscotch_map,416104448,405262336,0.228280
randomshufflerangeread,10000000,ska_flat_hash_map,645595136,634765312,0.228594
randomshufflerangeread,10000000,ska_flat_hash_map_power_of_two,818720768,807649280,0.189817
randomfull,5000000,hopscotch_map,214777856,203980800,0.741374
randomfull,5000000,ska_flat_hash_map,329515008,318652416,0.848784
randomfull,5000000,ska_flat_hash_map_power_of_two,416067584,405086208,0.721192
randomfull,10000000,hopscotch_map,416104448,405307392,1.511382
randomfull,10000000,ska_flat_hash_map,645595136,634781696,1.775507
randomfull,10000000,ska_flat_hash_map_power_of_two,818720768,807870464,1.461587
randomfullread,5000000,hopscotch_map,214777856,203935744,0.166661
randomfullread,5000000,ska_flat_hash_map,329515008,318492672,0.155278
randomfullread,5000000,ska_flat_hash_map_power_of_two,416067584,405176320,0.125470
randomfullread,10000000,hopscotch_map,416104448,405381120,0.348137
randomfullread,10000000,ska_flat_hash_map,645595136,634703872,0.331349
randomfullread,10000000,ska_flat_hash_map_power_of_two,818720768,807993344,0.275156
randomfullreadmiss,5000000,hopscotch_map,214777856,204025856,0.183873
randomfullreadmiss,5000000,ska_flat_hash_map,329515008,318795776,0.184358
randomfullreadmiss,5000000,ska_flat_hash_map_power_of_two,416067584,405172224,0.158098
randomfullreadmiss,10000000,hopscotch_map,416104448,405348352,0.394217
randomfullreadmiss,10000000,ska_flat_hash_map,645595136,634699776,0.375168
randomfullreadmiss,10000000,ska_flat_hash_map_power_of_two,818720768,808001536,0.329822
iteration,5000000,hopscotch_map,214777856,203878400,0.044455
iteration,5000000,ska_flat_hash_map,329515008,318664704,0.069848
iteration,5000000,ska_flat_hash_map_power_of_two,416067584,405041152,0.073894
iteration,10000000,hopscotch_map,416104448,405123072,0.089261
iteration,10000000,ska_flat_hash_map,645595136,634687488,0.140058
iteration,10000000,ska_flat_hash_map_power_of_two,818720768,807886848,0.148142
delete,5000000,hopscotch_map,214777856,204013568,0.221552
delete,5000000,ska_flat_hash_map,329515008,318623744,0.194737
delete,5000000,ska_flat_hash_map_power_of_two,416067584,405102592,0.155010
delete,10000000,hopscotch_map,416104448,405127168,0.476158
delete,10000000,ska_flat_hash_map,645595136,634785792,0.427231
delete,10000000,ska_flat_hash_map_power_of_two,818720768,807890944,0.363406
insertsmallstring,5000000,hopscotch_map,416100352,405565440,1.860058
insertsmallstring,5000000,ska_flat_hash_map,645595136,634609664,2.281131
insertsmallstring,5000000,ska_flat_hash_map_power_of_two,818720768,807956480,2.103567
insertsmallstring,10000000,hopscotch_map,818753536,808095744,3.881970
insertsmallstring,10000000,ska_flat_hash_map,1277755392,1266814976,4.807635
insertsmallstring,10000000,ska_flat_hash_map_power_of_two,1624027136,1613348864,4.331490
readsmallstring,5000000,hopscotch_map,416100352,405331968,1.129693
readsmallstring,5000000,ska_flat_hash_map,645595136,634679296,1.181751
readsmallstring,5000000,ska_flat_hash_map_power_of_two,818720768,807985152,1.220172
readsmallstring,10000000,hopscotch_map,818753536,807899136,2.476885
readsmallstring,10000000,ska_flat_hash_map,1277755392,1266974720,2.573642
readsmallstring,10000000,ska_flat_hash_map_power_of_two,1624027136,1613090816,2.565422
readsmallstringmiss,5000000,hopscotch_map,416100352,405188608,1.123353
readsmallstringmiss,5000000,ska_flat_hash_map,645595136,634871808,1.166329
readsmallstringmiss,5000000,ska_flat_hash_map_power_of_two,818720768,808042496,1.174861
readsmallstringmiss,10000000,hopscotch_map,818753536,808058880,2.341009
readsmallstringmiss,10000000,ska_flat_hash_map,1277755392,1266839552,2.549118
readsmallstringmiss,10000000,ska_flat_hash_map_power_of_two,1624027136,1613049856,2.503907
deletesmallstring,5000000,hopscotch_map,416100352,405569536,1.176642
deletesmallstring,5000000,ska_flat_hash_map,645595136,634609664,1.198479
deletesmallstring,5000000,ska_flat_hash_map_power_of_two,818720768,808034304,1.175356
deletesmallstring,10000000,hopscotch_map,818753536,807882752,2.451259
deletesmallstring,10000000,ska_flat_hash_map,1277755392,1267023872,2.659608
deletesmallstring,10000000,ska_flat_hash_map_power_of_two,1624027136,1613295616,2.585078
insertstring,5000000,hopscotch_map,814391296,803540992,3.284603
insertstring,5000000,ska_flat_hash_map,1043988480,1033060352,3.503130
insertstring,5000000,ska_flat_hash_map_power_of_two,1217126400,1206075392,3.535146
insertstring,10000000,hopscotch_map,1617141760,1606082560,6.957676
insertstring,10000000,ska_flat_hash_map,2076110848,2065018880,7.598840
insertstring,10000000,ska_flat_hash_map_power_of_two,2422394880,2411331584,7.648912
readstring,5000000,hopscotch_map,814391296,803512320,2.277258
readstring,5000000,ska_flat_hash_map,1043988480,1033179136,2.329161
readstring,5000000,ska_flat_hash_map_power_of_two,1217126400,1206403072,2.267905
readstring,10000000,hopscotch_map,1617141760,1606356992,5.113229
readstring,10000000,ska_flat_hash_map,2076110848,2065104896,5.224593
readstring,10000000,ska_flat_hash_map_power_of_two,2422394880,2411315200,4.914819
readstringmiss,5000000,hopscotch_map,814391296,803504128,1.778059
readstringmiss,5000000,ska_flat_hash_map,1043988480,1033183232,1.759323
readstringmiss,5000000,ska_flat_hash_map_power_of_two,1217126400,1206112256,1.700586
readstringmiss,10000000,hopscotch_map,1617141760,1606344704,3.066333
readstringmiss,10000000,ska_flat_hash_map,2076110848,2065137664,3.337325
readstringmiss,10000000,ska_flat_hash_map_power_of_two,2422394880,2411384832,3.240746
deletestring,5000000,hopscotch_map,814391296,803745792,2.398370
deletestring,5000000,ska_flat_hash_map,1043988480,1032884224,2.618392
deletestring,5000000,ska_flat_hash_map_power_of_two,1216991232,1206231040,2.339883
deletestring,10000000,hopscotch_map,1617141760,1606242304,5.210573
deletestring,10000000,ska_flat_hash_map,2076110848,2065301504,5.291985
deletestring,10000000,ska_flat_hash_map_power_of_two,2422394880,2411339776,5.238803

Test for optimizable prime numbers

I really enjoyed your article. Here is an idea for a further improvement (for lack of a better place to post it.)

For each prime_index, you could automatically test several prime numbers around the ideal value to see which one the compiler can optimize best.

For example, what if the integer modulo operation using one of the other primes near 14480561146010017169ll (say, among the ones that are less than 1% off from the ideal value, which will be a lot of primes) can be compiled into x64 assembly code that is twice as fast? Ten times faster? Wouldn't it make sense to pick that as the compile-time default?

Considering the black magic that goes on in these kinds of compiler optimizations, it wouldn't surprise me to see a huge variation in the assembly code generated for different primes (and by different compilers too.)

reinitialize hasher on growth

@skarupke, would it be possible to reinitialize the hasher on growth (e.g. by re-assigning a default-constructed hasher to it)?

For state-less hashers, this would be a no-op.

But I think (worth checking) that this would allow a stateful hasher with a dynamically initialized seed to change its seed after each growth. This would make exploiting the primes to construct a malicious input sequence (which is already very hard) impractical without full access to the target's memory.

Not that this hash table should aim to be "DoS resistant" (there are probably better hash tables for that), but it is always nice to give the user a choice. Comparing this table with a stateful hasher against secure-by-design hash tables would make for another excellent blog post of yours.
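A stateful, seedable hasher of the kind proposed above could look like this. The mixing steps are illustrative (borrowed from the MurmurHash3 finalizer), not the table's actual hash policy; re-assigning a freshly seeded instance after each growth would remap every key:

```cpp
#include <cstddef>
#include <functional>

// Sketch of a seeded hasher: same key, different seed, different slot.
// The finalizer-style mixing is a bijection, so distinct seeded inputs
// always produce distinct outputs.
struct SeededHash {
    std::size_t seed;
    explicit SeededHash(std::size_t s = 0x9e3779b97f4a7c15ull) : seed(s) {}
    std::size_t operator()(std::size_t key) const {
        std::size_t h = std::hash<std::size_t>{}(key) ^ seed;
        h ^= h >> 33;                   // spread the seed into all bits
        h *= 0xff51afd7ed558ccdull;
        h ^= h >> 33;
        return h;
    }
};
```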

Beat the fastest implementation of the benchmark's game k-nucleotide benchmark

Scott wrote here:

It would be nice to add some more maps to the tests because it doesn’t seem to be the fastest one.

I took the k-nucleotide test, which has a heavy read load on a hash map (http://benchmarksgame.alioth.debian.org/u64q/program.php?test=knucleotide&lang=gpp&id=3), and changed `unsigned N = sysconf (_SC_NPROCESSORS_ONLN);` to `unsigned N = 1;` to make it run on one core only.

g++ -O3 -std=c++14 knucleotide.cpp -o knucleotide -lpthread
time ./knucleotide < input25000000.txt

__gnu_pbds::cc_hash_table: 0m18.297s
std::unordered_map: 0m38.767s
tsl::hopscotch_map (seems to do well from a post on reddit): 0m19.288s
ska::flat_hash_map: 0m25.177s
ska::flat_hash_map power of two: 0m24.454s

Couldn't make it work with google::dense_hash_map.

a bug in emplace_new_key() at line 824 flat_hash_map.hpp

I have been studying your code; is there a bug in emplace_new_key() at line 824 of flat_hash_map.hpp? I'm not sure, please help me check.
I think the logic of emplace_new_key(int8_t distance_from_desired, EntryPointer current_entry, Key && key, Args &&... args) is to put args into current_entry and find a new bucket for current_entry->value. At line 861, you swap to_insert with result.current->value; the code is as below:
if (distance_from_desired == max_lookups)
{
    swap(to_insert, result.current->value);
    grow();
    return emplace(std::move(to_insert));
}
The logic may be wrong in the following case. Suppose there are 7 buckets with the contents below:

bucket id content
bucket1 DIB is 0, value is A
bucket2 DIB is 1, value is B
bucket3 DIB is 0, value is C
bucket4 DIB is 1, value is D
bucket5 DIB is 2, value is E
bucket6 DIB is 3, value is F
bucket7 DIB is 4, value is G

Suppose the 'current_entry' parameter of emplace_new_key() points to bucket1, to_insert's value is X, and max_lookups is 4. The code then proceeds as follows:

  1. swap(to_insert, current_entry->value): bucket1's value is now X, to_insert's value is A, and the iterator result points to bucket1.
  2. current_entry shifts to bucket2, but DIB(A) is 1, so move to the next bucket.
  3. current_entry shifts to bucket3; swap(to_insert, current_entry->value): bucket3's value is now A, and to_insert's value is C.
  4. While finding a new bucket to store C, the program reaches the condition 'distance_from_desired == max_lookups' and runs to line 861. Now swap(to_insert, result.current->value) stores C in bucket1. I think it's a bug to store C in bucket1: value C cannot be stored in a bucket before bucket3.

Insert invalidate iterators

According to the C++ standard, for all associative containers: "The insert and emplace members shall not affect the validity of iterators and references to the container" [26.2.6/9].
However, I found that after inserting a list of elements, some iterator values are invalidated.

Please find below a code snippet to test this behavior:

namespace xxx {
typedef std::string key_t;
typedef std::string value_t;

typedef ska::flat_hash_map<key_t, value_t> map_t;
//typedef std::map<key_t, value_t> map_t;
typedef std::list<map_t::iterator> list_t;

}
...
std::vector<xxx::key_t> keys(MAX_SIZE);
generateKeys(keys);
...
for (auto& key : keys) {
    std::pair<xxx::map_t::iterator, bool> ret =
        dict.insert(xxx::map_t::value_type(key, keys[rand() % MAX_SIZE]));

    if (ret.second) {
        list_items.push_back(ret.first);
    }
}

for (xxx::map_t::iterator iter = dict.begin(); iter != dict.end(); ++iter) {

    xxx::list_t::iterator findIter =
        std::find(list_items.begin(), list_items.end(), iter);

    if (findIter == list_items.end()) {
        cerr << "Item does not exist\n";
    } else {
        cerr << "Item exists\n";
    }
}

missing a flat_hash_map::merge

The merge() function is part of the std::unordered_map interface since C++17: https://en.cppreference.com/w/cpp/container/unordered_map/merge

Is there an implementation-specific trick that can speed up merging two flat_hash_map tables, compared to the iteration+insertion loop? The operation may be destructive to the source container.

Some context: I am dealing with a large number of relatively small (~4 KB, page-sized) tables that regularly get merged and cleared (without deallocation). This happens often enough that the performance of the merge is critical.
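Absent a native merge(), the portable fallback is the destructive iterate-and-insert loop, sketched here with std::unordered_map standing in for ska::flat_hash_map. Following std merge semantics, keys already present in the destination win and stay behind in the source:

```cpp
#include <unordered_map>

// Destructive merge: move each element of src into dst if its key is not
// already there; colliding elements remain in src, matching the behavior
// of std::unordered_map::merge.
inline void merge_into(std::unordered_map<int, int>& dst,
                       std::unordered_map<int, int>& src) {
    for (auto it = src.begin(); it != src.end(); ) {
        if (dst.emplace(it->first, it->second).second)
            it = src.erase(it);   // transferred to dst
        else
            ++it;                 // key collision: leave it in src
    }
}
```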

templated_iterator: begin() does not always return a value

Hello,

Is this intended?

            iterator begin()
            {
                for (EntryPointer it = entries;; ++it)
                {
                    if (it->has_value())
                        return { it };
                }

                // No return value here
            }

I believe it should at least return end() or cend(), depending on the constness of the iterator.

see #8

Corner case issue when hash collision with bad hashing

Each time there is a hash collision, the set doubles in size (same problem with the map).
See the simple sample code below.
Note: no such problem with ska::unordered_set.

#include <iostream>
#include "flat_hash_map.hpp"

int main()
{
    struct hash_t
    {
        size_t operator()(int) const { return 1; }
    };
    ska::flat_hash_set<int, hash_t> set;

    for (int i = 0; i < 30; ++i)
    {
        std::cout << set.bucket_count() << std::endl;
        set.insert(i);
    }
}

Printout:
4
4
8
8
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
8388608
16777216
33554432
67108864
134217728

Unsafe with attacker controlled keys

It's possible to trigger explosive memory usage, when carefully picking keys, which causes the hash table to grow for every insertion eventually resulting in an out of memory condition.

Pseudo code:

static constexpr size_t jump_distances[]
{
  0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,

  21, 28, 36, 45, 55, 66, 78, 91, 105, 120, 136, 153, 171, 190, 210, 231,
  253, 276, 300, 325, 351, 378, 406, 435, 465, 496, 528, 561, 595, 630,
  666, 703, 741, 780, 820, 861, 903, 946, 990, 1035, 1081, 1128, 1176,
  1225, 1275, 1326, 1378, 1431, 1485, 1540, 1596, 1653, 1711, 1770, 1830,
  1891, 1953, 2016, 2080, 2145, 2211, 2278, 2346, 2415, 2485, 2556,

  3741, 8385, 18915, 42486, 95703, 215496, 485605, 1091503, 2456436,
  5529475, 12437578, 27986421, 62972253, 141700195, 318819126, 717314626,
  1614000520, 3631437253, 8170829695, 18384318876, 41364501751,
  93070021080, 209407709220, 471167588430, 1060127437995, 2385287281530,
  5366895564381, 12075513791265, 27169907873235, 61132301007778,
  137547673121001, 309482258302503, 696335090510256, 1566753939653640,
  3525196427195653, 7931691866727775, 17846306747368716,
  40154190394120111, 90346928493040500, 203280588949935750,
  457381324898247375, 1029107980662394500, 2315492957028380766,
  5209859150892887590,
};

// assume STL has a crappy hash function
struct hasher { size_t operator()(const size_t& _Keyval) const { return _Keyval; } };

template<typename T>
size_t invert_fibonacci_hash(T &a, size_t y) {
    // return a value x, so that fibonacci_hash_policy.index_for_hash(x) returns y
    // i don't provide the implementation here, but if you change
    // fibonacci_hash_policy::index_for_hash to "return hash & num_slots_minus_one;"
    // you can just return y here.
}

ska::bytell_hash_map<size_t, int, hasher> hashmap;

// Consumes around 2GB after 19 iterations and crashes with out of memory
for (size_t i = 8; i < 32; i++) {
    for (size_t j = 0; j < 126; j++)
        hashmap[invert_fibonacci_hash(hashmap, jump_distances[j])] = 1;
    hashmap[invert_fibonacci_hash(hashmap, 1 << i)] = 1;
}

Consider further optimisation on `mod_function`.

Supposing your hash values are uniformly distributed "random" values in the range of 0..2**64-1, you can skip the modulo and instead use random-number scaling techniques to distribute the result (the bucket number) evenly throughout that space. I don't think there's any reason to stick with modulo, is there?

The quickest-and-dirtiest of these would be:

size_t bucket(uint64_t hash) {
    return uint64_t(uint128_t(hash) * K >> 64);
}

But if the incoming hashes are not already uniform then try this one weird trick:

size_t bucket(uint64_t hash) {
    return crc32(hash) * K >> 32;
}

(This only works when K is less than 2**32, and you'll have to find the right intrinsic for the target to get the hardware-accelerated CRC instruction.)

This one is more thorough:

static inline uint64_t murmurmix64(uint64_t h) {
    h ^= h >> 33;
    h *= 0xff51afd7ed558ccdULL;
    h ^= h >> 33;
    h *= 0xc4ceb9fe1a85ec53ULL;
    h ^= h >> 33;

    return h;
}

size_t bucket(uint64_t hash) {
    hash = murmurmix64(hash);
    return uint64_t(uint128_t(hash) * K >> 64);
}

And this one is probably fine, covers up to 64-bit ranges, and is probably still faster than mod by a constant:

size_t bucket(uint64_t hash) {
    hash *= 0xc4ceb9fe1a85ec53ULL;
    return uint64_t(uint128_t(hash) * K >> 64);
}

Plus, it probably makes no difference if K is not constant, so you can avoid calling through a function pointer and instead load the table size into a register and perform the multiply inline.
