Git Product home page Git Product logo

murmurhash.c's Introduction

murmurhash

tests

MurmurHash3 general hash bashed lookup function implementation

about

MurmurHash is a non-cryptographic hash function suitable for general hash-based lookup. This implementation implements version 3 of MurmurHash.

install

clib:

$ clib install jwerle/murmurhash.c

source:

$ git clone [email protected]:jwerle/murmurhash.c.git
$ cd murmurhash.c
$ make
$ make install

example

#include <stdlib.h>
#include <string.h>
#include <murmurhash.h>

int
main (void) {
  uint32_t seed = 0;
  const char *key = "kinkajou";
  uint32_t hash = murmurhash(key, (uint32_t) strlen(key), seed); // 0xb6d99cf8
  return 0;
}

A command line executable is also available:

$ echo -n kinkajou | murmur
3067714808
$ echo -n panda | murmur --seed=10
1406483717

api

uint32_t
murmurhash (const char *key, uint32_t len, uint32_t seed);

Returns a murmur hash of key based on seed using the MurmurHash3 algorithm.

license

MIT

murmurhash.c's People

Contributors

jwerle avatar stephenmathieson avatar chenxuqiang avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar tomat avatar Vlad Shell avatar eeliu avatar Kevin James avatar  avatar Alex Carver avatar  avatar  avatar Jacob avatar Zhaolin Ma avatar  avatar zzh avatar Yiwei Yang avatar Saeed avatar Oluwatobiloba Johnson avatar Furkan Sarıhan avatar Matthieu CORREIA avatar happy365 avatar Mingdao Liu avatar Andrew Bell avatar Mutalisk avatar jinhua luo avatar 0x7C9A avatar Hiperión avatar Jindong Zhang avatar Daria Preda avatar  avatar Rick Blundell avatar Solomon Sklash avatar Lloyd Zhou avatar  avatar Tosone avatar  avatar  avatar Emin Gadzhiev avatar Rhys avatar Matthew Hughes avatar CHEN Xiang avatar  avatar Philipp Beisel avatar Franken avatar Denis Blank avatar  avatar  avatar Headson avatar YangJun avatar Andy avatar yandaren avatar  avatar fanday avatar Sleepy Monax avatar Eric Jacobsen avatar  avatar Vladimir Gamalyan avatar Moka Schitta avatar Alone_Monkey avatar jmpews(AKA.zz) avatar  avatar  avatar genuine_ avatar Jake Besworth avatar Apostol Dadachev avatar ziyht avatar Max Justus Spransy avatar 程飞 avatar  avatar Kenneth Lim avatar Shrayas Rajagopal avatar heapwolf avatar  avatar William Casarin avatar  avatar

Watchers

James Cloos avatar  avatar

murmurhash.c's Issues

Murmur hash function test results are not as expected ?

What OS are you using (uname -a, or Windows version)?

Linux hzscn008 5.10.27-051027-generic #202103310028 SMP Thu Apr 1 02:16:48 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

What programming language are you using (C/C++/Go/Rust)?

C++

g++ --version
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

What did you expect to see and what you saw instead?

I've tested your murmurhash.c, which implements the Murmur Hash algorithm. My testing approach involves converting IPv4 addresses to 32-bit integers. After applying the hash function and modulo operator, I place them into separate buckets. Subsequently, I perform statistical analysis on the bucket counts and check for deviations.
Here's my test code,save it as test_murmur.cpp,

#include <cstdio>
#include <cstdlib>
#include <cstring>
#include "murmurhash.h"
#include <sys/socket.h>
#include <arpa/inet.h>
#include <map>
#include <iostream>
#include <numeric>
#include <algorithm>



int main (int argc, char **argv) 
{
  if (argc != 2)
  {
    printf("Usage: %s buckets_num\n", argv[0]);
    return 1;
  }
  int port_num = atoi(argv[1]);
  struct in_addr netAddr;
  const char *ip_address_str = "192.168.8.1" ;
  if (inet_pton(AF_INET, ip_address_str, &netAddr) <= 0) {
      perror("inet_pton");
      return 1;
  }
  uint32_t start = ntohl(netAddr.s_addr);
  uint32_t hash_value = 0;
  char *ip = NULL;
  std::map<uint32_t, uint32_t> myMap;
  uint32_t port = 0;

  for (uint32_t i = start; i< start+100000;i++)
  {
    hash_value = murmurhash((const char *)&i, sizeof(i), 0x8edd);
    port = hash_value % port_num;
    myMap[port]++;
  }

  //get max
  auto it_max = std::max_element(myMap.begin(), myMap.end(), [](const auto& lhs, const auto& rhs) {
      return lhs.second < rhs.second;
  });

  //get min
  auto it_min = std::min_element(myMap.begin(), myMap.end(), [](const auto& lhs, const auto& rhs) {
      return lhs.second < rhs.second;
  });
  // get total
  int sum = std::accumulate(myMap.begin(), myMap.end(), 0, [](int acc, const auto& p) {
      return acc + p.second;
  });

  size_t size = myMap.size();

  // get average
  double average = sum / size;

  std::cout <<"positive deviation: " << (it_max->second- average)/average*100 << "%" <<std::endl;
  std::cout <<"negative deviation: " << (it_min->second- average)/average*100 << "%" <<std::endl;

  return 0;
}

To compile:

g++ -o test_murmur test_murmur.cpp  murmurhash.c  -std=c++17

To run:

./test_murmur 512   # (buckets number)

Since the Murmur algorithm has passed chi-squared and avalanche tests, I initially assumed that the counts of each bucket would be almost the same, resulting in a deviation close to zero. However, as the total number of buckets increases, the deviation becomes significantly larger. For instance, when the number of buckets is set to 512, the deviation is as follows:

Positive deviation: 18.9744%
Negative deviation: -21.0256%

Do you have any insights on my test results? I don't believe the deviation is acceptable. What do you think? Thanks in advance .

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.