getting-started-with-mmdb's Introduction

Deprecated

The Perl MaxMind-DB-Writer distribution is no longer being developed. We recommend that you use the Go github.com/maxmind/mmdbwriter module instead. See our blog post on using the Go writer for more information.

Installation

Vagrant

Getting a License Key

In order to download your databases, create a .env file in the root of this directory.

cp .env-sample .env

Edit the .env file and replace the boilerplate license key with your own. If you need to generate a license key, log in to your MaxMind.com account (or create an account first) and generate a new license key by clicking "My License Key" on the left hand menu.

vagrant up

If you use Vagrant, you can get started easily. After cloning this repository, issue the following command from the top level of the repository:

vagrant up

If you are starting this Vagrant VM for the first time, you might want to make yourself a sandwich. Depending on your setup it could take 6-10 minutes before your VM is ready. Once the provisioning is finished, you can log in to your VM and start running commands.

vagrant ssh
cd /vagrant
perl examples/01-getting-started.pl

If you've followed the instructions above, you are ready to go.

perl examples/01-getting-started.pl

That's it! Read on if you want to install things manually.

Re-provision

If your vagrant up does not run to completion you can re-run it via vagrant provision. If you are upgrading from an earlier version of this repository, you'll want to rm -rf local inside this repository first, so that you'll get a fresh install of Perl modules.

Manual Installation

Caveat for Windows Users

MaxMind::DB::Writer is not currently supported on Windows Operating Systems. If you're in a Windows environment, you may want to try setting up the Vagrant VM by following the instructions above.

Perl

You'll need Perl to run the example code. Unless you're in a Windows environment, you probably already have Perl installed. Perl 5.14 or later is enough to get started. You can check your version via perl --version.

libmaxminddb

Before installing any Perl modules you'll need to install libmaxminddb.

cpm

cpm is probably the easiest Perl install tool to get up and running with. If you don't already have it, you can install it with a one-liner:

curl -fsSL --compressed https://git.io/cpm > /usr/local/bin/cpm
chmod +x /usr/local/bin/cpm
cpm --version

We've chosen to install without sudo, so that we don't interfere with any modules which the system requires.

CPAN Modules

Now that we have a tool to install our Perl modules, let's go ahead and install the modules we need to write an MMDB file. I should add the caveat that we don't currently have Windows support for our writer, so you'll need access to a *nix or Mac OS X environment to play along. If you do have a Windows machine, an Ubuntu VM or something similar will be just fine.

cpm install --cpanfile cpanfile

If you're on Mac OS X and the above install fails, you can try forcing a 64 bit architecture:

ARCHFLAGS="-arch x86_64" cpm install MaxMind::DB::Writer::Tree Net::Works::Network

Now you're ready to start running scripts:

perl examples/01-getting-started.pl

GeoLite2-City

You'll need a copy of GeoLite2-City.mmdb somewhere on your filesystem. You can download this file manually. If you need more details on how we set this up, you can look at the provision section of the Vagrantfile in the GitHub repository.

getting-started-with-mmdb's People

Contributors

autarch, danroscigno, nchelluri, oalders, oschwald

getting-started-with-mmdb's Issues

Getting weird network masking in results

I am getting weird results from running 03-iterate-search-tree.pl, especially from line 22.

say join '/', $address->as_ipv4_string, $mask_length;

I know of two formats that use similar notation: Classless Inter-Domain Routing (CIDR) and Variable Length Subnet Mask (VLSM). However, the number after the / is usually at most 32, whereas in this case it goes up to 120.
Some more results from the latest GeoLite2-City.mmdb:

1.0.0.0/120
1.0.1.0/120
1.0.2.0/119
1.0.4.0/118
1.0.8.0/117
1.0.16.0/116
1.0.32.0/115
1.0.64.0/114
1.0.128.0/113
1.1.0.0/120
1.1.1.0/120
1.1.2.0/119
1.1.4.0/118
1.1.8.0/117
1.1.16.0/116
1.1.32.0/115
1.1.64.0/114
1.1.128.0/113
1.2.0.0/119
1.2.2.0/120

My networking background is very limited, so I can't argue whether this format is right or wrong, but according to more credible people, a network mask can only be /0 to /32. https://networkengineering.stackexchange.com/questions/51636/what-format-is-1-34-128-0-113/51639#51639
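
One possible reading of these numbers, offered as an assumption rather than a confirmed answer: if the database is built as an IPv6 search tree, IPv4 networks live under ::/96, so each reported mask length would be counted out of 128 bits instead of 32. Subtracting 96 would then give the familiar IPv4 prefix length. The Python sketch below uses only the standard ipaddress module; the helper name ipv4_prefix_from_ipv6_tree is hypothetical.

```python
import ipaddress


def ipv4_prefix_from_ipv6_tree(ip: str, mask_length: int) -> str:
    """Convert a mask length counted out of 128 bits to an IPv4 CIDR string.

    Assumes the IPv4 address space is stored under ::/96, so the first
    96 bits of the mask belong to the IPv6 part of the path.
    """
    if not 96 <= mask_length <= 128:
        raise ValueError("mask length must be between 96 and 128")
    # ip_network() also validates that no host bits are set.
    return str(ipaddress.ip_network(f"{ip}/{mask_length - 96}"))


if __name__ == "__main__":
    for ip, mask in [("1.0.0.0", 120), ("1.0.2.0", 119), ("1.0.128.0", 113)]:
        print(ipv4_prefix_from_ipv6_tree(ip, mask))
```

Under that assumption, 1.0.0.0/120 corresponds to 1.0.0.0/24 and 1.0.128.0/113 to 1.0.128.0/17.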

Minify GeoLite2-City

Hi

I find the mmdb format pretty great and am now using it in Nginx to proxy various variables to backend services.

I want to minify the GeoLite2-City database so that it only contains a small number of variables. The database is great as it is, but it is also big because it includes, for example, city names in other languages.

I have gotten this far:

my %address_for_employee = (
    '0.0.0.0/0' => {
    }
);

for my $range ( keys %address_for_employee ) {

    my $user_metadata = $address_for_employee{$range};

    # Iterate over network and insert IPs individually
    my $network = Net::Works::Network->new_from_string( string => $range );
    my $iterator = $network->iterator;

    while ( my $address = $iterator->() ) {

        my $ip = $address->as_ipv4_string;

        my $model = $reader->city( ip => $ip );

        print $model;
    }
}

However:

  • it's not working
  • it looks up the specific IPs and I still want the network aggregation (I can now see in another issue that the writer will take care of de-duplication)

So let's break it down:

  • I want to remove all languages but 'EN'
  • I want to remove continent, registered country, subdivisions, etc.
  • I want to keep the structure it has, but "minify it" by removing (for me) excess data

Do you have any idea on how to get this done and can you provide an example that's working? This would also be a great example to add to this repository.
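
A minimal sketch of the pruning step described above, written in Python for illustration and operating on a plain dictionary shaped like a GeoLite2-City record. The helper name minify_record, the default list of kept keys, and the sample record are assumptions; reading the source database and writing the pruned records back out are not shown.

```python
def minify_record(record, keep_keys=("city", "country", "location"), language="en"):
    """Return a copy of record with only keep_keys and only one language.

    Any kept value that is a dict with a "names" map has that map reduced
    to the requested language; other values are copied unchanged.
    """
    slim = {}
    for key in keep_keys:
        value = record.get(key)
        if value is None:
            continue
        if isinstance(value, dict) and isinstance(value.get("names"), dict):
            value = dict(value)  # shallow copy so the input is not modified
            value["names"] = {
                lang: name
                for lang, name in value["names"].items()
                if lang == language
            }
        slim[key] = value
    return slim


if __name__ == "__main__":
    sample = {  # hypothetical record, for illustration only
        "city": {"names": {"en": "Copenhagen", "de": "Kopenhagen"}},
        "continent": {"code": "EU", "names": {"en": "Europe"}},
        "country": {"iso_code": "DK", "names": {"en": "Denmark", "fr": "Danemark"}},
        "location": {"time_zone": "Europe/Copenhagen"},
    }
    print(minify_record(sample))
```

Keeping the pruning in one small function makes it easy to apply per network while iterating the source database, so the original network boundaries can be preserved rather than looking up individual IPs.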

Vagrant up getting stuck at user input

Hi,

I have an issue with interactive vagrant setup and wanted to share how I solved it.

OS: macOS Catalina

Running vagrant up gets stuck waiting for user input:

default: Package configuration

Configuring libssl1.1:amd64

There are services installed on your system which need to be restarted when certain libraries, such as libpam, libc, and libssl, are upgraded. Since these restarts may cause interruptions of service for the system, you will normally be prompted on each upgrade for the list of services you wish to restart. You can choose this option to avoid being prompted; instead, all necessary restarts will be done for you automatically so you can avoid being asked questions on each library upgrade.

Restart services during package upgrades without asking?

<Yes> <No>

I wasn't able to select Yes or No at this point.

After inserting the following into the Vagrantfile at line 70, vagrant up no longer waits for user input and finishes successfully.

echo '* libraries/restart-without-asking boolean true' | sudo debconf-set-selections

I guess it isn't the best fix but it worked for me. I can open a PR for this update if maintainers want.

Thanks for the repository and scripts, it helped me prepare tiny test databases!

Custom GEOIP-ASN is not working with logstash.

Hi Team,

I have followed the blog post to create a "Geoip-ASN" database and tried to use it with the Logstash geoip filter, but with no luck.

I could get the values from the database using the read script provided in the repo. The code to generate the ASN database is given below.

#!/usr/bin/env perl

use strict;
use warnings;
use YAML::XS 'LoadFile';
use Data::Dumper;
use feature qw( say );
use local::lib 'local';
use Net::Works::Network;
use MaxMind::DB::Writer::Tree;


# Your top level data structure will always be a map (hash).  The MMDB format
# is strongly typed.  Describe your data types here.
# See https://metacpan.org/pod/MaxMind::DB::Writer::Tree#DATA-TYPES

my %types = (
    autonomous_system_number         => 'uint32',
    autonomous_system_organization   => 'utf8_string',
);


my $tree = MaxMind::DB::Writer::Tree->new(

    # "database_type" is some arbitrary string describing the database.  At
    # MaxMind we use strings like 'GeoIP2-City', 'GeoIP2-Country', etc.
    database_type => 'Geo-ASN',

    # "description" is a hashref where the keys are language names and the
    # values are descriptions of the database in that language.
    description =>
        { en => 'ASN Database', },

    # "ip_version" can be either 4 or 6
    ip_version => 4,

    # add a callback to validate data going in to the database
    map_key_type_callback => sub { $types{ $_[0] } },

    # "record_size" is the record size in bits.  Either 24, 28 or 32.
    record_size => 24,
);
my $config = LoadFile('../files/Network_asn.yml');

# Perl hash to store YAML content
my %address_of_network;

my $file_name = 'TestASN.mmdb';
# Output file for mmdb creation

my $output_file = '../files/' . $file_name;

for (keys %{$config}){
  my @org_network; #network might be an array
  my $org_asn;
  my $org_name;
  my $org_network;

  $org_name = $_;
    #say "Org Name $org_name\n";
    for (keys %{$config->{$org_name}}) {
      $org_asn = $config->{$org_name}->{asn};
      if ( exists ($config->{$org_name}->{network})){
        @org_network = @{$config->{$org_name}->{network}};
        #say " The Network : @org_network\n";
      }
    }
    if (@org_network){
      for (@org_network){
        $org_network = $_;
        #print "$org_name\'s asn is \"$org_asn\" and has following networks: $org_network";
        %address_of_network = (
            $org_network => {
               autonomous_system_number => $org_asn,
               autonomous_system_organization => $org_name,
            }
        );
        for my $address ( keys %address_of_network ) {
             my $network = Net::Works::Network->new_from_string( string => $address );
             $tree->insert_network( $network, $address_of_network{$address} );
          }
      }
    }
}


# Write the database to disk.
open my $fh, '>:raw', $output_file;
$tree->write_tree( $fh );
close $fh;

say "$file_name has now been created"

The YAML file consists of:

---
team1:
  asn: 0011
  network: 
      - 10.10.1.1
      - 10.10.2.1
team2:
   asn: 0012
   network:
       - 20.20.1.1

If you require any further information please do let me know.

Best,
Yash

Iterate_search_tree goes past node count?

I come from a Python background, and this is my first attempt at coding in Perl. I modified the 03-iterate-search-tree.pl example to iterate over the database. My goal was to time how long it would take to iterate through the database. I've attached my modified script.

#!/usr/bin/env perl

use strict;
use warnings;
use feature qw( say );
use local::lib 'local';

use Data::Printer;
use MaxMind::DB::Reader;
use Net::Works::Address;

my $filename = shift @ARGV or die 'Usage: perl traverse.pl [database_file]';

my $reader = MaxMind::DB::Reader->new( file => $filename );
say $reader->metadata()->metadata_to_encode()->{node_count};
my $count = 0;

$reader->iterate_search_tree(
    sub {
        my $ip_as_integer = shift;
        my $mask_length   = shift;
        my $data          = shift;

        my $address = Net::Works::Address->new_from_integer(
            integer => $ip_as_integer );
        # say join '/', $address->as_ipv4_string, $mask_length;
        # say np $data;
        if(($count % 100000) == 0)
        {
            say $count;
        }
        $count++;
    }
);

My command to run the script.

tuan$ time perl traverse.pl GeoLite2-ASN.mmdb
692220
0
100000
200000
300000
400000
500000
600000
700000
800000
900000
1000000
1100000

According to this, the database has 692220 nodes. But my count is well over 1 million, and legend has it that the script is still running.

Did I mess up my script?

Can't locate MaxMind/DB/Writer/Tree.pm

Hi,

I installed the latest Vagrant 2.0.0, cloned this repository and followed the instructions:

vagrant up
vagrant ssh
cd /vagrant
perl examples/01-getting-started.pl

However, I get the following error:

Can't locate MaxMind/DB/Writer/Tree.pm in @INC (@INC contains: /etc/perl /usr/local/lib/perl/5.14.2 /usr/local/share/perl/5.14.2 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.14 /usr/share/perl/5.14 /usr/local/lib/site_perl .) at examples/01-getting-started.pl line 7.
BEGIN failed--compilation aborted at examples/01-getting-started.pl line 7.

I'm on Windows 10, and since everything runs inside Vagrant, I don't know what other information would be helpful.

Inserting each IP in the range into mmdb

Background

The blog post (https://blog.maxmind.com/2015/09/building-your-own-mmdb-database-for-fun-and-profit/) says the following:

"As in our first example, we’re create a new Net::Works::Network object. However, in this case we are going to insert each individual IP in the range. The reason for this is that we don’t know if our IP ranges match the ranges in the GeoLite2 database"

Code snippet from the post: (The Mashup)

for my $range ( keys %address_for_employee ) {

    my $user_metadata = $address_for_employee{$range};

    # Iterate over network and insert IPs individually
    my $network = Net::Works::Network->new_from_string( string => $range );
    my $iterator = $network->iterator;

    while ( my $address = $iterator->() ) {
        my $ip = $address->as_ipv4_string;
        my $model = $reader->city( ip => $ip );

        if ( $model->city->name ) {
            $user_metadata->{city} = $model->city->name;
        }
        if ( $model->country->name ) {
            $user_metadata->{country} = $model->country->name;
        }
        if ( $model->location->time_zone ) {
            $user_metadata->{time_zone} = $model->location->time_zone;
        }
        $tree->insert_network( $network, $user_metadata );
    }

}

Issue:
Neither $ip nor $address is used to insert data into the tree, which contradicts the statement "we are going to insert each individual IP in the range". Instead, $network is used.

Also, according to the documentation (https://metacpan.org/pod/MaxMind::DB::Writer::Tree), the insert_network method used for inserting into the tree takes a CIDR network as its argument, not an individual IP.

How do I write an individual IP into the tree?
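
For illustration, a single IPv4 address can itself be written in CIDR notation as a /32 network, so a range can be expanded into one /32 per address. The Python sketch below shows only that expansion, using the standard ipaddress module; the helper name single_ip_networks is hypothetical, and passing the resulting strings to the writer is not shown.

```python
import ipaddress


def single_ip_networks(cidr: str) -> list[str]:
    """Expand a CIDR range into one /32 (IPv4) or /128 (IPv6) per address."""
    network = ipaddress.ip_network(cidr)
    # Iterating a network yields every address, including the first and last.
    return [f"{address}/{network.max_prefixlen}" for address in network]


if __name__ == "__main__":
    for entry in single_ip_networks("192.0.2.0/30"):
        print(entry)
```

Expanding a large range this way produces one entry per address, so it is only practical for small ranges.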

Looking up an IP doesn't get most specific result

I'm writing two networks into an mmdb, 8.0.0.0/12 and 8.8.8.0/24, with two different values. When I look up the IP 8.8.8.8 in that mmdb, it gives me the value associated with 8.0.0.0/12 and not 8.8.8.0/24.

Here is the script I used and the lookup in Python:

#!/usr/bin/env perl

use strict;
use warnings;
use feature qw( say );

use MaxMind::DB::Writer::Tree;

my $filename = 'users.mmdb';

my %types = (
    asn             => 'utf8_string',
);

my $tree = MaxMind::DB::Writer::Tree->new(

    database_type => 'My-IP-Data',
    description =>
        { en => 'My database of IP data', fr => "Mon Data d'IP", },
    ip_version => 4,
    map_key_type_callback => sub { $types{ $_[0] } },
    record_size => 24,
);

my %asn_data = (
    '8.0.0.0/12' => {
        asn          => '3356',
    },
    '8.8.8.0/24' => {
        asn          => '15169',
    },
);

for my $network ( keys %asn_data ) {
    $tree->insert_network( $network, $asn_data{$network} );
}


# Write the database to disk.
open my $fh, '>:raw', $filename;
$tree->write_tree( $fh );
close $fh;

say "$filename has now been created";

Lookup

import maxminddb

asn_file = maxminddb.open_database('users.mmdb')
asn_file.get('8.8.8.8')

The expected behavior is to get the result associated with 8.8.8.0/24.
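
One thing that may be worth checking, stated as an assumption and not a confirmed diagnosis: Perl does not guarantee the order of keys %asn_data, so the /24 may be inserted before the /12, and a later insert of the larger network could replace the more specific one. Inserting the least specific networks first avoids depending on hash order. The Python sketch below shows only the ordering step; the helper name insertion_order is hypothetical.

```python
import ipaddress


def insertion_order(networks):
    """Sort CIDR strings so that shorter (less specific) prefixes come first."""
    return sorted(networks, key=lambda cidr: ipaddress.ip_network(cidr).prefixlen)


if __name__ == "__main__":
    print(insertion_order(["8.8.8.0/24", "8.0.0.0/12"]))
```

With this ordering, 8.0.0.0/12 is inserted first and 8.8.8.0/24 afterwards.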

Custom mmdb file sizes are bigger than geolite though they have less data

I'm testing creating internal mmdbs according to the getting started tutorial.

I'm able to successfully create an mmdb with 1M records and read it from Python.

The only problem is that the mmdb file is 75 MB, even though each IP range has only a very simple data field attached to it, e.g.:

8.8.8.8/24 => {'attribute': 'mkbcslbbgiferyergedcqgxmxiesmzuefwdvzfxevawudpiofqczwvzngxrcwhhk'},
1.1.1.1/32 => {'attribute': 'niprztmeflfxaaknfljqkyxmfoslyqzpmdgvrfflzldttodkilttaijbzowefwon'}

The attribute value for every network is a 64 character long string. This is test data but the actual data will average the same length.

The problem is that I need to add 14M more records, and if 1M records take 75 MB, then 15M will possibly take more than 1 GB.

How come the GeoLite and GeoIP City databases have a lot more data but are more compact in size?
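
A rough back-of-the-envelope check, under the assumption that every network carries a distinct 64-character ASCII value that must be stored once: the string bytes alone then account for most of the observed file size, before counting the search tree or any per-value overhead. The helper name estimate_unique_string_bytes is hypothetical.

```python
def estimate_unique_string_bytes(unique_values: int, value_length: int) -> int:
    """Lower bound on the bytes needed to store distinct ASCII values once each."""
    return unique_values * value_length


if __name__ == "__main__":
    for count in (1_000_000, 15_000_000):
        size = estimate_unique_string_bytes(count, 64)
        print(f"{count} values: at least {size / 1_000_000:.0f} MB of string data")
```

By this estimate, 1M unique values need at least 64 MB and 15M need at least 960 MB. A database in which many networks share the same values would not grow this way, since a shared value only needs to be stored once.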

Query : Understanding Node Count in mmdb

I created an mmdb with just one subnet using the Go writer.
Since it is a binary search tree in the background, I expected the number of nodes in the search tree to equal the prefix length (41 in the current example).
However, when I examined the mmdb file using the mmdbctl tool, the number of nodes was much higher than the prefix length (370 in the current example).

What is the reason behind creating 370 nodes rather than 41?

Wouldn't creating a search tree with just the bits in the prefix be more optimal?

Output of mmdbctl commands:

% mmdbctl metadata 1rec.mmdb

  • Binary Format 2.0
  • Database Type
  • IP Version 6
  • Record Size 28
  • Node Count 370
  • Description
  • Languages
  • Build Epoch 1695265326

% mmdbctl export 1rec.mmdb
range,city,country,postal,subdivisions
2001:420:5400::/41,"{""names"":{""en"":""Delhi""}}","{""iso_code"":""IN"",""names"":{""en"":""India""}}","{""code"":""110001""}","[{""names"":{""en"":""Delhi""}}]"
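
To make the expectation above concrete, the Python sketch below counts the nodes of a plain binary trie built from a set of IPv6 prefixes, where each node consumes one bit of the address. It is an illustrative model, not the writer's actual implementation, and the helper name count_trie_nodes is hypothetical. A single /41 gives 41 nodes in this model; every additional network placed in the tree (for example, if a writer also inserts other ranges by default, which is an assumption here) adds nodes for the bits it does not share with existing paths.

```python
import ipaddress


def count_trie_nodes(prefixes):
    """Count nodes in a binary trie holding the given IPv6 prefixes.

    A node is identified by the bit path leading to it. A prefix of
    length p needs the nodes at depths 0 through p - 1.
    """
    nodes = set()
    for prefix in prefixes:
        network = ipaddress.ip_network(prefix)
        bits = format(int(network.network_address), f"0{network.max_prefixlen}b")
        for depth in range(network.prefixlen):
            nodes.add(bits[:depth])
    return len(nodes)


if __name__ == "__main__":
    print(count_trie_nodes(["2001:420:5400::/41"]))
    print(count_trie_nodes(["2001:420:5400::/41", "fe80::/10"]))
```

In this model the second call reports 50 nodes: the two prefixes differ in their first bit, so they share only the root node.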
