locus

locus is a library for Erlang/OTP and Elixir that allows you to pinpoint the country, city or ASN of IP addresses using MaxMind GeoIP2 and other providers.

The databases will be loaded on-demand and, when retrieved from the network, cached on the filesystem and updated automatically.

⚠️ For instructions on how to upgrade to 2.x, check MIGRATION.md

Usage

1. Configure your license key

Skip this step if you're not loading databases directly from MaxMind.

Get a free license key from MaxMind if you don't have one already. Once logged in, you'll find the page to generate it in the left menu, under "My License Key".

Then clone the repository, run make shell and declare your key:

application:set_env(locus, license_key, "YOUR_LICENSE_KEY").
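Outside of a development shell, the same application environment value can presumably be supplied through a release configuration file instead. A minimal sketch, assuming a standard OTP sys.config is in use (the file and where it sits in your release are assumptions, not something this README prescribes):

```erlang
%% sys.config (hypothetical location): sets the same `license_key'
%% application env value as the application:set_env/3 call above.
[
 {locus, [
   {license_key, "YOUR_LICENSE_KEY"}
 ]}
].
```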

2. Start the database loader

ok = locus:start_loader(country, {maxmind, "GeoLite2-Country"}).
% You can also use:
% * an HTTP(S) URL,
% * or a local path, e.g. "/usr/share/GeoIP/GeoLite2-City.mmdb"
% * or a {custom_fetcher, Module, Args} tuple, with Module
%   implementing the locus_custom_fetcher behaviour.

3. Wait for the database to load (optional)

{ok, _DatabaseVersion} = locus:await_loader(country). % or `{error, Reason}'

4. Look up IP addresses

% > locus:lookup(country, "93.184.216.34").
% > locus:lookup(country, "2606:2800:220:1:248:1893:25c8:1946").

% * '{ok, Entry}' in case of success;
% * 'not_found' if no entry was found
% * '{error, _}' if something bad happened

{ok,#{<<"continent">> =>
          #{<<"code">> => <<"NA">>,
            <<"geoname_id">> => 6255149,
            <<"names">> =>
                #{<<"de">> => <<"Nordamerika">>,
                  <<"en">> => <<"North America">>,
                  <<"es">> => <<"Norteamérica"/utf8>>,
                  <<"fr">> => <<"Amérique du Nord"/utf8>>,
                  <<"ja">> => <<"北アメリカ"/utf8>>,
                  <<"pt-BR">> => <<"América do Norte"/utf8>>,
                  <<"ru">> => <<"Северная Америка"/utf8>>,
                  <<"zh-CN">> => <<"北美洲"/utf8>>}},
      <<"country">> =>
          #{<<"geoname_id">> => 6252001,
            <<"iso_code">> => <<"US">>,
            <<"names">> =>
                #{<<"de">> => <<"USA">>,
                  <<"en">> => <<"United States">>,
                  <<"es">> => <<"Estados Unidos">>,
                  <<"fr">> => <<"États-Unis"/utf8>>,
                  <<"ja">> => <<"アメリカ合衆国"/utf8>>,
                  <<"pt-BR">> => <<"Estados Unidos">>,
                  <<"ru">> => <<"США"/utf8>>,
                  <<"zh-CN">> => <<"美国"/utf8>>}},
      <<"registered_country">> =>
          #{<<"geoname_id">> => 6252001,
            <<"iso_code">> => <<"US">>,
            <<"names">> =>
                #{<<"de">> => <<"USA">>,
                  <<"en">> => <<"United States">>,
                  <<"es">> => <<"Estados Unidos">>,
                  <<"fr">> => <<"États-Unis"/utf8>>,
                  <<"ja">> => <<"アメリカ合衆国"/utf8>>,
                  <<"pt-BR">> => <<"Estados Unidos">>,
                  <<"ru">> => <<"США"/utf8>>,
                  <<"zh-CN">> => <<"美国"/utf8>>}}}}
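Since entries are plain maps with binary keys, individual fields can be picked out with ordinary map pattern matching. The sketch below is illustrative only: the module and function names are hypothetical and not part of locus, and it assumes the entry has the shape shown in the GeoLite2 Country output above.

```erlang
-module(locus_entry_example).
-export([country_code/1, country_name/2]).

%% Extract the ISO country code from a lookup result.
country_code({ok, #{<<"country">> := #{<<"iso_code">> := Code}}}) ->
    {ok, Code};
country_code({ok, _EntryWithoutCountry}) ->
    not_found;
country_code(not_found) ->
    not_found;
country_code({error, Reason}) ->
    {error, Reason}.

%% Extract a localised country name, e.g. with Lang = <<"en">>.
country_name({ok, #{<<"country">> := #{<<"names">> := Names}}}, Lang) ->
    case maps:find(Lang, Names) of
        {ok, Name} -> {ok, Name};
        error -> not_found
    end;
country_name(_Other, _Lang) ->
    not_found.
```

Both functions accept the raw return value of a lookup, so the three documented outcomes ('{ok, Entry}', 'not_found' and '{error, _}') are handled in one place.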

Documentation

  1. Supported File Formats
  2. Database Types and Loading
  3. Database Validation
  4. Remote sources: Downloading and Updating
  5. Remote sources: Caching
  6. Local sources: Loading and Updating
  7. Logging
  8. Event Subscriptions
  9. API Reference
  10. Tested Setup
  11. License
  12. Alternative Providers
  13. Alternative Libraries (Erlang)
  14. Alternative Libraries (Elixir)

Supported File Formats

  • gzip-compressed tarballs (.tar.gz, .tgz)
  • plain tarballs (.tar)
  • MMDB files (.mmdb)
  • gzip-compressed MMDB files (.mmdb.gz)

For tarball files, the first file to be found within it with an .mmdb extension is the one that's chosen for loading.
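The selection rule for tarballs can be sketched as follows. This is not the library's own code: the module name is hypothetical, and the input is assumed to be a list of {Filename, Contents} pairs, which is the shape erl_tar:extract/2 returns when given the memory option.

```erlang
-module(mmdb_pick_example).
-export([first_mmdb/1]).

%% Walk the tarball entries in order and return the first one whose
%% file name has an ".mmdb" extension.
first_mmdb([{Name, Contents} | Rest]) ->
    case filename:extension(Name) of
        ".mmdb" -> {ok, Name, Contents};
        _Other -> first_mmdb(Rest)
    end;
first_mmdb([]) ->
    not_found.
```

The comparison here is case-sensitive; whether locus itself treats extensions that way is not stated above.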

The implementation of MaxMind DB format is complete except for the data cache container data type.

Database Types and Loading

  • The free GeoLite2 Country, City and ASN databases were all successfully tested; presumably locus can deal with any MMDB database that maps IP address prefixes to arbitrary data
  • The databases are loaded into memory (mostly) as is; reference counted binaries are shared with the application callers using persistent_term, and the original binary search tree is used to look up addresses. The data for each entry is decoded on the fly upon successful lookups.

Database Validation

Databases, local or remote, can have their compatibility validated through the locus:check/1 function after they've been loaded (see function reference.)

Alternatively, they can also be checked from the command line by use of the locus CLI utility:

  1. Run make cli to build the script, named locus, which will be deployed to the current directory.

  2. Check the database:

    ./locus check GeoLite2-City.mmdb
    # Loading database from "GeoLite2-City.mmdb"...
    # Database version {{2019,11,6},{11,58,0}} successfully loaded
    # Checking database for flaws...
    # Database is wholesome.

The script will exit with code 1 in case of failure, and 0 otherwise.

Warnings can produce failure through the --warnings-as-errors flag.

Run ./locus check --help for a description of supported options and arguments.

Remote sources: Downloading and Updating

  • The downloaded database files, when compressed, are inflated in memory
  • For MaxMind and HTTP downloads, the last-modified response header, if present, is used to condition subsequent download attempts (using if-modified-since request headers) in order to save bandwidth
  • The downloaded databases are cached on the filesystem in order to more quickly achieve readiness on future launches of the database loader
  • Database download attempts are retried upon error according to an exponential backoff policy - quickly at first (every few seconds) but gradually slowing down to every 15 minutes. Successful and dismissed download attempts will be checked for update after 6 hours. Both of these behaviours can be tweaked through the error_retries and update_period loader settings (see function reference.)
  • When downloading from a MaxMind edition or an HTTP URL, the remote certificate will be authenticated against a list of known Certification Authorities and connection negotiation will fail in case of an expired certificate, mismatched hostname, self-signed certificate or unknown certification authority. These checks can be disabled by specifying the insecure loader option.
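The retry policy described above can be pictured as a capped exponential backoff. The sketch below only illustrates the idea: the module name, the starting delay and the doubling factor are assumptions, since the text says no more than "every few seconds" at first and "every 15 minutes" at most, and the real behaviour is governed by the error_retries setting.

```erlang
-module(backoff_example).
-export([delays/3]).

%% Return the first N retry delays (in seconds): start at Initial,
%% double after every failed attempt and never exceed Max.
delays(N, Initial, Max) when N >= 0, Initial > 0, Max >= Initial ->
    delays(N, Initial, Max, []).

delays(0, _Delay, _Max, Acc) ->
    lists:reverse(Acc);
delays(N, Delay, Max, Acc) ->
    delays(N - 1, min(Delay * 2, Max), Max, [Delay | Acc]).
```

For example, backoff_example:delays(10, 2, 900) starts at 2 seconds and reaches the 900 second (15 minute) cap on the tenth attempt.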

Remote sources: Caching

  • Caching is a best effort; the system falls back to relying exclusively on the network if needed
  • By default a caching directory named locus_erlang is created under the 'user_cache' basedir
  • A cached database is named after either:
    • the MaxMind database edition name (when explicitly downloading from MaxMind), or
    • the SHA256 hash of the HTTP(S) URL, or
    • for {custom_fetcher, Module, Args} sources, a filesystem-safe version of Module's name concatenated with the 32-bit erlang:phash2/2 value of the opaque database source as returned by the callbacks.
  • Modification time of the databases is retrieved from either:
    • the last-modified response header (when present, for MaxMind and HTTP(S) sources)
    • the modified_on metadata property for successful locus_custom_fetcher :fetch/1 and :conditionally_fetch/2 callbacks (for databases loaded with locus_custom_fetcher)
  • Caching can be disabled by specifying the no_cache option when running :start_loader
  • The cache database location can be customised by providing the {database_cache_file, FilePath} option for locus_loader (FilePath must have a ".mmdb.gz" extension)
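To make the naming scheme for HTTP(S) sources more concrete, here is a rough sketch of how such a cache path could be derived. Only the use of SHA256 over the URL and the locus_erlang directory under the user_cache basedir come from the list above; the module name, the hex encoding and the ".mmdb.gz" suffix on the default file name are assumptions made for illustration.

```erlang
-module(cache_name_example).
-export([url_cache_path/1]).

%% Build a cache file path for an HTTP(S) URL: the hex-encoded SHA256 hash
%% of the URL, inside a "locus_erlang" directory under the user_cache basedir.
url_cache_path(Url) ->
    Hash = crypto:hash(sha256, Url),
    %% Two lowercase hex digits per byte (encoding choice is an assumption).
    Hex = lists:flatten([io_lib:format("~2.16.0b", [Byte]) || <<Byte>> <= Hash]),
    Dir = filename:basedir(user_cache, "locus_erlang"),
    filename:join(Dir, Hex ++ ".mmdb.gz").
```

Hashing the URL keeps the file name filesystem-safe and stable across restarts, so the same source always maps to the same cached file.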

Local sources: Loading and Updating

  • The loaded database files, when compressed, are inflated in memory
  • The database modification timestamp is used to condition subsequent load attempts in order to lower I/O activity
  • Database load attempts are retried upon error according to an exponential backoff policy - quickly at first (every few seconds) but gradually slowing down to every 30 seconds. Successful and dismissed load attempts will be checked for update after 30 seconds. Both of these behaviours can be tweaked through the error_retries and update_period loader settings (see function reference.)
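The timestamp-based conditioning for local files can be sketched with filelib:last_modified/1. This is an illustration under assumptions rather than the loader's actual code: the module name and the return values are hypothetical.

```erlang
-module(mtime_check_example).
-export([check/2]).

%% Compare a file's current modification time with the one seen at the
%% previous load attempt; a reload is only warranted on `{changed, _}'.
check(Path, LastSeenMtime) ->
    case filelib:last_modified(Path) of
        0 -> {error, not_found};          % file is missing or unreadable
        LastSeenMtime -> unchanged;       % same timestamp as last time
        NewMtime -> {changed, NewMtime}   % remember NewMtime for the next check
    end.
```

On the first check, any value that is not a timestamp (for instance undefined) can be passed as LastSeenMtime to force a load.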

Logging

  • Five logging levels are supported: debug, info, warning, error and none
  • logger is the chosen backend if lager is either missing or hasn't removed logger's default handler.
  • The default log level is error; it can be changed in the application's env config
  • To tweak the log level in runtime, use locus_logger:set_loglevel/1

Event Subscriptions

  • Any number of event subscribers can be attached to a database loader by specifying the {event_subscriber, Subscriber} option when starting the database
  • A Subscriber can be either a module implementing the locus_event_subscriber behaviour or an arbitrary pid()
  • The format and content of reported events can be consulted in detail on the locus_event_subscriber module documentation; most key steps in the loader pipeline are reported (download started, download succeeded, download failed, caching succeeded, loading failed, etc.)

API Reference

The API reference can be found on HexDocs.

Tested Setup

  • Erlang/OTP 22 or newer
  • rebar3

License

MIT License

Copyright (c) 2017-2024 Guilherme Andrade

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

locus is an independent project and has not been authorized, sponsored, or otherwise approved by MaxMind.

Alternative Providers

  • IPinfo MMDB databases are compatible with locus since version 2.3.8
  • DB-IP.com: lite databases seem to work but setting up auto-update for them is not practical, as there's no "latest" official URL.

Alternative Libraries (Erlang)

  • egeoip: IP Geolocation module, currently supporting the MaxMind GeoLite City Database
  • geodata2: Application for working with MaxMind geoip2 (.mmdb) databases
  • geoip: Returns the location of an IP address; based on the ipinfodb.com web service
  • geolite2data: Periodically fetches the free MaxMind GeoLite2 databases
  • ip2location-erlang: Uses IP2Location geolocation database

Alternative Libraries (Elixir)

  • asn: IP-to-AS-to-ASname lookup
  • freegeoip: Simple wrapper for freegeoip.net HTTP API
  • freegeoipx: API Client for freegeoip.net
  • geoip: Lookup the geo location for a given IP address, hostname or Plug.Conn instance
  • geolix: MaxMind GeoIP2 database reader/decoder
  • plug_geoip2: Adds geo location to a Plug connection based upon the client IP address by using MaxMind's GeoIP2 database
  • tz_world: Resolve timezones from a location efficiently using PostGIS and Ecto

locus's Issues

How to use the database provided by the OS?

My CentOS provides an up-to-date DB: /usr/share/GeoIP/GeoLite2-City.mmdb. But locus seems to want a tarball instead. Would it be possible to support a system-provided DB? That would simplify deployment a lot. Thanks.

Logo design for locus

Hello! I am a graphic designer and I contribute to open source software through logo design. If you'd like, I can design a logo for locus. I already have an idea for it, but if you have one of your own, let me know.

I'll be waiting for your feedback. Have a nice day!

IPv4-in-IPv6 root node is wrong

The definition in question:

% https://en.wikipedia.org/wiki/IPv6#IPv4-mapped_IPv6_addresses
-define(IPV4_IPV6_PREFIX, <<0:80, 16#FFFF:16>>).

Following tentative PR #39 a few months ago and, more recently, evidence of this in #43.

As per the spec, the root node seems to be ::/96 rather than ::ffff:0:0/96 (which is for IPv4-mapped IPv6 addresses) and I had read it wrong all this time:
https://maxmind.github.io/MaxMind-DB/#ipv4-addresses-in-an-ipv6-tree

The official Python implementation, mentioned by @nickjacob in #39, further suggests this in the way it walks over the IPv6 tree to get the IPv4 root node:
https://github.com/maxmind/MaxMind-DB-Reader-python/blob/b59335627a27b96c6e5a3178632ed0ab77d53cfb/maxminddb/reader.py#L118-L127

This being true, the existing behavior worked purely out of chance: all the databases I had tested so far happened to account for IPv4-mapped IPv6 addresses, and therefore I didn't run into the bug for 6+ years.
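To make the difference between the two prefixes concrete, the sketch below builds the 128-bit form of an IPv4 address under each of them. The module and function names are hypothetical and this is not the patch itself; it only contrasts ::/96 with the ::ffff:0:0/96 prefix from the definition quoted above.

```erlang
-module(ipv4_in_ipv6_example).
-export([under_zero_prefix/1, ipv4_mapped/1]).

%% IPv4 address placed under ::/96 (96 zero bits followed by the address),
%% which is where the spec appears to locate the IPv4 subtree.
under_zero_prefix({A, B, C, D}) ->
    <<0:96, A, B, C, D>>.

%% IPv4-mapped IPv6 address, under ::ffff:0:0/96 (80 zero bits, then 16 one
%% bits); this matches the IPV4_IPV6_PREFIX definition in question.
ipv4_mapped({A, B, C, D}) ->
    <<0:80, 16#FFFF:16, A, B, C, D>>.
```

The two binaries differ only in bits 80 to 95, so a tree walk starting from the wrong prefix ends up at a different node unless the database happens to cover both ranges.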

Changes to GeoLite2 license and distribution require changes to this project

The direct download mechanism that you are currently using will no longer function after December 30th 2019.

Due to upcoming data privacy regulations, MaxMind are making significant changes to how users access free MaxMind GeoLite2 databases starting December 30, 2019. The databases will continue to be available without charge and for redistribution. However, you will be required to create an account and use a license key to download the databases, and agree to a new EULA that addresses applicable data privacy regulations.

Learn more on the MaxMind blog: https://blog.maxmind.com/2019/12/18/significant-changes-to-accessing-and-using-geolite2-databases/.

Mark Fowler
MaxMind

"latest" URL of DB-IP database

Ability to pass a function as a database source

Hi there!

We are interested in building our own geolocation db and hosting the mmdb file in S3.

We would like to make this S3 object private and our application already has its own AWS credentials, which it can use to access the private S3 object. I realize that I could write some code to use these credentials and download the mmdb to the local filesystem, but this application runs in kubernetes and we would rather not introduce volumes if it is not necessary.

Therefore, I am proposing locus be extended to allow an {M,F,A} tuple to be passed as the db source. The function could return any of the types accepted by locus:start_loader/2.

I would be happy to implement this functionality if you agree that it is probably the best way to handle this kind of scenario.

Thanks for reading!

rebar3 warning in elixir projects

First off, thanks for the great library! I have tried out several BEAM geoip libraries, and locus is by far my favorite and the most performant one I have found.

I am using locus in an Elixir project, and am constantly getting the nice big warning:

        !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
        !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
        *********************************************************************

        [locus] You're **strongly** incentivized to use `rebar3`.
        Compatibility with rebar 2 is unmaintained and will be removed in the near future.

        *********************************************************************
        !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
        !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

from both locus and tls_certificate_check.

You probably do not use Elixir yourself, but I was hoping to begin a discussion on how we could modify the check so that locus can be used within Elixir projects without emitting these warnings.

Corrupted DB download when packet loss on network

Branch: 1.9.0-beta
Erl: 20.3

When deploying we've noticed a number of nodes with crashes caused by corrupted GeoLite2-City.mmdb.gz downloads and it has been quite difficult to replicate

After trial and error playing with netem to introduce network issues we seem to be able to consistently replicate when introducing packet loss.

I'm not sure if this is a bug in locus or in httpc stream handling as there are no errors received in locus_http_download:handle_httpc_message/2

It runs through the intended stream_start -> stream -> stream_end with no errors, but the resulting data is corrupt

To replicate consistently I used a fairly high packet loss setting:
sudo tc qdisc add dev eth0 root netem loss 25%

to disable after testing use:
sudo tc qdisc del dev eth0 root

diff and console output: https://gist.github.com/leonardb/4d2b1755d13af1e65830b61767d18c68

Recent tarballs fail to extract on OTP 19 or older

E.g., country database version 2019-05-20 20:12.

Something is likely to have changed on MaxMind's exporting pipeline and the new tarballs are no longer compatible with old Erlang/OTP versions. Erlang/OTP versions 20 and 21 have no problem.

erl_tar got a big rewrite on OTP 20, which presumably fixed whatever was the issue.

Builds on OTP 17 are broken

==> Verifying dependencies...
===> The registry repository hexpm uses a record format that has been deprecated for security reasons. The repository should be updated in order to be safer. You can disable this check by setting REBAR_NO_VERIFY_REPO_ORIGIN=1
===> Package not found in any repo: getopt 1.0.1
make: *** [cli] Error 1
The command "RUNNING_ON_TRAVIS=yes make travis_test" exited with 2.

Broken code

Looks like the code is broken. I tried both the example and trying loading the local file.

1> URL = "https://geolite.maxmind.com/download/geoip/database/GeoLite2-Country.tar.gz".
"https://geolite.maxmind.com/download/geoip/database/GeoLite2-Country.tar.gz"
2> locus:start_loader(country, URL).
** exception exit: {noproc,
{gen_server,call,
[locus_sup,
{start_child,
{{http_loader,country},
{locus_http_loader,start_link,
[country,
"https://geolite.maxmind.com/download/geoip/database/GeoLite2-Country.tar.gz",
[{event_subscriber,locus_logger}]]},
permanent,5000,worker,
[locus_http_loader]}},
infinity]}}
in function gen_server:call/3 (gen_server.erl, line 214)
in call from locus_sup:start_child/3 (/Users/manu/misc/ex/_build/default/lib/locus/src/locus_sup.erl, line 67)

Run CI on Windows

Tentatively introduced in fcdd530, reverted in 9e39953 due to Elvis complaining about trailing spaces and operator spaces all over the place:

[Screenshot at 2021-08-30 00-43-58]

My first guess is that either git or something else is injecting \r into the source files and that is causing Elvis to misdetect the extra whitespaces as trailing (but I didn't confirm this in any way.)

[Feature Request] Supporting IPinfo MMDB databases

IPinfo.io also delivers data in the MMDB file format. Unlike the MMDB databases from MaxMind and DBIP, IPinfo's databases use a "tabular" data format.

Example from the IPinfo IP to Geolocation database:

FIELD NAME    EXAMPLE           DATA TYPE  DESCRIPTION
start_ip      1.253.242.0       TEXT       Starting IP address of an IP address range
end_ip        1.253.242.255     TEXT       Ending IP address of an IP address range
join_key      1.253.0.0         TEXT       Special variable to facilitate join operation
city          Yangsan           TEXT       City of the location
region        Gyeongsangnam-do  TEXT       Region of the location
country       KR                TEXT       ISO 3166 country code
latitude      35.34199          FLOAT      Latitude value of the location
longitude     129.03358         FLOAT      Longitude value of the location
postal_code   50593             TEXT       Postal code of the location
timezone      Asia/Seoul        TEXT       Local time zone

You can try our free databases as well:

The free database provides full accuracy, is updated daily, and combines IPv4 and IPv6 data in a single dataset. The free IP database is licensed under CC-BY-SA 4.0 and permits commercial usage.

Update mechanism

The simple update mechanism uses a storage bucket URI and the access token as a parameter. The MMDB dataset is not zipped and can be directly downloaded:

curl -L "https://ipinfo.io/data/standard_privacy.mmdb?token=<YOUR_TOKEN>" -o privacy.mmdb

IPinfo also has a checksums API endpoint.

Samples and documentation

Please let me know what you think. Thanks!

HTTPS redirects may fail

Due to the fact that TLS validation options are only specified once - for the original URL - the TLS handshake fails for a redirection URL if its hostname is distinct from the first URL's hostname (excluding certain scenarios involving wildcard certificates.)
