Git Product home page Git Product logo

Comments (5)

UmanShahzad avatar UmanShahzad commented on August 25, 2024

@hyperxpro mmdbctl uses the underlying mmdb-writer library for creating the actual MMDB, which doesn't support incremental writes. The MMDB format itself isn't very friendly to incremental writes, which is the root of the problem. It's probably not impossible, but it'd be a really complex and sensitive algorithm using partial files to help deal with the memory growth (similar to a large-scale merge-sort solution).

In practice, almost anyone producing a big enough MMDB file ends up having large RAM machines to do it, which is cheap these days, so little effort has been invested into optimizing that.

It could be something we do!

from mmdbctl.

hyperxpro avatar hyperxpro commented on August 25, 2024

Can we not write N bytes into the file and flush and increase offset and repeat the process? I am not sure if it is entirely possible like this but, is there any workaround we can do at the I/O level to fix this?

from mmdbctl.

UmanShahzad avatar UmanShahzad commented on August 25, 2024

@hyperxpro unfortunately, the memory and I/O model for MMDBs, and the relationship between them, is more complex than that.

See https://maxmind.github.io/MaxMind-DB for the actual MMDB spec

from mmdbctl.

jhg03a avatar jhg03a commented on August 25, 2024

For reference in my custom mmdb, it takes around 800G of ram and many hours to compile from an optimized ip_trie that takes a fraction of the time and memory to build and is threadsafe. From my shallow dive into it, a large part of it is the fact that the mmdb writer library uses reflection all over the place since it was written before real generics were part of the language.

from mmdbctl.

johnhtodd avatar johnhtodd commented on August 25, 2024

Hitting the same problem here, doing the same thing. MMDB->JSON->MMDB. JSON file has ~15m routes/prefixes. Un-possible thus far on any of the systems we have here (512G is max) to turn back into MMDB.

from mmdbctl.

Related Issues (16)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.