Comments (4)

shoubhik commented on July 22, 2024

Looking at the latest code, it seems that whenever a Dataframe backed by MapDB is created, it creates a temporary DB file. These temp files are deleted when the JVM shuts down.

However, there are many use cases where one would like to deserialize a large dataset into a disk-backed map and later try out multiple algorithms on that Dataframe. This is especially helpful during the experimentation phase if the input dataset takes a long time to translate into a Dataframe.

What are your thoughts? If implemented, what should the design look like? I can contribute this feature if it makes sense.

from datumbox-framework.

datumbox commented on July 22, 2024

What you are describing makes sense, but unfortunately it is not currently supported. A workaround would be to store the records outside the Dataframe and then read them and load them back into a Dataframe when necessary.

Adding the feature is not that straightforward; I'll need to look into the design, as you said. I'll keep you posted.
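
The workaround mentioned above (keeping the records outside the Dataframe and reloading them when needed) could look roughly like the sketch below. It deliberately uses only the Java standard library and does not call any Datumbox or MapDB API: the `RecordStore` class, its `save` and `load` methods, and the representation of a record as a `Map<String, Object>` are all hypothetical names chosen for illustration. Copying the loaded records into a Dataframe is left to the caller.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: persists parsed records to a file so that the
// expensive parsing step does not have to be repeated in every JVM run.
public class RecordStore {

    // Writes all records to the given file using standard Java serialization.
    // The maps and their values must be Serializable (HashMap, String, Double, ...).
    public static void save(List<Map<String, Object>> records, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            // Copy into an ArrayList so the written object is known to be Serializable.
            out.writeObject(new ArrayList<>(records));
        }
    }

    // Reads the records back from a file previously written by save().
    @SuppressWarnings("unchecked")
    public static List<Map<String, Object>> load(Path file)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            return (List<Map<String, Object>>) in.readObject();
        }
    }
}
```

This version holds the whole list in memory while saving and loading, so it only helps when parsing is the slow part. For datasets that do not fit in memory, the records would have to be written and read one at a time instead.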

datumbox commented on July 22, 2024

This feature is implemented in the experimental branch. It will be released in version 0.8.0.

datumbox commented on July 22, 2024

All the tests passed and the feature was added to the develop branch. A stable release is expected within a couple of weeks.
