Comments (4)
Looking at the latest code, it seems that whenever a Dataframe backed by MapDB is created, it creates a temp DB file. The temp files are deleted when the JVM shuts down.
However, there are many use cases where one would like to deserialize a large dataset into a disk-backed map and later try out multiple algorithms on that Dataframe. This is especially helpful during the experimentation phase if the input dataset takes a long time to translate into a Dataframe.
What are your thoughts? If implemented, what should the design look like? I can contribute this feature if it makes sense.
from datumbox-framework.
What you are describing makes sense, but unfortunately it is not currently supported. A way around it would be to store the records externally from the Dataframe and then read them and load them back into the Dataframe when necessary.
Adding the feature is not that straightforward; I'll need to check out the design, as you said. I'll keep you posted.
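As a rough illustration of that workaround, here is a minimal sketch that streams records to a file and reads them back one at a time using only the Java standard library. It assumes each record can be represented as a `Map<String, Object>` with serializable values. `RecordStore`, `save` and `load` are hypothetical names, not part of Datumbox, and the step that turns each record back into a Dataframe entry is left to the caller because it depends on the framework version.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical helper (not a Datumbox class): keeps parsed records in a file
// that outlives the JVM, so they can be reloaded without re-parsing the input.
class RecordStore {

    // Writes the records one by one, followed by a null end marker.
    // Assumes every value in every record implements java.io.Serializable.
    static void save(Path file, Iterable<? extends Map<String, Object>> records) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            for (Map<String, Object> record : records) {
                out.writeObject(new LinkedHashMap<>(record));
                // Drop the stream's back-reference table so memory use stays flat
                // even when the dataset is large.
                out.reset();
            }
            out.writeObject(null);
        }
    }

    // Reads the records back in the order they were written and hands each one
    // to the sink, for example a callback that adds it to a fresh Dataframe.
    static void load(Path file, Consumer<Map<String, Object>> sink)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            Object next;
            while ((next = in.readObject()) != null) {
                @SuppressWarnings("unchecked")
                Map<String, Object> record = (Map<String, Object>) next;
                sink.accept(record);
            }
        }
    }
}
```

With this in place, the expensive parsing step runs once and calls `RecordStore.save(path, records)`. Each later experiment calls `RecordStore.load(path, sink)` with a sink that rebuilds the Dataframe.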
from datumbox-framework.
This feature is implemented in the experimental branch. It will be released in version 0.8.0.
from datumbox-framework.
All the tests passed and the feature was added to the develop branch. A stable release is expected within a couple of weeks.
from datumbox-framework.