Comments (3)
Thanks for the suggestion, but I am not sure I understand, as images are already loaded on the fly. During dataset initialization, only the image paths are loaded into memory (src.datasets.segmentation, line 45). Images are then loaded on the fly at each __getitem__ call.
from docextractor.
So does it load a batch of files, train on it, flush it from memory, and then load the next batch of files?
This matters because I might have a huge dataset, so how memory is used is very important.
Yes, that is what it does, following the standard PyTorch DataLoader routine. I am closing the issue for now; please reopen it if you meant something else.
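
For readers following the thread, the pattern discussed here can be sketched roughly as below. This is only an illustration under stated assumptions, not the docExtractor code: `LazyImageDataset` and `iter_batches` are hypothetical names, plain Python stands in for `torch.utils.data.Dataset` and `DataLoader`, and raw file bytes stand in for decoded images.

```python
from pathlib import Path


class LazyImageDataset:
    """Hypothetical stand-in for a torch.utils.data.Dataset subclass."""

    def __init__(self, image_dir, pattern="*.png"):
        # Only the file paths are kept in memory, not the image contents.
        self.paths = sorted(Path(image_dir).glob(pattern))

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        # The file is read from disk only when this sample is requested.
        return self.paths[index].read_bytes()


def iter_batches(dataset, batch_size):
    """Yield one batch at a time (a simplified stand-in for a DataLoader)."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        # Each batch is built on demand; once the caller drops its
        # reference, the previous batch can be garbage-collected.
        yield [dataset[i] for i in range(start, stop)]
```

With this layout, memory use should scale with the batch size plus the list of paths rather than with the total size of the dataset, assuming the training loop does not keep references to old batches.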