Comments (3)
If I understand it correctly, the file will only be placed into memory if the filesize is less than bufcount.
It seems rather strange that the -r parameter would be relevant if the buffer size has to exceed the file size for the file to be loaded into memory at all.
I would assume that instead of trying to load the whole file into memory, it would load a portion of it into memory, process that, and move on to another portion.
from crackstation-hashdb.
The sorting is done with quicksort which works by successively breaking the problem down into sorting problems of approximately half the size. If the file is bigger than the memory limit, it'll need to do the first iterations of quicksort on disk instead of in memory, and once it gets broken down into small enough pieces it'll begin to do the sorting in RAM (which is much faster).
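To make the behaviour described above concrete, here is a minimal Python sketch of a quicksort that partitions "on disk" until a piece fits in the buffer and then sorts that piece in RAM. It is only an illustration, not the actual sortidx code: the list `records` stands in for the index file, and `hybrid_quicksort` is a made-up name, with `bufcount` assumed to mean the number of entries that fit in memory.

```python
def hybrid_quicksort(records, bufcount):
    """Sort `records` in place.

    `records` is a stand-in for the on-disk index (assumption for this
    sketch). Ranges larger than `bufcount` entries are partitioned in
    place; smaller ranges are copied out, sorted in RAM and written back.
    """
    stack = [(0, len(records))]  # half-open ranges still to be sorted
    while stack:
        lo, hi = stack.pop()
        if hi - lo <= 1:
            continue
        if hi - lo <= bufcount:
            # Small enough: load the piece, sort it in memory, write it back.
            chunk = records[lo:hi]
            chunk.sort()
            records[lo:hi] = chunk
            continue
        # Too big for the buffer: one partitioning pass over the "file".
        pivot = records[lo + (hi - lo) // 2]
        i, j = lo, hi - 1
        while i <= j:
            while records[i] < pivot:
                i += 1
            while records[j] > pivot:
                j -= 1
            if i <= j:
                records[i], records[j] = records[j], records[i]
                i += 1
                j -= 1
        # Each side is roughly half the size and is handled the same way.
        stack.append((lo, j + 1))
        stack.append((i, hi))
```

With a larger `bufcount`, fewer partitioning passes touch the whole range before the in-memory branch takes over, which is why the memory limit still matters when the file is bigger than the buffer.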
When I built the CrackStation.net databases (which are about 15 billion entries, 1/42 the size of your database) it took about a week with 16GB of RAM... so based on that you'd probably have to improve the sorting code to get your database sorted in a reasonable amount of time.
One idea is to change the code to buffer the left and right halves in RAM and then write them out to disk all at once when the buffer gets full, so that the disk is being read and written to linearly instead of randomly. Another idea is to switch to mergesort, merging whole buffer-sized chunks at a time in RAM. Or maybe there's a fancy sorting algorithm that would do even better.
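The mergesort idea could look roughly like the sketch below: sort buffer-sized chunks in RAM, write each one out as a sorted run, then merge the runs with sequential reads and writes. This is a sketch under assumptions, not a patch for sortidx: it assumes fixed-size binary records that compare correctly as raw bytes, and `external_merge_sort`, `read_records` and `record_size` are names invented for the example.

```python
import heapq
import os
import tempfile


def read_records(f, record_size):
    """Yield fixed-size records from an open binary file, in order."""
    while True:
        rec = f.read(record_size)
        if len(rec) < record_size:
            return
        yield rec


def external_merge_sort(src_path, dst_path, record_size, bufcount):
    """Sort the records in src_path into dst_path.

    At most `bufcount` records are held in memory while building runs.
    """
    run_paths = []
    with open(src_path, "rb") as src:
        while True:
            chunk = src.read(record_size * bufcount)
            if not chunk:
                break
            # Split the chunk into whole records (a trailing partial
            # record is dropped) and sort them in RAM.
            records = [
                chunk[i:i + record_size]
                for i in range(0, len(chunk) - record_size + 1, record_size)
            ]
            records.sort()
            # Write the sorted run out in one linear pass.
            fd, path = tempfile.mkstemp(suffix=".run")
            with os.fdopen(fd, "wb") as run:
                run.write(b"".join(records))
            run_paths.append(path)

    # Merge all runs; every run file is read front to back.
    runs = [open(p, "rb") for p in run_paths]
    try:
        with open(dst_path, "wb") as dst:
            streams = (read_records(r, record_size) for r in runs)
            for rec in heapq.merge(*streams):
                dst.write(rec)
    finally:
        for r in runs:
            r.close()
        for p in run_paths:
            os.remove(p)
```

One caveat: this opens every run at once, so a very large file with a small buffer could hit the open-file limit. Merging in several passes would avoid that.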
Thank you for your reply.
Your figure for 15 billion entries with 16GB of RAM puts my estimated timeframe into perspective.
I'm not sure why Sortidx stalled for two days; it might be due to a slow HDD.
I got the sorting time down to about 8 hours with 630 000 000 entries when the entire file was loaded into memory.
Another solution would be to split the database into sections that each fit into RAM, and then search through all of the sections at lookup time. That would bring the search time up a bit, but the sorting time down by quite a lot.
This would also allow me to progressively add to the combined database instead of having to re-sort.
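Roughly what I have in mind, as a small Python sketch. The names `build_shards`, `add_shard` and `lookup` are made up for the example, and plain lists stand in for the separate index files:

```python
import bisect


def build_shards(entries, shard_size):
    """Split entries into sections of at most shard_size and sort each one."""
    return [
        sorted(entries[i:i + shard_size])
        for i in range(0, len(entries), shard_size)
    ]


def add_shard(shards, new_entries):
    """Add new entries as their own sorted section; nothing is re-sorted."""
    shards.append(sorted(new_entries))


def lookup(shards, key):
    """Binary-search every section in turn; True if any contains key."""
    for shard in shards:
        pos = bisect.bisect_left(shard, key)
        if pos < len(shard) and shard[pos] == key:
            return True
    return False
```

Each lookup costs one binary search per section instead of one overall, so the search time grows with the number of sections.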
I will look into the suggestions as well.
Thanks a lot!