crackstation-hashdb's Introduction

CrackStation.net's Lookup Tables

Introduction

There are three components to this system:

  1. The indexing PHP script (createidx.php), which takes a wordlist and builds a lookup table index for a hash function and the words in the list.

  2. The indexing sorter program (sortidx.c), which sorts an index created by the indexing script, so that the lookup script can use a binary search on the index to crack hashes.

  3. The lookup script (LookupTable.php), which uses the wordlist and index to crack hashes.

The system is split up like this because PHP provides easy access to many different types of hash functions, but is too slow to sort large indexes in a reasonable amount of time. We are planning to re-write components #1 and #3 in C or C++.
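
The binary search that a sorted index makes possible can be sketched in C roughly as follows. This is a minimal in-memory sketch, not the project's code: the entry widths (HASH_WIDTH, POSITION_WIDTH), the struct index_entry type and the find_entry function are assumptions made for illustration, and the real lookup script searches the index file on disk rather than an array.

```c
#include <stddef.h>
#include <string.h>

/* Assumed layout, for illustration only. The real widths are defined in the
 * project's sources and may differ. */
#define HASH_WIDTH 8      /* leading hash bytes stored per entry (assumption) */
#define POSITION_WIDTH 6  /* bytes of wordlist offset per entry (assumption) */

struct index_entry {
    unsigned char hash[HASH_WIDTH];
    unsigned char position[POSITION_WIDTH];
};

/* Returns the array index of an entry whose stored hash bytes equal target,
 * or -1 if there is none. entries must be sorted ascending in memcmp order. */
long find_entry(const struct index_entry *entries, size_t count,
                const unsigned char *target)
{
    size_t lo = 0, hi = count;  /* search the half-open range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int c = memcmp(entries[mid].hash, target, HASH_WIDTH);

        if (c == 0)
            return (long)mid;
        if (c < 0)
            lo = mid + 1;       /* target sorts after entries[mid] */
        else
            hi = mid;           /* target sorts before entries[mid] */
    }
    return -1;
}
```

Each probe halves the remaining range, which is why the index has to be sorted before any lookups are attempted.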

Building and Testing

The PHP scripts do not need to be built. To build the C programs, run make.

To run the tests, just run make test, and then clean up the files the tests created with make testclean.

Indexing a Dictionary

Suppose you have a password dictionary in the file words.txt and you would like to index it for MD5 and SHA1 cracking.

First, create the MD5 and SHA1 indexes:

$ php createidx.php md5 words.txt words-md5.idx
$ php createidx.php sha1 words.txt words-sha1.idx

Next, use the sortidx program to sort the indexes:

$ ./sortidx -r 256 words-md5.idx
$ ./sortidx -r 256 words-sha1.idx

The -r parameter is the maximum amount of memory sortidx is allowed to use in MiB. The more memory you let it use, the faster it will go. Give it as much as your system will allow.

You now have everything you need to crack MD5 and SHA1 hashes quickly.

Cracking Hashes

Once you have generated and sorted the index, you can use the LookupTable class to crack hashes. See test/test.php for an example of how to use it.

Adding Words

Once a wordlist has been indexed, you cannot modify the wordlist file without breaking the indexes. Appending to the wordlist is safe in that it will not break the indexes, but the appended words will not be indexed unless you re-generate the index. There is currently no way to add words to an index without re-generating the entire index.

crackstation-hashdb's People

Contributors

defuse, sylwit

crackstation-hashdb's Issues

Compiling sortidx.c gives warnings, doesn't seem to work

When compiling sortidx.c some warnings are being displayed.
The executable hangs, no output is displayed...

$ make
gcc -O2 sortidx.c -o sortidx
sortidx.c: In function ‘main’:
sortidx.c:69:9: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘int64_t’ [-Wformat=]
         printf("Invalid buffer size (%d).\n", bufsize);
         ^
sortidx.c:88:9: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘int64_t’ [-Wformat=]
         printf("Cannot allocate buffer (%d bytes).\n", bufsize);
         ^
sortidx.c: In function ‘freadIndexEntryAt’:
sortidx.c:325:10: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
     fread(out->hash, sizeof(unsigned char), INDEX_HASH_WIDTH, file);
          ^
sortidx.c:326:10: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
     fread(out->position, sizeof(unsigned char), INDEX_POSITION_WIDTH, file);
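
The two kinds of warning in this log could be addressed along the following lines. This is only a sketch under assumptions: format_bufsize_error and read_exact are hypothetical helpers, not functions from sortidx.c, and nothing in the report establishes that the warnings are what makes the executable hang.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Formats the message with the PRId64 macro so that the conversion
 * specifier matches int64_t on every platform. Returns snprintf's result. */
int format_bufsize_error(char *out, size_t outlen, int64_t bufsize)
{
    return snprintf(out, outlen,
                    "Invalid buffer size (%" PRId64 ").", bufsize);
}

/* Hypothetical helper: reads exactly n bytes into dst.
 * Returns 1 on success and 0 on a short read or an I/O error, so the
 * caller can no longer ignore the result of fread silently. */
int read_exact(FILE *file, unsigned char *dst, size_t n)
{
    return fread(dst, sizeof(unsigned char), n, file) == n;
}
```

A caller would test the return value of read_exact and stop with an error message instead of continuing with a partially filled entry.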

bug in checksort.c [PULL REQUEST]

Since the current and max structs are not initialized in main(), the checksort program would often report that an index was not sorted, yet when run repeatedly on the same index it would sometimes report that it was.

Adding memset(&current,0,sizeof(current)) and memset(&max,0,sizeof(max)) fixed this.

If you are still maintaining this, shall I submit a patch? (I also added checks on the fread() calls to quiet down GCC warnings)
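
The effect of the proposed memset can be illustrated with a small sorted-order check. This is a sketch, not the code from checksort.c: the entry layout and the is_sorted function are assumptions, and the check runs over an in-memory array instead of reading the index file.

```c
#include <stddef.h>
#include <string.h>

/* Assumed layout, for illustration only. */
#define HASH_WIDTH 8
#define POSITION_WIDTH 6

struct index_entry {
    unsigned char hash[HASH_WIDTH];
    unsigned char position[POSITION_WIDTH];
};

/* Returns 1 if entries are in non-decreasing hash order, 0 otherwise. */
int is_sorted(const struct index_entry *entries, size_t count)
{
    struct index_entry max;

    /* Without this memset, max holds indeterminate stack contents, so the
     * first comparison below could fail or pass at random. An all-zero
     * hash is the smallest possible value in memcmp order. */
    memset(&max, 0, sizeof(max));

    for (size_t i = 0; i < count; i++) {
        if (memcmp(entries[i].hash, max.hash, HASH_WIDTH) < 0)
            return 0;           /* this entry sorts before its predecessor */
        max = entries[i];       /* largest hash seen so far */
    }
    return 1;
}
```

Uninitialized locals explain the symptom described above: the result depends on whatever bytes happen to be on the stack for that run.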

NTLM sort takes forever

Sorting an NTLM index is taking forever. I'm guessing that's because there are tons of passwords with the same 7-character prefix in a row, which causes quicksort to run in O(n^2) time instead of O(n log n).
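
If long runs of equal keys really are the cause, one well-known mitigation is a quicksort with three-way partitioning, which groups all keys equal to the pivot in a single pass and never recurses into them. The sketch below is an assumption-laden illustration, not the algorithm sortidx uses: the entry layout and sort_entries are hypothetical, and it sorts an in-memory array only.

```c
#include <stddef.h>
#include <string.h>

/* Assumed layout, for illustration only. */
#define HASH_WIDTH 8
#define POSITION_WIDTH 6

struct index_entry {
    unsigned char hash[HASH_WIDTH];
    unsigned char position[POSITION_WIDTH];
};

static void swap_entries(struct index_entry *a, struct index_entry *b)
{
    struct index_entry tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Sorts entries[0..count) by hash using three-way partitioning. */
void sort_entries(struct index_entry *entries, size_t count)
{
    while (count > 1) {
        struct index_entry pivot = entries[count / 2];
        size_t lt = 0, i = 0, gt = count;

        /* Invariant: [0, lt) < pivot, [lt, i) == pivot, [gt, count) > pivot */
        while (i < gt) {
            int c = memcmp(entries[i].hash, pivot.hash, HASH_WIDTH);
            if (c < 0)
                swap_entries(&entries[lt++], &entries[i++]);
            else if (c > 0)
                swap_entries(&entries[i], &entries[--gt]);
            else
                i++;
        }

        /* Recurse into the smaller side and loop on the larger one, which
         * keeps the recursion depth logarithmic in count. */
        size_t left = lt, right = count - gt;
        if (left < right) {
            sort_entries(entries, left);
            entries += gt;
            count = right;
        } else {
            sort_entries(entries + gt, right);
            count = left;
        }
    }
}
```

With this partitioning scheme an input consisting entirely of one repeated key is finished after a single linear pass.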

PHP Fatal error: Out of memory

After downloading crackstation.txt.gz and extracting it to realuniq.lst, executing the following command:

php createidx.php md5 realuniq.lst realuniq-md5.idx

produces this fatal error:

So far, completed 99100000 lines (1.255GB) ...
PHP Fatal error: Out of memory (allocated 4194304) (tried to allocate 2098451 bytes) in /cygdrive/e/crackstation-hashdb-master/createidx.php on line 74

Support partial matches

To use this code on crackstation.net it needs to support partial (prefix) matches. Add a new constant, which is the number of leading bytes to compare, and only compare that many bytes in hashcmp. Then make $results an array of pairs ($word, partial/full) instead of a plain array of words.

So the overall process for cracking a hash is:

For each supported hash type:
    $results = CrackForThatType($target_hash)
    foreach ($results as $result) {
        output $result.word together with its $result.partial status
    }
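
The prefix comparison and the partial/full distinction described in this issue might look like the following C sketch. The issue concerns the PHP lookup code, so this is only an illustration of the idea: PREFIX_BYTES, hashcmp_prefix, classify_match and the match_kind enum are all hypothetical names, and the value 8 is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical constant: number of leading hash bytes to compare. */
#define PREFIX_BYTES 8

enum match_kind { NO_MATCH = 0, PARTIAL_MATCH = 1, FULL_MATCH = 2 };

/* Orders two hashes by their first PREFIX_BYTES bytes only. */
int hashcmp_prefix(const unsigned char *a, const unsigned char *b)
{
    return memcmp(a, b, PREFIX_BYTES);
}

/* Classifies a candidate against the target hash. Both buffers hold len
 * bytes and len must be at least PREFIX_BYTES. */
enum match_kind classify_match(const unsigned char *target,
                               const unsigned char *candidate, size_t len)
{
    if (hashcmp_prefix(target, candidate) != 0)
        return NO_MATCH;
    /* The prefixes agree, so compare the complete hashes to tell a full
     * match from a partial one. */
    return memcmp(target, candidate, len) == 0 ? FULL_MATCH : PARTIAL_MATCH;
}
```

Returning the match kind alongside each word lets the caller decide whether to show or discard partial matches.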

LM code is insane

... the IV is unnecessary, and instead of doing the loop with the $i < $len ? ... thing you can just pad the input out to 14 characters with null bytes. Using fewer arrays will probably speed it up too.
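
The padding step suggested here can be sketched as below. It covers only the input preparation that LM hashing performs (uppercase, truncate or null-pad to 14 bytes), not the DES steps that follow, and lm_prepare is a hypothetical name rather than a function from this project, whose LM code is written in PHP.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define LM_INPUT_LEN 14  /* LM operates on exactly 14 bytes of input */

/* Writes the uppercased password into out, truncated to 14 bytes and
 * padded with null bytes. out must have room for LM_INPUT_LEN bytes. */
void lm_prepare(const char *password, unsigned char *out)
{
    size_t len = strlen(password);

    if (len > LM_INPUT_LEN)
        len = LM_INPUT_LEN;

    memset(out, 0, LM_INPUT_LEN);           /* null padding up front */
    for (size_t i = 0; i < len; i++)
        out[i] = (unsigned char)toupper((unsigned char)password[i]);
}
```

Padding once before the main loop removes the need for a per-character bounds check later on.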

crack hash new create database

I was able to create a database
php createidx.php md5 words.txt words-md5.idx

Then I sorted it out
./sortidx -r 4048 words-md5.idx

Checked via test.php

But how do I specify the hash to check, for example: apple (md5: 1F3870BE274F6C49B3E31A0C6728957F)

php test.php md5
Successfully cracked [apple].
Successfully cracked [apple] (as partial match).

How to specify 1F3870BE274F6C49B3E31A0C6728957F to find an apple?

Sortidx tries to load the entire DB into memory instead of just sections, making the -r parameter irrelevant.

I have a database with 630,000,000 entries, and a modified HashDB and sortidx that uses a 4-byte hash instead, bringing us down to 10 bytes/hash.

Small databases are sorted just fine, and the memory increases to the set limit.

However, when I try to process my large database, the memory usage never goes above 0.2 MB.
I have added Win x64 support to sortidx to see if it made any difference on Windows, but the memory usage is still 0.2 MB at most.

SortIdx has been running for 2 days now, and I'm not sure if it has made any real progress.
For reference, the database took about 10 minutes to generate.

Got any idea?

Lazy Mode

Possible Feature: Return as soon as the first partial or non-partial match is found. This would make the complexity of running a query more predictable, and prevent DoS on crackstation.net when tons of matches are returned (e.g. in the case of LM with a common prefix).

Unfortunately, it's not a good idea to support the better version of this feature, which is "return a full match if it is found, otherwise return the first partial match" because to find the full match you have to scan through all of the partial matches anyway, so we might as well return them (and let the caller decide to disregard them).

NTLM bug?

On crackstation.net, try to crack...

0cb6948805f797bf2a82807973b89537
0e8231621f574d3636255ff36dd86c9c

The first one gives yellow and blank output (should be test), second one is correctly cracked as test2. Maybe it just happens to collide?

Weakpass Sort Error

I ran createidx on the full weakpass2 collection. The sort took about a day to run.

However, when I checked the sort, it said the idx file was not sorted.

Can anyone give me some advice?
