cameronlonsdale / lantern
Cryptanalysis library for breaking classical ciphers
License: MIT License
This is a big job; it will be done piecewise.
The current hardcoded solution takes too long for examples which can be cracked with only 2 trials. I need to come up with a way to estimate the best settings for a given ciphertext. This might be a job for torched (or whatever I'm calling the CLI). I could use ciphertext length as the metric.
The README needs updating, and I want docs which show how to use the library, with example code.
Make dynamicdict behave the way its name suggests. It's not even a dict right now.
Update the Vigenere example text to showcase the diversity better. Investigate the bug where the Hacker's Manifesto with punctuation didn't decrypt the first paragraph properly.
(Fixed! It was due to digits not being removed. It's a messy fix, but it works for now until I can build nicer abstractions in later updates.)
Investigate speed and see if it's possible to speed anything up.
I think the speed is reasonable: 2 seconds for simple substitution / Vigenere. If I wanted speed I wouldn't use Python. I made some small changes to free up some bottlenecks, but I'm pushing the limits of the language. I might investigate rewriting several key functions in a compiled language in the future.
Investigate usability. See if the alternating substitution cipher code can be simplified and made more intuitive.
Update the README to show that Python 3.4 or newer is required (same for the docs).
Review the documentation to make sure it's configured properly.
Set up readthedocs for stable and development releases.
Update the version to 0.1 and upload to PyPI.
Currently scoring functions get run in sequence, however for certain functions, this process will take too long.
I would like to support an option to only run a subsequent function if there is not a clear correct decryption. One way to do this could be to look at the scoring distribution for the decryptions and run more scoring rounds if the values are too close together.
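One way the idea above could look is sketched below. The function name `score_adaptively`, the `min_gap` threshold, and the "gap as a share of the score spread" rule are all assumptions made for illustration, not part of the library; any other measure of how close the scores are could be swapped in.

```python
def score_adaptively(candidates, scorers, min_gap=0.5):
    """Apply scorers one at a time, stopping once one candidate is clearly ahead.

    candidates: list of candidate plaintext strings
    scorers:    list of functions str -> float, cheapest first (higher is better)
    min_gap:    hypothetical threshold; the lead of the top candidate over the
                runner-up, as a fraction of the total spread of scores
    Returns (best_candidate, rounds_used).
    """
    totals = [0.0] * len(candidates)
    rounds_used = 0
    for scorer in scorers:
        totals = [total + scorer(text) for total, text in zip(totals, candidates)]
        rounds_used += 1
        ranked = sorted(totals, reverse=True)
        if len(ranked) < 2:
            break  # nothing to compare against
        spread = ranked[0] - ranked[-1]
        # Stop early if the leader is well separated from the runner-up
        if spread > 0 and (ranked[0] - ranked[1]) / spread >= min_gap:
            break
    best_index = max(range(len(candidates)), key=totals.__getitem__)
    return candidates[best_index], rounds_used
```

With this shape, an expensive scorer placed last in `scorers` only runs when the cheaper ones leave the top decryptions too close to call.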
quipquip.com puts whitespace in the resulting plaintext for you even if there are no whitespace hints. I'd like to add this feature.
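A minimal sketch of one way to reinsert whitespace, assuming a known word list is available. This is a plain dynamic-programming split that prefers the fewest words; it is not necessarily how that site does it, and the function name `segment` and the `max_word_len` parameter are hypothetical.

```python
def segment(text, words, max_word_len=20):
    """Split text into words from the given set, using as few words as possible.

    Returns a list of words, or None if the text cannot be fully covered.
    """
    n = len(text)
    # best[i] holds the shortest word list covering text[:i], or None
    best = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            piece = text[start:end]
            if best[start] is not None and piece in words:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[n]


# Example with a tiny, made-up word list
words = {"the", "there", "here", "is", "a", "cat", "at"}
print(" ".join(segment("thereisacat", words)))
```

A real version would likely weight words by frequency rather than just counting them, so that common words win over rare ones of the same length.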
LanguageFrequency and LanguageNgrams are the same thing, so why do I need two? They need to be refactored into one piece of code, with the nicest name that makes it understandable how it can be used.
I need to think about the range of values that can result from a fitness function, specifically in relation to the corpus problem.
This function is used very often when decrypting Vigenere, and its slowness has impacted the overall speed of Vigenere decryption. I need to look into speeding it up, perhaps with the help of itertools, or even by writing a snippet of compiled C/D/Rust/whatever.
Currently I predefine several ngram classes so you can import and go, but the issue is that when you import one, every one of them has to load, which slows down the program.
I need to rewrite this code so that when you want quadgrams, you only load quadgrams.
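A rough sketch of on-demand loading, assuming each n-gram table lives in its own text file of `NGRAM COUNT` lines. The file names, the `NGRAM_FILES` mapping, and the function names are hypothetical and the library's real data layout may differ.

```python
from functools import lru_cache

# Hypothetical mapping from n-gram name to data file
NGRAM_FILES = {
    "unigrams": "english_unigrams.txt",
    "quadgrams": "english_quadgrams.txt",
}


def parse_ngram_lines(lines):
    """Parse lines of the form 'NGRAM COUNT' into a dict."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            table[parts[0]] = int(parts[1])
    return table


@lru_cache(maxsize=None)
def load_ngrams(name):
    """Read one n-gram table from disk the first time it is asked for."""
    with open(NGRAM_FILES[name]) as handle:
        return parse_ngram_lines(handle)
```

Nothing is read at import time here. Calling `load_ngrams("quadgrams")` opens only the quadgram file, and repeated calls return the cached table.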
The key should be returned with each decryption. Rather than using a tuple and being coupled to the order, we should make a lightweight class to group a decryption's attributes.
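As a sketch, a `namedtuple` from the standard library would give named access while staying lightweight. The class name `Decryption` and its field names are assumptions for illustration only.

```python
from collections import namedtuple

# Hypothetical field names; the library may settle on different ones
Decryption = namedtuple("Decryption", ["plaintext", "key", "score"])

results = [
    Decryption(plaintext="KHOOR", key=0, score=-40.2),
    Decryption(plaintext="HELLO", key=3, score=-12.5),
]

# Callers read attributes by name instead of relying on tuple order
best = max(results, key=lambda decryption: decryption.score)
print(best.key, best.plaintext)
```

A namedtuple still unpacks like a plain tuple, so existing code that does `plaintext, key, score = result` would keep working.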
The substitution cipher's current implementation is quite slow. I would like it to be sped up as much as is possible in Python.
It's not essential that __init__ and __call__ are documented. It might be better to just have examples instead.
While it's not necessarily needed for cryptanalysis, for test purposes and building challenges it would be nice to be able to encrypt within the same framework as well.
There is an issue with chi_squared whereby you can be penalised easily for silly mistakes. For instance, trying to score the text "abc" against english.unigrams won't work because the unigrams are capital letters. I need to find a way that users aren't penalised for this; currently they are. So RIP.
A similar issue is the zero division error in chi squared if source_len is 0, which can happen when your source frequency map has no characters in common with the target. In that case, an error should probably be raised.
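Both problems could be handled along these lines. This is a standalone sketch of a chi-squared statistic, not the library's actual `chi_squared`; it assumes the target map uses uppercase keys and relative frequencies, and the choice of `ValueError` is also an assumption.

```python
from collections import Counter


def chi_squared(text, target_freq):
    """Chi-squared statistic of text against a map of symbol -> relative frequency.

    Assumes target_freq keys are uppercase. Lower values mean a closer fit.
    """
    # Normalise case so "abc" and "ABC" are scored identically
    counts = Counter(ch for ch in text.upper() if ch in target_freq)
    source_len = sum(counts.values())
    if source_len == 0:
        # Fail loudly instead of dividing by zero
        raise ValueError("text shares no symbols with the target frequency map")
    total = 0.0
    for symbol, probability in target_freq.items():
        if probability <= 0:
            continue  # skip symbols the target never expects
        expected = probability * source_len
        total += (counts[symbol] - expected) ** 2 / expected
    return total
```

For example, with a toy target of `{"A": 0.5, "B": 0.5}`, the texts `"ab"` and `"AB"` both score 0.0, while a text of only digits raises `ValueError`.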
I'd like to be able to use the shift decryptor to automatically crack a byte-shifted image by checking for the correct header bytes at the start of the image.
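A small sketch of the header check, independent of the library's shift decryptor. It assumes the image is a PNG (whose 8-byte signature is fixed) and that every byte was shifted by the same amount modulo 256; the function name `crack_byte_shift` is hypothetical.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # standard 8-byte PNG signature


def crack_byte_shift(data, magic=PNG_MAGIC):
    """Try all 256 byte shifts and return (shift, decrypted) for the one whose
    output starts with the expected header bytes, or None if none match."""
    if len(data) < len(magic):
        return None
    header = data[:len(magic)]
    for shift in range(256):
        # Only the header needs decrypting to test a candidate shift
        candidate = bytes((byte - shift) % 256 for byte in header)
        if candidate == magic:
            decrypted = bytes((byte - shift) % 256 for byte in data)
            return shift, decrypted
    return None
```

Other formats could be supported by passing a different `magic` value, for example `b"\xff\xd8\xff"` for JPEG.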
Not sure why, but this does not get decrypted properly
N VVWJN ENB A PWINRG ITNNG WE VEEKRWAEG VW FRCQJL WICJN. GPR OUAKGROAA BO A AQASA VVPUUQMQ NSCQBWATM, QNCRXGROA, IAM SHZCAIFM NCTNKXB. TUMVA CBDRAT ZMGQOQA BO WNOVWG VZENGHTNA WNZSJRR ERAE QMRVEQ LVBHBVBAAOTR JNQ JRWENBU CHR PBWOE WS CHR ANVUEIV. CHBCTQ SUQAXBV XEXPRZ, NB SCMPRAYTL CRNQAND FXVNS NVQ VEEKRWAEQRB, ACXRJRRL VW TUM 15GQ CRVGDRL LHAIAO GQE FMAPOXC CNRVWQ (15CH–17GP PNNGCEREF), IACEPMQNNGA ZJY UIIN EKQFCEQ IF NAETL JS GPR 12CH PMACUEG."
Also, you should get better error checking for when ChiSquared does not match the text that's provided.
Using ngram analysis, simple substitution works well for ciphertexts of 250 characters or more. The current thought is to try a corpus alongside ngrams to check for word existence; however, the current implementation of corpus might need some modification to deal with ciphertexts that have no whitespace hints.