This code plays around with statistics you can calculate from words using the Levenshtein distance.
In particular, I'm interested in seeing if we can detect that any particular word is a "scanning error", as well as looking at other metrics.
I think scannos will have the following properties:
1) They'll have a low frequency (1-3 occurrences), or a low frequency relative to the "correct word"
2) They'll have a real word within Levenshtein distance 1-2 of them
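
As a rough sketch of how these two properties could be checked together (not necessarily how the rest of this code does it), the snippet below counts word frequencies and flags rare words that sit within a small edit distance of a more frequent word. The function names `levenshtein` and `scanno_candidates`, the `max_ratio` threshold of 0.05, and the idea of treating any more frequent word in the same text as the "real word" are all assumptions made for illustration; a proper dictionary lookup would be a better stand-in for "real word".

```python
from collections import Counter


def levenshtein(a, b):
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def scanno_candidates(words, max_count=3, max_ratio=0.05, max_distance=2):
    """Return (suspect, likely_correct, suspect_count, correct_count) tuples.

    Assumption: a more frequent word in the same text stands in for the
    "real word". max_ratio is a hypothetical threshold, not a tuned value.
    """
    counts = Counter(words)
    results = []
    for word, n in counts.items():
        for other, m in counts.items():
            if other == word or m <= n:
                continue  # only compare against strictly more frequent words
            # Property 1: rare in absolute terms, or rare relative to `other`.
            rare = n <= max_count or n / m <= max_ratio
            # Property 2: a close neighbour by edit distance.
            if rare and levenshtein(word, other) <= max_distance:
                results.append((word, other, n, m))
    return results


if __name__ == "__main__":
    sample = ["the"] * 50 + ["tbe"] + ["cat"] * 10
    print(scanno_candidates(sample))
```

Note that this compares every pair of distinct words, so it is quadratic in the vocabulary size; that is fine for playing around but would need an index or a length filter on a large corpus.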