This project is forked from jhave/upst
Useful Python Scripts for Texts. Read more about how I use these in class at http://libbyh.com/2014/10/16/introducing-text-analytics-to-undergraduates/
Python 99.49%
Jupyter Notebook 0.51%
upst's Issues
Need to strip Project Gutenberg front matter from texts. Related to #1
Currently works for just one word at a time. Add support for word lists so that it's easier to compare word dispersions.
Project Gutenberg texts have line breaks within paragraphs and an extra empty line between paragraphs, so paragraph counts from stats.py are off.
Pretty sure it's not detecting alternate capitalizations or spellings of "chapter".
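Three of the issues above (front matter, paragraph counts, chapter detection) come down to text preprocessing, so a rough sketch may help frame possible fixes. This is not the code in this repository: the function names `strip_gutenberg_boilerplate`, `split_paragraphs`, and `find_chapter_headings` are hypothetical, and the marker patterns assume the common `*** START OF THIS/THE PROJECT GUTENBERG EBOOK ... ***` wording, which varies between files. The chapter pattern only covers "Chapter" or "Chap." followed by a Roman or Arabic numeral, and it may over-match ordinary prose that starts a line the same way.

```python
import re

# Assumed marker wording; real Project Gutenberg files vary, so treat
# these patterns as a starting point rather than a complete list.
START_MARKER = re.compile(
    r"^\*\*\*\s*START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.IGNORECASE | re.MULTILINE,
)
END_MARKER = re.compile(
    r"^\*\*\*\s*END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.IGNORECASE | re.MULTILINE,
)

# Case-insensitive "Chapter" or "Chap." followed by a Roman or Arabic numeral.
CHAPTER_HEADING = re.compile(
    r"^[ \t]*chap(?:ter)?\.?[ \t]+(?:[ivxlcdm]+|\d+)\b",
    re.IGNORECASE | re.MULTILINE,
)


def strip_gutenberg_boilerplate(text):
    """Return the text between the start and end markers, if present."""
    start = START_MARKER.search(text)
    end = END_MARKER.search(text)
    begin = start.end() if start else 0
    finish = end.start() if end else len(text)
    return text[begin:finish].strip()


def split_paragraphs(text):
    """Split on blank lines and rejoin hard-wrapped lines in each paragraph."""
    text = text.replace("\r\n", "\n")
    blocks = re.split(r"\n\s*\n", text)
    return [" ".join(block.split()) for block in blocks if block.strip()]


def find_chapter_headings(text):
    """Return the matched heading text for each line that looks like a chapter."""
    return [m.group(0).strip() for m in CHAPTER_HEADING.finditer(text)]


# Small made-up sample, not taken from a real Project Gutenberg file.
sample = (
    "Title: Example\n\n"
    "*** START OF THIS PROJECT GUTENBERG EBOOK EXAMPLE ***\n\n"
    "CHAPTER I.\n\n"
    "First line of a paragraph\nthat wraps onto a second line.\n\n"
    "Chap. 2\n\n"
    "Another paragraph.\n\n"
    "*** END OF THIS PROJECT GUTENBERG EBOOK EXAMPLE ***\n"
    "License text.\n"
)

body = strip_gutenberg_boilerplate(sample)
paragraphs = split_paragraphs(body)
headings = find_chapter_headings(body)
```

Note that splitting on blank lines counts headings as paragraphs too; a stats script would probably want to filter those out before reporting a paragraph count.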