The task was to implement two features:
- Calculate the total number of times each word has been tweeted.
- Calculate the median number of unique words per tweet, and update this median as tweets come in.
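The second feature is the less obvious of the two, so here is a minimal sketch of one way a running median could be maintained. It is an illustration under stated assumptions, not necessarily how median_unique.py does it: words are assumed to be separated by whitespace, and the function names (unique_word_count, median_from_histogram, running_medians) are hypothetical. The sketch keeps a histogram of unique-word counts, which works because the number of unique words in a tweet is a small integer.

```python
from collections import Counter


def unique_word_count(tweet):
    """Number of distinct whitespace-separated words in one tweet."""
    return len(set(tweet.split()))


def median_from_histogram(histogram, total):
    """Median of the values recorded in histogram (value -> frequency)."""
    lower_rank = (total - 1) // 2  # 0-based rank of the lower middle value
    upper_rank = total // 2        # 0-based rank of the upper middle value
    lower = upper = None
    seen = 0
    for value in sorted(histogram):
        seen += histogram[value]
        if lower is None and seen > lower_rank:
            lower = value
        if seen > upper_rank:
            upper = value
            break
    # For an odd total both ranks coincide, so this is just the middle value.
    return (lower + upper) / 2.0


def running_medians(tweets):
    """Yield the updated median after each tweet is processed."""
    histogram = Counter()
    for total, tweet in enumerate(tweets, start=1):
        histogram[unique_word_count(tweet)] += 1
        yield median_from_histogram(histogram, total)
```

For example, list(running_medians(["a b c", "a a", "x y z w w"])) sees 3, 1 and 4 unique words in turn and yields the medians 3.0, 2.0 and 3.0. Because the histogram has one key per distinct unique-word count, each update stays cheap however many tweets have been read.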
For this coding challenge, I wrote two simple programs, one for each feature.
Both programs are placed in a directory named src. The first program, words_tweeted.py, outputs the results of the first feature to a text file named ft1.txt in a directory named tweet_output.
The second program, median_unique.py, outputs the results of the second feature to a text file named ft2.txt in the same tweet_output directory.
Both programs take their input from a tweet file named tweets.txt in a directory named tweet_input. A shell script named run.sh is also provided in the main directory; it runs both programs.
My solution requires two modules from the Python standard library:
- sys
- collections
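As a rough illustration of how these two modules could cover the first feature, the sketch below counts words with collections.Counter and takes file paths from sys.argv. It is not the actual code of words_tweeted.py. The function names, the command-line convention (input path, then output path), and the alphabetically sorted "word count" output format are all assumptions made for the example.

```python
import sys
from collections import Counter


def count_words(lines):
    """Count every whitespace-separated word across an iterable of lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def format_counts(counts):
    """One 'word count' string per word, sorted by word (assumed format)."""
    return ["{} {}".format(word, counts[word]) for word in sorted(counts)]


def main(argv):
    # Hypothetical usage: python words_tweeted.py <input file> <output file>
    input_path, output_path = argv[1], argv[2]
    with open(input_path) as infile:
        counts = count_words(infile)
    with open(output_path, "w") as outfile:
        outfile.write("\n".join(format_counts(counts)) + "\n")
```

Under these assumptions, a script would call main(sys.argv) inside the usual __main__ guard, and run.sh could pass tweet_input/tweets.txt and tweet_output/ft1.txt as the two arguments.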