More information can be found in my post "Tweet-to-tweet learning" at http://musella.github.io
-
Tweets are streamed by the tweet_streamer.py script.
-
Data cleaning and preprocessing are performed by the preprocess.ipynb notebook.
-
The script hash_embed.sh trains a GloVe embedding on the dataset's hashtags. The embedding can be visualized with the notebook hashtag_embedding.ipynb.
-
The sequence generator is trained using train_rnn.ipynb.
-
Word sequences are then generated through beam search and served using Flask (flask_app/app.py).
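
For readers unfamiliar with beam search, here is a minimal sketch of the idea. It is not the code used in this repository: the function `beam_search`, the `TOY_MODEL` bigram table, and the `<s>` / `</s>` markers are all hypothetical, and the toy table stands in for the trained RNN, which would supply the next-word probabilities in practice.

```python
import math
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[float, List[str]]  # (log-probability, word sequence)


def beam_search(
    next_probs: Callable[[List[str]], Dict[str, float]],
    start: str,
    end: str,
    beam_width: int = 3,
    max_len: int = 10,
) -> List[Hypothesis]:
    """Return hypotheses sorted from most to least probable.

    next_probs(seq) must return a mapping word -> probability of that
    word following seq (in a real setup this would query the RNN).
    """
    beams: List[Hypothesis] = [(0.0, [start])]
    finished: List[Hypothesis] = []

    for _ in range(max_len):
        # Expand every live hypothesis by every possible next word.
        candidates: List[Hypothesis] = []
        for score, seq in beams:
            for word, p in next_probs(seq).items():
                if p > 0.0:
                    candidates.append((score + math.log(p), seq + [word]))
        if not candidates:
            break

        # Keep only the beam_width best candidates overall.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_width]:
            if seq[-1] == end:
                finished.append((score, seq))  # complete sequence
            else:
                beams.append((score, seq))     # still growing
        if not beams:
            break

    # Sequences cut off by max_len are returned as well.
    finished.extend(beams)
    return sorted(finished, key=lambda c: c[0], reverse=True)


# Hypothetical toy bigram model: the next word depends only on the last word.
TOY_MODEL: Dict[str, Dict[str, float]] = {
    "<s>": {"good": 0.6, "bad": 0.4},
    "good": {"morning": 0.7, "</s>": 0.3},
    "bad": {"day": 0.9, "</s>": 0.1},
    "morning": {"</s>": 1.0},
    "day": {"</s>": 1.0},
}


def toy_next_probs(seq: List[str]) -> Dict[str, float]:
    return TOY_MODEL.get(seq[-1], {})


results = beam_search(toy_next_probs, "<s>", "</s>", beam_width=2, max_len=5)
for log_prob, words in results:
    print(f"{math.exp(log_prob):.2f}", " ".join(words))
```

Scores are summed log-probabilities rather than multiplied probabilities, which avoids numerical underflow on long sequences. A larger `beam_width` explores more alternatives at a proportionally higher cost per step.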