chatbots-of-me

FB chat data chatbots

Two chatbots trained on my Facebook Messenger chat data to talk like me. One uses the Doc2Vec implementation in the Python library gensim; the other is built on the ChatterBot library. Because my chat data is not plentiful (~70k messages sent by me and ~35k decent input-output pairs), the strategy was to take the user's input sentence, find the most similar recorded input, and return the corresponding recorded output.
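The lookup strategy described above can be sketched in a few lines. This is only an illustration: it uses plain bag-of-words cosine similarity as a stand-in for Doc2Vec or ChatterBot similarity, and the names `tokenize`, `cosine`, `reply` and the `PAIRS` sample data are hypothetical, not taken from the scripts in this repository.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def reply(user_input, pairs):
    """Return the recorded output whose recorded input is most similar."""
    query = Counter(tokenize(user_input))
    _, best_output = max(
        pairs, key=lambda pair: cosine(query, Counter(tokenize(pair[0])))
    )
    return best_output


# Hypothetical input-output pairs standing in for real chat data.
PAIRS = [
    ("how are you", "pretty good, you?"),
    ("want to grab lunch", "sure, where?"),
    ("did you finish the homework", "not yet, tonight"),
]
```

Calling `reply("How are you doing?", PAIRS)` picks the pair whose recorded input shares the most words with the query. A real bot would replace `cosine` over word counts with similarity between learned document vectors.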

I also analysed my chat data, which can be found here.

The bots

Doc2Vec is well suited for this task (and was very fast), but it likely needs a corpus much larger than my chat history. I also did not experiment thoroughly with parameters such as the word vector dimension count. ChatterBot prefers to be trained on full conversations, but here it had to be trained on isolated input-output pairs so that it would learn "character" only from me. This may have been one of the reasons it was much slower. To keep responses from taking minutes, the ChatterBot-based bot was trained on only 20% of the data. Despite that, its responses generally seemed slightly more on-topic, and it was less prone to repeating itself than the Doc2Vec-based bot.
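Training on a fraction of the data can be sketched as below. The README does not say how the 20% was chosen, so uniform random sampling with a fixed seed is an assumption here, and the function name `subsample` is hypothetical.

```python
import random


def subsample(pairs, fraction=0.2, seed=0):
    """Return a random subset holding roughly `fraction` of the pairs."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not pairs:
        return []
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    k = max(1, round(len(pairs) * fraction))
    return rng.sample(pairs, k)
```

The fixed seed means the same subset is drawn on every run, which makes it easier to compare bots trained at different times.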

[Images: Training, Chatting]

Running

My chat data and the models trained on it have not been included, for obvious reasons. To run the bots on your own data, first change your FB language to English and the time format to 24h, then download your FB data (instructions). Place the messages folder from the download into the same folder as the scripts and delete all subfolders of messages, leaving only the html files. Then run, in succession, scrape.py, datagen.py, train.py and finally chat.py.
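The cleanup and run steps above could be automated roughly as follows. This is a sketch, not part of the repository: the helper names `remove_subfolders` and `run_pipeline` are hypothetical, and it assumes the four scripts take no command-line arguments and are run from the folder that contains them.

```python
import shutil
import subprocess
import sys
from pathlib import Path

# Script names and their order are taken from the README.
SCRIPTS = ["scrape.py", "datagen.py", "train.py", "chat.py"]


def remove_subfolders(messages_dir):
    """Delete every subfolder of messages_dir, keeping top-level files."""
    removed = []
    for entry in sorted(Path(messages_dir).iterdir()):
        if entry.is_dir():
            shutil.rmtree(entry)
            removed.append(entry.name)
    return removed


def run_pipeline(scripts=SCRIPTS, cwd="."):
    """Run each script in order, stopping at the first failure."""
    for script in scripts:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run([sys.executable, script], cwd=cwd, check=True)
```

From the folder holding the scripts, `remove_subfolders("messages")` followed by `run_pipeline()` would perform the steps in order. Note that `remove_subfolders` deletes data irreversibly, so it is worth running it on a copy of the download.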

January 2018
Andreas Vija
