DiscordWordCloud

A Python 3.9 Discord bot that generates a word cloud (now with emojis!) for each Discord user.

How to run it

  • Clone the project wherever you want
  • Install Python 3.9 from https://www.python.org/ (an older version will probably work too)
  • Install the requirements listed in requirements.txt with pip
  • Add a token.txt file at the root of the project
  • Go to https://discordapp.com/developers/applications/ and create your app
    • Add a bot user to it and paste its token in token.txt
    • Enable SERVER MEMBERS INTENT in the Bot tab
    • Invite the bot with https://discord.com/api/oauth2/authorize?client_id=CLIENT_ID&permissions=0&scope=bot, replacing CLIENT_ID with the Client ID of your app
  • Run main.py with Python 3.9

And it should run!
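To make the token and invite steps concrete, here is a minimal sketch using only the standard library. The helper names `read_token` and `invite_url` are hypothetical and are not taken from main.py; the sketch only assumes that the token lives in token.txt at the project root and that the invite link has the shape given above.

```python
from pathlib import Path
from urllib.parse import urlencode


def read_token(path="token.txt"):
    """Return the bot token stored in `path`, without surrounding whitespace.

    Hypothetical helper: the real main.py may read the file differently.
    """
    token = Path(path).read_text(encoding="utf-8").strip()
    if not token:
        raise ValueError(f"{path} is empty, paste your bot token in it")
    return token


def invite_url(client_id):
    """Build the invite link from the steps above for a given Client ID."""
    query = urlencode({"client_id": client_id, "permissions": 0, "scope": "bot"})
    return f"https://discord.com/api/oauth2/authorize?{query}"


if __name__ == "__main__":
    # Replace CLIENT_ID with the Client ID of your own app.
    print(invite_url("CLIENT_ID"))
```

Stripping whitespace avoids a common failure where a trailing newline in token.txt ends up in the token.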

If you have any questions about this project, feel free to DM Inspi#8989 on Discord.

Features

  • ;load (days): loads the messages of your server, going back up to (days) days
  • ;cloud: generates your word cloud; tag someone in the command to generate their word cloud instead
  • ;emojis: shows the usage of the server's custom emojis

How it works

Here's a step-by-step description of how the bot makes a word cloud:

  • After ;load
    • For each word w and user u, compute p(w|u), the probability of u writing w (this data is kept separate for each server)
  • After ;cloud
    • For each word w and the user u the cloud is for, compute p(w|u)/p(w), which quantifies how much u favors w compared to everyone else
    • Use the WordCloud library to make a word cloud image, scaling the words in proportion to their scores (they are placed randomly)
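The scoring steps above can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the bot's actual code: the function names are hypothetical, words are split on whitespace, and p(w) is taken to be the average of p(w|u) over all users, which the description above does not specify.

```python
from collections import Counter, defaultdict


def word_probabilities(messages):
    """Return {user: {word: p(w|u)}} from an iterable of (user, text) pairs."""
    counts = defaultdict(Counter)
    for user, text in messages:
        counts[user].update(text.lower().split())
    probs = {}
    for user, counter in counts.items():
        total = sum(counter.values())
        probs[user] = {word: n / total for word, n in counter.items()}
    return probs


def global_probabilities(probs):
    """Return {word: p(w)}, assuming p(w) is the mean of p(w|u) over users."""
    n_users = len(probs)
    p_w = Counter()
    for user_probs in probs.values():
        for word, p in user_probs.items():
            p_w[word] += p / n_users
    return dict(p_w)


def scores_for(user, probs, p_w):
    """Return {word: p(w|u) / p(w)} for every word the user has written."""
    return {word: p / p_w[word] for word, p in probs[user].items()}


messages = [
    ("alice", "the cat sat"),
    ("alice", "the cat purred"),
    ("bob", "the dog sat"),
]
probs = word_probabilities(messages)
p_w = global_probabilities(probs)
alice_scores = scores_for("alice", probs, p_w)
# "the" is used equally by everyone, so it scores lower than "cat",
# which only alice writes.
print(sorted(alice_scores.items(), key=lambda item: -item[1]))
```

A dictionary of this shape can be passed to `WordCloud.generate_from_frequencies` from the wordcloud package, which sizes each word according to its value.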

Some notes regarding this model:

  • Used as-is, this model would fill the cloud with typos, bits of URLs, and words that only u has written, because p(w|u)/p(w) is high for all of them. To counter this, a value α is added to every p(w|u), as if each user had at least an α probability of writing any word. The downside of this regularisation is that some words u never wrote can end up in the word cloud.
  • A strong point of this model is that it works in any language, and stop words such as "and", "the", and "a" are often excluded because nobody in particular favors them.
  • One limitation of the model is that it does not handle expressions longer than a single word or emoji.
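The effect of α can be checked with a small numeric sketch. It assumes that p(w) is the average of p(w|u) over users, so adding α to every p(w|u) also adds α to p(w) and the smoothed score becomes (p(w|u) + α) / (p(w) + α). The exact formula used by the bot may differ, and all the probabilities below are made-up illustration values.

```python
def smoothed_score(p_w_given_u, p_w, alpha):
    """Score of a word for a user after adding alpha to every p(w|u).

    Assumes p(w) is the mean of p(w|u) over users, so it shifts by alpha too.
    """
    return (p_w_given_u + alpha) / (p_w + alpha)


n_users = 10
alpha = 0.01  # illustrative value, not the one used by the bot

# A typo written once by a single user out of 1000 of their words.
p_typo_u = 1 / 1000
p_typo = p_typo_u / n_users

# A word the user genuinely favors.
p_fav_u, p_fav = 0.02, 0.005

# A fairly common word that this user never wrote.
p_never_u, p_never = 0.0, 0.01

raw_typo = p_typo_u / p_typo  # about 10: the typo dominates without smoothing
raw_fav = p_fav_u / p_fav     # about 4

smoothed_typo = smoothed_score(p_typo_u, p_typo, alpha)
smoothed_fav = smoothed_score(p_fav_u, p_fav, alpha)
smoothed_never = smoothed_score(p_never_u, p_never, alpha)

print(raw_typo, raw_fav)
print(smoothed_typo, smoothed_fav, smoothed_never)
```

Without smoothing the typo outranks the favored word; with smoothing the order flips. The never-written word now has a positive score, which is the downside mentioned in the first note.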

The code is written so that implementing a new model is easy: subclass WCModel in wcmodel.py and import your custom class instead of WCBaseline in main.py.
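As a rough illustration of the subclassing pattern, here is a self-contained sketch. The real interface of WCModel is not shown in this document, so the base class below is a hypothetical stand-in, and the method names `train` and `word_scores` are assumptions; check wcmodel.py for the methods you actually need to override.

```python
from abc import ABC, abstractmethod
from collections import Counter, defaultdict


class WCModelStandIn(ABC):
    """Hypothetical stand-in for WCModel; NOT the real class from wcmodel.py."""

    @abstractmethod
    def train(self, messages):
        """Consume an iterable of (user, text) pairs."""

    @abstractmethod
    def word_scores(self, user):
        """Return {word: score} used to size the words of the user's cloud."""


class RawFrequencyModel(WCModelStandIn):
    """Toy model that scores each word by its plain frequency p(w|u)."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, messages):
        for user, text in messages:
            self.counts[user].update(text.lower().split())

    def word_scores(self, user):
        counter = self.counts.get(user, Counter())
        total = sum(counter.values())
        if total == 0:
            return {}
        return {word: n / total for word, n in counter.items()}


model = RawFrequencyModel()
model.train([("alice", "hello hello world")])
scores = model.word_scores("alice")
print(scores)
```

With the real code base, the equivalent class would inherit from WCModel and be imported in main.py in place of WCBaseline.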
