

Facebook Message Analysis

Analyze Facebook messages for interesting statistics

Prerequisites

Python 3 and these libraries:
pip install matplotlib # matplotlib
pip install numpy      # numpy
pip install tabulate   # tabulate

Getting Started

Clone the repo

git clone https://github.com/Strafos/fb_messenger_analysis

Follow Facebook's instructions to download your Facebook message data.

Make sure to select "JSON" as the format. The media quality setting doesn't matter, since only text messages are analyzed.

After Facebook processes the request (this could take up to a couple of hours), unzip the data and put it in the repository.

Setup

setup.py will generate friends.py which contains paths to relevant message dumps.

setup.py requires two arguments:

--dir, the directory that contains the unzipped Facebook message data, and

--name, which should be your Facebook name in the format "John Smith"

My setup looks like this:

python setup.py --dir data --name "Zaibo Wang"

because my message data directory (data) is inside my repository.
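
For reference, a command line with these two required arguments could be parsed roughly as follows. This is a minimal sketch and not the actual contents of setup.py; the function name parse_args and the help strings are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse --dir and --name (hypothetical helper, not taken from setup.py)."""
    parser = argparse.ArgumentParser(
        description="Generate friends.py from a Facebook message dump"
    )
    # Directory that contains the unzipped Facebook message data
    parser.add_argument("--dir", required=True, help="unzipped message data directory")
    # Your Facebook name in the format "John Smith"
    parser.add_argument("--name", required=True, help="your Facebook name")
    return parser.parse_args(argv)


# Equivalent to: python setup.py --dir data --name "Zaibo Wang"
args = parse_args(["--dir", "data", "--name", "Zaibo Wang"])
print(args.dir, args.name)
```

Marking both arguments as required makes the script fail early with a usage message if either one is missing.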

friends.py will list out the top 50 (by default) most messaged friends in order.

Examples

Private messages

private_message_analysis.py analyzes one-on-one messages. Calls to all methods are in the main method and are commented out by default. Generally, four statistics are supported:

  • Characters: total characters
  • Messages: total number of times Enter is pressed
  • Clusters: all messages sent before being interrupted by the other participant
  • Words: count of elements split by spaces
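
The four statistics above can be sketched as follows. This is an illustrative sketch, not the code in private_message_analysis.py. It assumes each message is a dict with "sender_name" and "content" keys and that the list is in chronological order; the function name count_stats is hypothetical.

```python
from collections import defaultdict


def count_stats(messages):
    """Return {sender: {"Characters", "Messages", "Clusters", "Words"}}.

    Assumes `messages` is a chronologically ordered list of dicts with
    "sender_name" and "content" keys (an assumed layout).
    """
    stats = defaultdict(
        lambda: {"Characters": 0, "Messages": 0, "Clusters": 0, "Words": 0}
    )
    previous_sender = None
    for message in messages:
        sender = message["sender_name"]
        text = message.get("content", "")  # some messages may have no text
        entry = stats[sender]
        entry["Characters"] += len(text)
        entry["Messages"] += 1
        # split() breaks on any whitespace, a slight generalization of "spaces"
        entry["Words"] += len(text.split())
        # A new cluster starts whenever the sender changes
        if sender != previous_sender:
            entry["Clusters"] += 1
        previous_sender = sender
    return dict(stats)


example = [
    {"sender_name": "Alice", "content": "hi"},
    {"sender_name": "Alice", "content": "how are you"},
    {"sender_name": "Bob", "content": "good thanks"},
    {"sender_name": "Alice", "content": "great"},
]
print(count_stats(example)["Alice"])
# {'Characters': 18, 'Messages': 3, 'Clusters': 2, 'Words': 5}
```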

The supported time periods are Year, Month, and Day.
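
One way to bucket messages into these periods is to turn each timestamp into a period key. The sketch below assumes each message carries a "timestamp_ms" field holding milliseconds since the Unix epoch and uses UTC for simplicity; the names PERIOD_FORMATS, period_key, and messages_per_period are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone

# strftime pattern for each supported period
PERIOD_FORMATS = {"Year": "%Y", "Month": "%Y-%m", "Day": "%Y-%m-%d"}


def period_key(timestamp_ms, period):
    """Map a millisecond timestamp to a Year, Month, or Day key (in UTC)."""
    moment = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    return moment.strftime(PERIOD_FORMATS[period])


def messages_per_period(messages, period):
    """Count messages in each period bucket."""
    return Counter(period_key(m["timestamp_ms"], period) for m in messages)


example = [
    {"timestamp_ms": 1514764800000},  # 2018-01-01 00:00:00 UTC
    {"timestamp_ms": 1517443200000},  # 2018-02-01 00:00:00 UTC
    {"timestamp_ms": 1546300800000},  # 2019-01-01 00:00:00 UTC
]
print(messages_per_period(example, "Year"))
```

Keys formatted this way sort chronologically as plain strings, which is convenient when plotting.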

All friends are initialized in friends.py. To access a friend in private_message_analysis.py, use a variable such as friends.JOHN_SMITH.

I used a name hash in the following example outputs so that they don't show friends' real names.


graph_stat creates a bar graph of a given stat over a period. By default, it graphs Messages per Year between you and your best friend (most messaged friend).

graph_messages graph_characters

n_top_stat shows the top n people for a given stat in each period. By default, it shows the top 4 people by Characters per Month (I think this statistic is the most interesting).

n_top_stat
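
The ranking idea behind n_top_stat can be sketched as below. This is an assumption-laden illustration, not the repository's implementation: it takes precomputed (period, friend, value) tuples, and the name top_n_per_period is hypothetical.

```python
from collections import Counter, defaultdict


def top_n_per_period(records, n=4):
    """Return {period: [(friend, total), ...]} with the n largest totals.

    `records` is an iterable of (period, friend, value) tuples, an assumed
    layout chosen for this sketch.
    """
    totals = defaultdict(Counter)
    for period, friend, value in records:
        totals[period][friend] += value
    # Sort periods so the output reads chronologically
    return {
        period: counter.most_common(n)
        for period, counter in sorted(totals.items())
    }


records = [
    ("2018-01", "A", 120),
    ("2018-01", "B", 300),
    ("2018-01", "A", 50),
    ("2018-02", "C", 10),
]
print(top_n_per_period(records, n=2))
# {'2018-01': [('B', 300), ('A', 170)], '2018-02': [('C', 10)]}
```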

count_links gives an absolute count and ratio of links sent to and received from a person. By default, it only calculates this for the top 20 most messaged friends (I find that after 20, there are few links and the data is not useful).

count_links
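
A link count of this kind could be computed roughly as follows. This is only a sketch: the regular expression, the choice of sent divided by received as the ratio, and the message layout ("sender_name" and "content" keys) are all assumptions rather than details taken from the repository.

```python
import re

# Simple pattern for http/https URLs; an assumption, not the repo's rule
LINK_PATTERN = re.compile(r"https?://\S+")


def count_links(messages, my_name):
    """Return (links_sent, links_received, sent/received ratio or None)."""
    sent = received = 0
    for message in messages:
        found = len(LINK_PATTERN.findall(message.get("content", "")))
        if message["sender_name"] == my_name:
            sent += found
        else:
            received += found
    # Avoid dividing by zero when no links were received
    ratio = sent / received if received else None
    return sent, received, ratio


example = [
    {"sender_name": "Me", "content": "see https://example.com and http://example.org"},
    {"sender_name": "Friend", "content": "nice, also https://example.net"},
    {"sender_name": "Friend", "content": "no link here"},
]
print(count_links(example, "Me"))  # (2, 1, 2.0)
```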


generate_averages takes combinations of aforementioned stats (such as Characters per Message) and calculates the average per person over all (top 50) friends.

generate_averages
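
As a rough illustration of one such combination, the sketch below computes a ratio like Characters per Message for each friend and then averages those ratios across friends. Averaging per-friend ratios is one possible reading of the description, and the name average_ratio and the input layout are assumptions.

```python
def average_ratio(per_friend_stats, numerator, denominator):
    """Average numerator/denominator across friends.

    `per_friend_stats` maps a friend to a dict of stat totals, an assumed
    layout. Friends with a zero denominator are skipped.
    """
    ratios = [
        stats[numerator] / stats[denominator]
        for stats in per_friend_stats.values()
        if stats[denominator]
    ]
    return sum(ratios) / len(ratios) if ratios else 0.0


per_friend = {
    "A": {"Characters": 100, "Messages": 10},  # 10.0 characters per message
    "B": {"Characters": 60, "Messages": 3},    # 20.0 characters per message
}
print(average_ratio(per_friend, "Characters", "Messages"))  # 15.0
```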


count_specific_words takes an array of words and a friend, and compares how frequently those words are used.

count_specific_words
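
A word-frequency comparison along these lines might look like the sketch below. It is not the repository's implementation: the case-insensitive matching, the punctuation stripping, and the message layout ("sender_name" and "content" keys) are assumptions.

```python
from collections import Counter, defaultdict


def count_specific_words(messages, words):
    """Return {sender: {word: count}} for the given words (case-insensitive)."""
    targets = {word.lower() for word in words}
    counts = defaultdict(Counter)
    for message in messages:
        for token in message.get("content", "").lower().split():
            # Strip surrounding punctuation so "lol!" matches "lol"
            token = token.strip(".,!?;:\"'()")
            if token in targets:
                counts[message["sender_name"]][token] += 1
    return {sender: dict(counter) for sender, counter in counts.items()}


example = [
    {"sender_name": "Alice", "content": "lol that is great, lol!"},
    {"sender_name": "Bob", "content": "haha lol"},
]
print(count_specific_words(example, ["lol", "haha"]))
# {'Alice': {'lol': 2}, 'Bob': {'haha': 1, 'lol': 1}}
```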

total_stat_sent shows how much of a certain stat you have sent over a period. The default is total Characters per Year.

total_stat_send

Group Messages

group_message_analysis.py has the code to analyze group messages. It is a little trickier to set up. The easiest way to run it is to pass the path to a group's message.json to the main method in group_message_analysis.py.

I had trouble finding group messages within my message dump, so another way to do it is to use find_groupchat() in setup.py. This lets me specify a condition, such as all groupchats with more than 15 participants. I then add the matches to the GROUPCHAT variable in setup.py, which generates the groupchat entries in friends.py. These paths can then be passed into the main method as friends.${chat_name}.
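
The kind of search described above can be sketched as follows. This is not the actual find_groupchat() from setup.py: the function name find_large_groupchats is hypothetical, and the sketch assumes each conversation is stored as a JSON file with a top-level "participants" list.

```python
import json
import tempfile
from pathlib import Path


def find_large_groupchats(data_dir, min_participants=15):
    """Return paths of JSON chat files with more than `min_participants` people."""
    matches = []
    for path in sorted(Path(data_dir).rglob("*.json")):
        try:
            with open(path, encoding="utf-8") as handle:
                chat = json.load(handle)
        except (OSError, ValueError):
            continue  # skip unreadable or malformed files
        participants = chat.get("participants", []) if isinstance(chat, dict) else []
        if len(participants) > min_participants:
            matches.append(path)
    return matches


# Small demonstration with made-up data in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    for name, size in (("small_chat", 3), ("large_chat", 20)):
        chat_path = Path(tmp) / name / "message.json"
        chat_path.parent.mkdir(parents=True)
        people = [{"name": f"Person {i}"} for i in range(size)]
        chat_path.write_text(
            json.dumps({"participants": people, "messages": []}), encoding="utf-8"
        )
    found_names = [p.parent.name for p in find_large_groupchats(tmp)]

print(found_names)  # ['large_chat']
```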

Result: group_chat
