Booksoup

Booksoup allows you to analyse and traverse your downloaded Facebook data, with features such as sentiment analysis and message frequency analysis over time.

Booksoup requires BeautifulSoup4 and TextBlob. The demo graphs additionally require matplotlib.

Usage

Initialise a new instance of the BookSoup class, passing the top-level path of your Facebook data folder as an argument.

Basic usage

from booksoup import BookSoup

me = BookSoup("facebook-data")

# Get a conversation by name
convo = me.load_conversation("Jane Doe")

# Print participants of the conversation
print(convo.participants)

# Print messages in the conversation
for message in convo.messages:
    print(message.date, message.timestamp, message.name, message.content)
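As a small illustration of working with the message attributes shown above, the sketch below counts how many messages each sender wrote. It only relies on each message having a name attribute. The messages_per_sender helper and the sample messages are hypothetical stand-ins so the snippet runs without a Facebook export; with real data you would pass convo.messages instead.

```python
from collections import Counter
from types import SimpleNamespace


def messages_per_sender(messages):
    """Count how many messages each sender wrote (hypothetical helper)."""
    return Counter(message.name for message in messages)


# Stand-in messages for illustration only; with Booksoup you would pass
# convo.messages instead.
sample_messages = [
    SimpleNamespace(name="Jane Doe", content="Hi!"),
    SimpleNamespace(name="John Smith", content="Hello"),
    SimpleNamespace(name="Jane Doe", content="How are you?"),
]

counts = messages_per_sender(sample_messages)
for name, total in counts.most_common():
    print(name, total)
```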

Interaction frequency

interaction_freq shows how often messages are sent in a specific conversation at each hour of the day. It returns a dict in which each key is an hour of the day and each value is the number of messages sent during that hour over the history of the conversation.

me = BookSoup("facebook-data")
convo = me.load_conversation("John Smith")

times = convo.interaction_freq()
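Once you have the dict, ordinary dict operations are enough to summarise it. The sketch below picks out the busiest hours. The busiest_hours helper is hypothetical, the sample counts are made up, and the exact key format (strings such as "18" versus integers) is an assumption, so check what interaction_freq returns for your data.

```python
def busiest_hours(freq, top=3):
    """Return the `top` (hour, count) pairs with the most messages."""
    ranked = sorted(freq.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


# Made-up counts for illustration; the key format is an assumption.
sample_freq = {"08": 12, "13": 40, "18": 75, "22": 31}

for hour, count in busiest_hours(sample_freq, top=2):
    print(hour, count)
```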

Using the demo_interaction_frequency.py code, this can be visualised:

[Figure: Interaction frequency example]

Interaction timeline

interaction_timeline(name) shows how many messages a specific person sent in a conversation over time, from the first message to the last. The following example shows how often I sent messages within a group conversation.

me = BookSoup("facebook-data")
convo = me.load_conversation("Lewis, Andrew, Michelle and 4 others")

times = convo.interaction_timeline(me.name)
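To collect a timeline for every participant rather than just one, the same call can be repeated per name. This is a minimal sketch, not the demo script itself: timelines_for_everyone and FakeConversation are hypothetical, and it assumes that convo.participants is an iterable of names and that each timeline is a dict of interval label to message count.

```python
def timelines_for_everyone(convo):
    """Map each participant's name to their interaction timeline."""
    return {name: convo.interaction_timeline(name) for name in convo.participants}


class FakeConversation:
    """Hypothetical stand-in exposing only the two members used above."""

    def __init__(self, timelines):
        self._timelines = timelines
        self.participants = list(timelines)

    def interaction_timeline(self, name):
        return self._timelines[name]


# Made-up data; the "YYYY-MM" key format is an assumption.
convo = FakeConversation({
    "Lewis": {"2016-01": 14, "2016-02": 3},
    "Andrew": {"2016-01": 5, "2016-02": 9},
})

all_timelines = timelines_for_everyone(convo)
for name, timeline in all_timelines.items():
    print(name, timeline)
```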

Using the demo_interaction_timeline.py code, I can visualise in one graph how often everyone in the conversation spoke by building a separate timeline for each person.

[Figure: Interaction timeline example]

Another example below with one friend over a longer timeline:

[Figure: Single user example]

Sentiment

Booksoup can also perform sentiment analysis. The average sentiment for a user in a specific conversation can be calculated using Conversation.avg_sentiment(name), and a timeline of average sentiment can be built using Conversation.sentiment_timeline(name).

from booksoup import BookSoup

me = BookSoup("facebook-data")
convo = me.load_conversation("David Grocer")

# Print the average sentiment of David Grocer in the conversation
print(convo.avg_sentiment("David Grocer"))

# Print the timeline dictionary of my average sentiment in the conversation
print(convo.sentiment_timeline(me.name))
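To give a feel for what a timeline of average sentiment contains, the sketch below averages per-message scores by calendar month. It is not Booksoup's implementation: monthly_average is a hypothetical helper, the scores are made up, and it assumes each message has already been given a numeric score (TextBlob polarity, for instance, ranges from -1.0 to 1.0).

```python
from collections import defaultdict
from datetime import date


def monthly_average(scored_messages):
    """Average sentiment scores per calendar month.

    `scored_messages` is an iterable of (date, score) pairs.
    """
    buckets = defaultdict(list)
    for day, score in scored_messages:
        buckets[day.strftime("%Y-%m")].append(score)
    return {month: sum(scores) / len(scores) for month, scores in sorted(buckets.items())}


# Made-up scores for illustration only.
sample = [
    (date(2017, 1, 3), 0.5),
    (date(2017, 1, 20), -0.5),
    (date(2017, 2, 1), 0.25),
]

timeline = monthly_average(sample)
print(timeline)
```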

Loading a conversation

A conversation can be loaded using either the title of the conversation (as in all the previous examples) or the numerical ID of the conversation (the filename of the conversation's HTML file).

convo = me.load_conversation(40)

Specifying interval duration

In all of the timeline examples, the interval can be specified as either month or day, with month being the default. To switch to daily intervals for timeline operations, set the interval argument, e.g.:

convo = me.load_conversation("David Grocer", interval="day")
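The effect of the interval setting can be pictured as a change in how timestamps are grouped. The following is a rough sketch under that assumption, not Booksoup's own code: count_by_interval and INTERVAL_FORMATS are hypothetical names, and the label formats are illustrative.

```python
from collections import Counter
from datetime import datetime

# Hypothetical label formats for the two intervals named above.
INTERVAL_FORMATS = {"month": "%Y-%m", "day": "%Y-%m-%d"}


def count_by_interval(timestamps, interval="month"):
    """Count timestamps per month or per day."""
    if interval not in INTERVAL_FORMATS:
        raise ValueError("interval must be 'month' or 'day'")
    fmt = INTERVAL_FORMATS[interval]
    return Counter(ts.strftime(fmt) for ts in timestamps)


# Made-up timestamps for illustration only.
sample_times = [
    datetime(2017, 3, 1, 9, 30),
    datetime(2017, 3, 1, 21, 5),
    datetime(2017, 3, 15, 12, 0),
]

print(count_by_interval(sample_times))                  # grouped by month
print(count_by_interval(sample_times, interval="day"))  # grouped by day
```

With monthly grouping all three sample timestamps fall into one bucket, while daily grouping splits them across two.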

Events

Booksoup can extract and categorise event information. This includes title, description, location, timestamp and a 2-element array containing the latitude and longitude of the event if available.

me = BookSoup("facebook-data")

events = me.load_all_events()

# Events are organised into attending, maybe, declined and no_reply:
for event in events.attending:
    print(event.title, event.description, event.location, event.timestamp, event.latlon)
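Because coordinates are only present when available, it can be useful to filter for events that have them before mapping or plotting. The sketch below assumes a missing location is represented by None or an empty value in latlon, which is a guess rather than documented behaviour. The events_with_coordinates helper and the sample events are hypothetical.

```python
from types import SimpleNamespace


def events_with_coordinates(events):
    """Keep only events whose latlon holds a latitude/longitude pair."""
    return [event for event in events if event.latlon and len(event.latlon) == 2]


# Stand-in events for illustration; with Booksoup you would pass a list
# such as events.attending instead.
sample_events = [
    SimpleNamespace(title="Picnic", latlon=[51.5, -0.12]),
    SimpleNamespace(title="Online quiz", latlon=None),
]

for event in events_with_coordinates(sample_events):
    latitude, longitude = event.latlon
    print(event.title, latitude, longitude)
```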

Contributors

buroni, jakebrowning
