twitter-jsonl-tools

Simple set of Python tools for handling Twitter data in the JSONL (JSON Lines text) format, also called newline-delimited JSON. See http://jsonlines.org for more details on the format.
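As a rough illustration of the format, the sketch below reads JSONL input one line at a time with the standard library. The helper name `read_jsonl` and the two-line sample are hypothetical and are not taken from the tools themselves.

```python
import io
import json


def read_jsonl(lines):
    """Yield one decoded JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical sample standing in for a .jsonl file opened for reading.
sample = io.StringIO('{"id": 1, "text": "hello"}\n{"id": 2, "text": "world"}\n')
tweets = list(read_jsonl(sample))
print(len(tweets), tweets[0]["text"])
```

Because each line is a complete JSON document, a file can be processed as a stream without loading it all into memory.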

Dependencies

Tested with Python 3.6, and requires the following packages, which are available via pip:

For faster JSON decoding, the scripts optionally support the use of the UltraJSON package.
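One common way to make UltraJSON optional is an import fallback, sketched below. Whether the scripts use exactly this pattern is an assumption; the sketch only relies on `ujson` exposing a `loads` function compatible with the standard `json` module.

```python
try:
    import ujson as json  # UltraJSON, used when it is installed
except ImportError:
    import json  # standard library fallback

# Either module decodes a line of JSONL the same way.
tweet = json.loads('{"id": 1, "text": "hello"}')
print(tweet["text"])
```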

Basic Usage: Tweets

The tools used to process tweets expect one or more JSONL files as inputs, where each line contains a JSON-formatted tweet as retrieved from the Twitter API.

To get basic summary statistics for a JSONL file containing tweets:

python jsonl-tweet-stats.py sample/sample-tweets-500.jsonl

To get a list of the most frequently tweeting users for a JSONL file containing tweets:

python jsonl-tweet-authors.py sample/sample-tweets-500.jsonl 

To get a list of the most frequently mentioned users for a JSONL file containing tweets:

python jsonl-tweet-mentions.py sample/sample-tweets-500.jsonl 

To get a list of the most frequently appearing hashtags for a JSONL file containing tweets:

python jsonl-tweet-hashtags.py sample/sample-tweets-500.jsonl

To export all tweets to a simple CSV (comma-separated) format:

python jsonl-tweet-export.py sample/sample-tweets-500.jsonl -o sample/sample-tweets.csv

To generate a tab-separated CSV file containing pairwise hashtag co-occurrence frequencies for one or more JSONL files:

python jsonl-hashtag-cooccur.py sample/sample-tweets-500.jsonl -o hashtag-cooccurrences.csv
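The idea behind pairwise co-occurrence counting can be sketched as follows. This is not the script's actual code: the function name `hashtag_pairs` is hypothetical, lowercasing the tags is an assumption, and the sample tweets are made up. It assumes hashtags are stored under `entities.hashtags[].text`, as in the classic Twitter API layout.

```python
from collections import Counter
from itertools import combinations


def hashtag_pairs(tweets):
    """Count how often each unordered pair of hashtags shares a tweet."""
    counts = Counter()
    for tweet in tweets:
        hashtags = tweet.get("entities", {}).get("hashtags", [])
        # Assumption: tags are compared case-insensitively.
        tags = {h["text"].lower() for h in hashtags}
        # Sorting gives each pair one canonical order.
        for a, b in combinations(sorted(tags), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical sample tweets.
tweets = [
    {"entities": {"hashtags": [{"text": "Python"}, {"text": "data"}]}},
    {"entities": {"hashtags": [{"text": "data"}, {"text": "python"}, {"text": "json"}]}},
    {"entities": {"hashtags": []}},
]
pairs = hashtag_pairs(tweets)
for (a, b), n in pairs.most_common():
    print("%s\t%s\t%d" % (a, b, n))
```

Using a set per tweet means a hashtag repeated within one tweet is counted once for that tweet.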

Basic Usage: Users

The tools used to process user data expect one or more JSONL files as inputs, where each line contains JSON-formatted user profile data as retrieved from the Twitter API.

To export all user metadata to a simple CSV (comma-separated) format:

python jsonl-user-export.py sample/sample-users-50.jsonl -o sample/sample-users.csv

twitter-jsonl-tools's People

Contributors

derekgreene

twitter-jsonl-tools's Issues

Change in Twitter's tweet data format

Some lines from the jsonl-tweet-export.py file are:

			sdate = parse_twitter_date(tweet["created_at"]).strftime("%Y-%m-%d %H:%M:%S")
			values = [ fmt_id(tweet["id"]), sdate, norm(tweet["user"]["screen_name"], sep).lower(), fmt_id(tweet["user"]["id"]), norm(tweet["text"], sep) ]
			fout.write("%s\n" % sep.join(values) )

while the new fields (based on snscrape) are:
'_type','url','date','content','renderedContent','id','user','replyCount','retweetCount','likeCount','quoteCount','conversationId','lang','source','sourceUrl','sourceLabel','outlinks','tcooutlinks','media','retweetedTweet','quotedTweet','inReplyToTweetId','inReplyToUser','mentionedUsers','coordinates','place','hashtags','cashtags'

The export could support these fields, or any desired columns.
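One possible way to handle both layouts is sketched below. It is not part of the tools. The function name `export_row` is hypothetical, and the `username` key inside `user` and the ISO 8601 `date` string are assumptions about snscrape output that the issue does not confirm. It also skips the `norm` and `fmt_id` helpers used by the original script, and `datetime.fromisoformat` needs Python 3.7 or later.

```python
from datetime import datetime


def export_row(tweet):
    """Return [id, date, screen name, user id, text] for either layout."""
    if "created_at" in tweet:
        # Classic Twitter API layout, e.g. "Wed Oct 10 20:19:24 +0000 2018".
        when = datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y")
        name = tweet["user"]["screen_name"]
        text = tweet["text"]
    else:
        # snscrape-style layout; "username" under "user" is an assumption.
        when = datetime.fromisoformat(tweet["date"])
        name = tweet["user"]["username"]
        text = tweet["content"]
    sdate = when.strftime("%Y-%m-%d %H:%M:%S")
    return [str(tweet["id"]), sdate, name.lower(), str(tweet["user"]["id"]), text]


# Hypothetical records, one per layout.
old = {"id": 1, "created_at": "Wed Oct 10 20:19:24 +0000 2018",
       "user": {"screen_name": "Alice", "id": 10}, "text": "hi"}
new = {"id": 2, "date": "2021-03-01T08:30:00+00:00",
       "user": {"username": "Bob", "id": 20}, "content": "hello"}
print("\t".join(export_row(old)))
print("\t".join(export_row(new)))
```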
